Understanding Optimizer Statistics for Developers


  1. 1. Understanding Optimizer Statistics for Developers: May the Odds Be Ever in Your Favor
  2. 2. ©2013 Enkitec 2
  3. 3. www.enkitec.com | www.facebook.com/enkitec | @enkitec
  4. 4. karen.morton@enkitec.com | karenmorton.blogspot.com | @karen_morton
  5. 5. Statistics? Odds? It's just math.
  6. 6. SQL> desc tributes
        Name        Type
        ----------- -------------
        PANEM_ID    NUMBER
        NAME        VARCHAR2(20)
        DISTRICT    NUMBER
        GENDER      VARCHAR2(1)
        AGE         NUMBER
        QUAL_SCORE  NUMBER
        WEAPON      VARCHAR2(60)
  8. 8. Cardinality: the estimated number of rows a query is expected to return.
        Cardinality = number of rows in table x predicate selectivity
  9. 9. select * from tributes order by district, age;
        Cardinality = 24 x 1 = 24
  11. 11. select * from tributes where gender = 'F';
        Cardinality = 24 x 1/2 = 12
  13. 13. select * from tributes where district = 12 and gender = 'F';
        Cardinality = 24 x 1/12 x 1/2 = 1
  15. 15. select * from tributes where age > 15;
        Cardinality = 24 x (High Value - Predicate Value) / (High Value - Low Value)
                    = 24 x (17 - 15) / (17 - 12) = 9.6
  17. 17. select * from tributes where weapon = :b1;
        Cardinality = 24 x 1/11 = 2.18
  18. 18. :b1 = 'Bow and Arrow'
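
The arithmetic on slides 8 through 17 can be reproduced in a few lines. This is a minimal Python sketch of the uniform-distribution math only, not the optimizer's actual code. The function names are hypothetical; the inputs (24 rows, 2 genders, 12 districts, 11 weapons, ages 12 to 17) are the ones used on the slides.

```python
def equality_selectivity(num_distinct):
    """Selectivity of `col = value`, assuming uniformly distributed values."""
    return 1 / num_distinct


def greater_than_selectivity(value, low, high):
    """Selectivity of `col > value` as (high - value) / (high - low)."""
    return (high - value) / (high - low)


def cardinality(num_rows, *selectivities):
    """Rows in the table times each predicate's selectivity.

    Multiplying selectivities treats the predicates as independent.
    """
    estimate = num_rows
    for selectivity in selectivities:
        estimate *= selectivity
    return estimate


NUM_ROWS = 24  # rows in the tributes table, per the slides

all_rows = cardinality(NUM_ROWS)
by_gender = cardinality(NUM_ROWS, equality_selectivity(2))
by_district_gender = cardinality(
    NUM_ROWS, equality_selectivity(12), equality_selectivity(2)
)
by_age = cardinality(NUM_ROWS, greater_than_selectivity(15, low=12, high=17))
by_weapon = cardinality(NUM_ROWS, equality_selectivity(11))

for label, estimate in [
    ("no predicate", all_rows),
    ("gender = 'F'", by_gender),
    ("district = 12 and gender = 'F'", by_district_gender),
    ("age > 15", by_age),
    ("weapon = :b1", by_weapon),
]:
    print(f"{label}: {estimate:.2f}")
```

The bind-variable case uses the same 1/NDV math as any other equality predicate, which is why the estimate stays at about 2.18 whatever value :b1 holds.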
  19. 19. No tricks, just math
  20. 20. But…
  21. 21. Statistics that don't reasonably describe your data
  22. 22. …lead to poor cardinality estimates
  23. 23. …which leads to poor access path selection
  24. 24. …which leads to poor join method selection
  25. 25. …which leads to poor join order selection
  26. 26. …which leads to poor SQL execution times.
  27. 27. Statistics matter!
  28. 28. Goal: Collect statistics that are "good enough" to meet most needs most of the time.
  29. 29. "Say you were standing with one foot in the oven and one foot in an ice bucket. According to the percentage people, you would be perfectly comfortable." – Bobby Bragan
  30. 30. Data Skew
  31. 31. The optimizer assumes uniform distribution of column values.
  32. 32. Color column - uniform distribution
  33. 33. Color column - skewed distribution
  34. 34. Data skew must be identified with a histogram.
  35. 35. Table: obj_tab (100% Statistics, FOR ALL COLUMNS SIZE 1)

        Statistic        Current value
        ---------------  --------------
        # rows           1601874
        Blocks           22321
        Avg Row Len      94
        Sample Size      1601874
        Monitoring       YES

        Column: object_type

        OBJECT_TYPE                       PCT_TOTAL
        --------------------------------  ---------
        WINDOW GROUP - PROGRAM            .00-.02
        EVALUATION CONTEXT - XML SCHEMA   .03-.05
        OPERATOR - PROCEDURE              .11-.17
        LIBRARY - TYPE BODY               .30-.35
        FUNCTION - INDEX PARTITION        .54-.64
        JAVA RESOURCE - PACKAGE           1.54-1.69
        TABLE - VIEW                      3.44-7.35
        JAVA CLASS                        32.80
        SYNONYM                           40.01
  36. 36. PLAN_TABLE_OUTPUT
        SQL_ID 16yy3p8sstr28, child number 0
        -------------------------------------
        select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id, status
        from obj_tab where object_type = 'PROCEDURE'

        Plan hash value: 2862749165

        ------------------------------------------------------------------------------
        | Id | Operation                   | Name         | E-Rows | A-Rows | Buffers |
        ------------------------------------------------------------------------------
        |  1 | TABLE ACCESS BY INDEX ROWID | OBJ_TAB      |  44497 |   2720 |    1237 |
        |* 2 | INDEX RANGE SCAN            | OBJ_TYPE_IDX |  44497 |   2720 |     193 |
        ------------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - access("OBJECT_TYPE"='PROCEDURE')

        R = .06 seconds
  37. 37. PLAN_TABLE_OUTPUT
        SQL_ID 9u6ppkh5mhr8v, child number 0
        -------------------------------------
        select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id, status
        from obj_tab where object_type = 'SYNONYM'

        Plan hash value: 2862749165

        ------------------------------------------------------------------------------
        | Id | Operation                   | Name         | E-Rows | A-Rows | Buffers |
        ------------------------------------------------------------------------------
        |  1 | TABLE ACCESS BY INDEX ROWID | OBJ_TAB      |  44497 |   640K |    104K |
        |* 2 | INDEX RANGE SCAN            | OBJ_TYPE_IDX |  44497 |   640K |   44082 |
        ------------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - access("OBJECT_TYPE"='SYNONYM')

        R = 14.25 seconds
  38. 38. 100% Statistics, FOR ALL COLUMNS SIZE AUTO

        PLAN_TABLE_OUTPUT
        SQL_ID 16yy3p8sstr28, child number 0
        -------------------------------------
        select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id, status
        from obj_tab where object_type = 'PROCEDURE'

        Plan hash value: 2862749165

        ------------------------------------------------------------------------------
        | Id | Operation                   | Name         | E-Rows | A-Rows | Buffers |
        ------------------------------------------------------------------------------
        |  1 | TABLE ACCESS BY INDEX ROWID | OBJ_TAB      |   2720 |   2720 |    1237 |
        |* 2 | INDEX RANGE SCAN            | OBJ_TYPE_IDX |   2720 |   2720 |     193 |
        ------------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - access("OBJECT_TYPE"='PROCEDURE')

        R = .07 seconds
  39. 39. PLAN_TABLE_OUTPUT
        SQL_ID 9u6ppkh5mhr8v, child number 0
        -------------------------------------
        select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id, status
        from obj_tab where object_type = 'SYNONYM'

        Plan hash value: 2748991475

        ---------------------------------------------------------------
        | Id | Operation         | Name    | E-Rows | A-Rows | Buffers |
        ---------------------------------------------------------------
        |* 1 | TABLE ACCESS FULL | OBJ_TAB |   640K |   640K |   64263 |
        ---------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           1 - filter("OBJECT_TYPE"='SYNONYM')

        R = 3.36 seconds
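
The 44497 estimate in the SIZE 1 plans is what the uniform-distribution assumption predicts: without a histogram, every object_type value gets the same rows / NDV estimate. The Python sketch below illustrates this under two assumptions that are not stated on the slides: that object_type has 36 distinct values (inferred from 1601874 / 44497), and that the "640K" actual is roughly 640,000 rows.

```python
NUM_ROWS = 1_601_874   # "# rows" from the obj_tab statistics slide
ASSUMED_NDV = 36       # assumption: inferred from the E-Rows value, not shown on the slides

# Without a histogram, every equality predicate on the column gets this estimate.
uniform_estimate = NUM_ROWS / ASSUMED_NDV

actual_rows = {
    "PROCEDURE": 2_720,    # A-Rows from the plan
    "SYNONYM": 640_000,    # "640K" in the plan, so only approximate
}


def misestimate_factor(estimate, actual):
    """Ratio of the larger number to the smaller one."""
    return max(estimate, actual) / min(estimate, actual)


for object_type, actual in actual_rows.items():
    factor = misestimate_factor(uniform_estimate, actual)
    direction = "over" if uniform_estimate > actual else "under"
    print(
        f"{object_type}: estimate {uniform_estimate:.1f}, actual {actual}, "
        f"{direction}estimated by about {factor:.0f}x"
    )
```

Under these assumptions the same estimate is far too high for a rare value and far too low for a popular one, which matches the E-Rows versus A-Rows gap in the SIZE 1 plans and its disappearance once SIZE AUTO allows a histogram.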
  40. 40. Dynamic Sampling
  41. 41. optimizer_dynamic_sampling parameter
        dynamic_sampling hint
  42. 42. SQL> create table ds_test as
             select mod(num, 100) c1,
                    mod(num, 100) c2,
                    mod(num, 75) c3,
                    mod(num, 30) c4
             from (select level num from dual
                   connect by level <= 10001);

        Table created.

        SQL> exec dbms_stats.gather_table_stats(user, 'ds_test', estimate_percent => null, method_opt => 'for all columns size 1');

        PL/SQL procedure successfully completed.
  43. 43. Statistic        Current value
        ---------------  --------------
        # rows           10001
        Blocks           28
        Avg Row Len      11
        Sample Size      10001
        Monitoring       YES

        Column  NDV  Density  AvgLen  Histogram  LowVal  HighVal
        ------  ---  -------  ------  ---------  ------  -------
        C1      100  .010000  3       NONE (1)   0       99
        C2      100  .010000  3       NONE (1)   0       99
        C3      75   .013333  3       NONE (1)   0       74
        C4      30   .033333  3       NONE (1)   0       29
  44. 44. SQL> set autotrace traceonly explain
        SQL> select count(*) from ds_test where c1 = 10;

        Execution Plan
        ----------------------------------------------------------
        Plan hash value: 3984367388

        ------------------------------------------------------------------------------
        | Id | Operation          | Name    | Rows | Bytes | Cost (%CPU) | Time     |
        ------------------------------------------------------------------------------
        |  0 | SELECT STATEMENT   |         |    1 |     3 |      8  (0) | 00:00:01 |
        |  1 | SORT AGGREGATE     |         |    1 |     3 |             |          |
        |* 2 | TABLE ACCESS FULL  | DS_TEST |  100 |   300 |      8  (0) | 00:00:01 |
        ------------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - filter("C1"=10)
  45. 45. SQL> set autotrace traceonly explain
        SQL> select count(*) from ds_test where c1 = 10 and c2 = 10;

        Execution Plan
        ----------------------------------------------------------
        Plan hash value: 3984367388

        ------------------------------------------------------------------------------
        | Id | Operation          | Name    | Rows | Bytes | Cost (%CPU) | Time     |
        ------------------------------------------------------------------------------
        |  0 | SELECT STATEMENT   |         |    1 |     6 |      8  (0) | 00:00:01 |
        |  1 | SORT AGGREGATE     |         |    1 |     6 |             |          |
        |* 2 | TABLE ACCESS FULL  | DS_TEST |    1 |     6 |      8  (0) | 00:00:01 |
        ------------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - filter("C1"=10 AND "C2"=10)
  46. 46. SQL> set autotrace traceonly explain
        SQL> select /*+ dynamic_sampling (4) */ count(*)
             from ds_test where c1 = 10 and c2 = 10;

        Execution Plan
        ----------------------------------------------------------
        Plan hash value: 3984367388

        ------------------------------------------------------------------------------
        | Id | Operation          | Name    | Rows | Bytes | Cost (%CPU) | Time     |
        ------------------------------------------------------------------------------
        |  0 | SELECT STATEMENT   |         |    1 |     6 |      8  (0) | 00:00:01 |
        |  1 | SORT AGGREGATE     |         |    1 |     6 |             |          |
        |* 2 | TABLE ACCESS FULL  | DS_TEST |  100 |   600 |      8  (0) | 00:00:01 |
        ------------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - filter("C1"=10 AND "C2"=10)

        Note
        -----
           - dynamic sampling used for this statement
  47. 47. Extended Statistics
  48. 48. SQL> select dbms_stats.create_extended_stats(ownname => user,
                      tabname => 'DS_TEST',
                      extension => '(c1, c2)') AS c1_c2_correlation
             from dual;

        C1_C2_CORRELATION
        -------------------------------------------------------------
        SYS_STUF3GLKIOP5F4B0BTTCFTMX0W

        SQL> exec dbms_stats.gather_table_stats(user, 'ds_test');

        PL/SQL procedure successfully completed.
  49. 49. SQL> set autotrace traceonly explain
        SQL> select count(*) from ds_test where c1 = 10 and c2 = 10;

        Execution Plan
        ----------------------------------------------------------
        Plan hash value: 3984367388

        ------------------------------------------------------------------------------
        | Id | Operation          | Name    | Rows | Bytes | Cost (%CPU) | Time     |
        ------------------------------------------------------------------------------
        |  0 | SELECT STATEMENT   |         |    1 |     6 |      9  (0) | 00:00:01 |
        |  1 | SORT AGGREGATE     |         |    1 |     6 |             |          |
        |* 2 | TABLE ACCESS FULL  | DS_TEST |  100 |   600 |      9  (0) | 00:00:01 |
        ------------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - filter("C1"=10 AND "C2"=10)
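
The estimates on slides 44 through 49 can be checked without a database. The sketch below rebuilds the ds_test rows in plain Python as a stand-in for the table; it is not Oracle's algorithm. It compares the independence estimate (multiplying single-column selectivities) with an estimate based on the number of distinct (c1, c2) pairs, which is the kind of information a column group provides.

```python
# Rebuild ds_test: num runs from 1 to 10001, as in the CREATE TABLE on slide 42.
rows = [(num % 100, num % 100, num % 75, num % 30) for num in range(1, 10002)]
num_rows = len(rows)

ndv_c1 = len({r[0] for r in rows})
ndv_c2 = len({r[1] for r in rows})
ndv_c1_c2 = len({(r[0], r[1]) for r in rows})  # distinct pairs in the column group

# Rows that actually match c1 = 10 and c2 = 10.
actual = sum(1 for r in rows if r[0] == 10 and r[1] == 10)

# Independence assumption: multiply the single-column selectivities.
independent_estimate = num_rows * (1 / ndv_c1) * (1 / ndv_c2)

# Column-group estimate: a single selectivity for the pair.
column_group_estimate = num_rows / ndv_c1_c2

print("actual rows:          ", actual)
print("independence estimate:", round(independent_estimate))
print("column-group estimate:", round(column_group_estimate))
```

Because c1 and c2 are always equal, the second predicate filters nothing, so multiplying the two selectivities undercounts by a factor of about 100.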
  50. 50. Understanding how the optimizer uses statistics helps you know when "special" statistics can help.
  51. 51. Query Transformation
  52. 52. SQL and the Optimizer (diagram labels: Parsed SQL; Logical optimizations: query transformations, iterative; Physical optimizations: access and join methods, join order; Dictionary stats; Cost Estimator; Execution Plan)
  53. 53. Query Transformations
        • Goal is to enhance query performance by making more choices available to the optimizer
        • Creates a semantically equivalent statement that will produce the same results
        • Remove extraneous conditions
        • Add inferred conditions
        Think algebra…
  54. 54. Types of Transformations
        • Automatic: always produce a faster plan
        • Heuristic-based (pre-10gR1): should produce a faster plan most of the time
        • Cost-based: does not always produce a faster plan
  55. 55. So, if the optimizer will transform my SQL for me, why should I write it any differently?
  56. 56. Why You Should Refactor
        • You know your stuff best (or you should)
        • Always filter early
        • Defines your expectations
        • K.I.S.S.
        • The optimizer might not be able to
  57. 57. Common Transformations
        • FPD – filter push-down
        • PM – predicate move-around
        • SU – subquery unnesting
        • CVM – complex view merging
        • SPJ – select-project-join (simple view merging)
        • JF – join factorization
        Numerous additional transformations exist – see 10053 trace
  58. 58. SPJ – Simple View Merging (before and after SQL shown as images, not captured). Merged automatically, as it is deemed "always better" for the optimizer to work with direct joins.
  59. 59. Complex View Merging (before and after SQL shown as images, not captured). "Complex" due to GROUP BY. CVM can also be done when using DISTINCT or outer join.
  60. 60. Filter Push-Down (before and after SQL shown as images, not captured). Purpose: To push outer query predicates into view to perform earlier filtering.
  61. 61. Predicate Move-Around (before and after SQL shown as images, not captured). Purpose: To move inexpensive predicates into view query blocks to perform earlier filtering. Can generate filter predicates based on transitivity or functional dependencies.
  62. 62. Join Factorization (before and after SQL shown as images, not captured). Combines branches of UNION / UNION ALL that join a common table in order to reduce the number of accesses to that table.
  63. 63. Understanding how the optimizer transforms queries helps you write better SQL and understand execution plans.
  64. 64. Cardinality Feedback
  65. 65. What is Cardinality Feedback?
        • Automatically improve plans for repeatedly executed queries where the optimizer does not estimate cardinalities properly
        • Misestimates may be due to
          – Missing or inaccurate statistics
          – Complex predicates
          …and more
        • Estimates and actuals are compared
        • New plan generated using adjustment factors
        • Estimates lost if plan ages out of cache
  67. 67. Original plan
  68. 68. New plan
  69. 69. Estimates Differences

        Plan Operation          Object                    Rows estimate
        ----------------------  ------------------------  -------------
        Original Plan                                     23835
        TABLE ACCESS FULL       VS_PRODUCT                67
        INDEX FAST FULL SCAN    IDX_VS_ORDER_027          514K
        INDEX FAST FULL SCAN    IDX_VS_ORDER_TRACK_017    2665K
        INDEX FAST FULL SCAN    PK_VS_ORDER               8673K
        New Plan ✔                                        8371
        INDEX FULL SCAN         IDX_VS_PRODUCT_002        84
        INDEX SKIP SCAN         IDX_VS_ORDER_027          941K
        INDEX SKIP SCAN         IDX_VS_ORDER_TRACK_017    2899K
        INDEX FAST FULL SCAN    PK_VS_ORDER               10M
  70. 70. Cardinality feedback is an automated way the optimizer tries to "self-correct" and produce plans that perform optimally.
  71. 71. What's ahead in 12c?
  72. 72. Dynamic Statistics
        • Replaces dynamic sampling
        • Compensates for missing or incomplete stats
        • Time to do collections is limited by optimizer based on estimated query run time
        • When collected, stats will be stored in cache and therefore can be shared among queries
        • Different queries using same predicates can use dynamic stats stored in cache
  73. 73. Adaptive Statistics
        • Was cardinality feedback in 11gR2
        • Estimates compared to actuals
        • Significant variations cause new plan choice
        • New plan uses stats from previous executions
        • Previous stats retained for later use as SQL Plan Directives
        • Statements can be re-optimized over and over
          – New column V$SQL.IS_REOPTIMIZABLE
  74. 74. Adaptive Execution Plans
        • More than 1 plan pre-computed and stored
        • Statistics collectors used to capture execution info
        • If NL chosen and exceeds certain threshold, will switch to HJ
        • DBMS_XPLAN.DISPLAY_CURSOR has new format parameter to show actions
          format => '+all_dyn_plan +adaptive'
  75. 75. Conclusion
  76. 76. Statistics must reasonably represent your actual data.
  77. 77. Understanding basic optimizer statistics computations is key.
  78. 78. Write SQL to utilize your knowledge of how statistics are used by the optimizer.
  79. 79. The more you know, the more likely your statistics strategy and the way you write SQL will support optimally performing execution plans.
  80. 80. Thank You!
  81. 81. Questions & Answers
