# Understanding Optimizer Statistics for Developers


1. Understanding Optimizer Statistics for Developers: May the Odds Be Ever in Your Favor
4. karen.morton@enkitec.com, karenmorton.blogspot.com, @karen_morton. ©2013 Enkitec
5. Statistics? Odds? It's just math.
6.
        SQL> desc tributes
         Name        Type
         ----------- -------------
         PANEM_ID    NUMBER
         NAME        VARCHAR2(20)
         DISTRICT    NUMBER
         GENDER      VARCHAR2(1)
         AGE         NUMBER
         QUAL_SCORE  NUMBER
         WEAPON      VARCHAR2(60)
8. Cardinality: the estimated number of rows a query is expected to return.
   Cardinality = number of rows in table x predicate selectivity
9. select * from tributes order by district, age;
   Cardinality: 24 x 1 = 24
11. select * from tributes where gender = 'F';
    Cardinality: 24 x 1/2 = 12
13. select * from tributes where district = 12 and gender = 'F';
    Cardinality: 24 x 1/12 x 1/2 = 1
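
The arithmetic on slides 9 to 13 can be sketched in a few lines of Python. This is only an illustration of the slide math, not the optimizer's implementation: the helper names are hypothetical, and the distinct-value counts (2 genders, 12 districts) are assumptions read off the fractions 1/2 and 1/12 on the slides.

```python
from fractions import Fraction


def equality_selectivity(num_distinct):
    """Selectivity of `col = value` under a uniform distribution: 1 / NDV."""
    return Fraction(1, num_distinct)


def cardinality(num_rows, *selectivities):
    """Estimated rows = table rows x the product of all predicate selectivities."""
    estimate = Fraction(num_rows)
    for selectivity in selectivities:
        estimate *= selectivity
    return estimate


NUM_ROWS = 24                           # rows in tributes (from the slides)
gender = equality_selectivity(2)        # assumed NDV for GENDER
district = equality_selectivity(12)     # assumed NDV for DISTRICT

no_predicate = cardinality(NUM_ROWS)                       # slide 9
one_predicate = cardinality(NUM_ROWS, gender)              # slide 11
two_predicates = cardinality(NUM_ROWS, district, gender)   # slide 13

print(no_predicate, one_predicate, two_predicates)
```

Multiplying the selectivities treats the two predicates as independent, which is the same assumption the later ds_test slides put under stress.
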
15. select * from tributes where age > 15;
    Cardinality: 24 x (High Value - Predicate Value) / (High Value - Low Value) = 24 x (17 - 15) / (17 - 12) = 9.6
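
The range-predicate formula on slide 15 can be written out as a small function. This is a sketch of the slide's formula only; the function name is hypothetical, and clamping the result to the range 0 to 1 is an added assumption for predicate values outside the column's low and high values.

```python
from fractions import Fraction


def greater_than_selectivity(predicate_value, low_value, high_value):
    """Selectivity of `col > predicate_value`:
    (high value - predicate value) / (high value - low value)."""
    if high_value == low_value:
        raise ValueError("column has a single value; the formula is undefined")
    raw = Fraction(high_value - predicate_value, high_value - low_value)
    # Assumption: keep the selectivity between 0 and 1 for out-of-range values.
    return min(Fraction(1), max(Fraction(0), raw))


NUM_ROWS = 24                # rows in tributes
LOW_AGE, HIGH_AGE = 12, 17   # low and high values of AGE from the slide

estimate = NUM_ROWS * greater_than_selectivity(15, LOW_AGE, HIGH_AGE)
print(float(estimate))
```
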
17. select * from tributes where weapon = :b1;
    Cardinality: 24 x 1/11 = 2.18
18. :b1 = 'Bow and Arrow'
19. No tricks, just math.
20. But…
21. Statistics that don't reasonably describe your data
22. …lead to poor cardinality estimates
23. …which leads to poor access path selection
24. …which leads to poor join method selection
25. …which leads to poor join order selection
26. …which leads to poor SQL execution times.
27. Statistics matter!
28. Goal: Collect statistics that are "good enough" to meet most needs most of the time.
29. "Say you were standing with one foot in the oven and one foot in an ice bucket. According to the percentage people, you would be perfectly comfortable." - Bobby Bragan
30. Data Skew
31. The optimizer assumes uniform distribution of column values.
32. Color column - uniform distribution
33. Color column - skewed distribution
34. Data skew must be identified with a histogram.
35. Table: obj_tab (100% Statistics, FOR ALL COLUMNS SIZE 1)

        Statistic       Current value
        --------------- --------------
        # rows          1601874
        Blocks          22321
        Avg Row Len     94
        Sample Size     1601874
        Monitoring      YES

        Column: object_type

        OBJECT_TYPE                     PCT_TOTAL
        ------------------------------- ---------
        WINDOW GROUP - PROGRAM          .00-.02
        EVALUATION CONTEXT - XML SCHEMA .03-.05
        OPERATOR - PROCEDURE            .11-.17
        LIBRARY - TYPE BODY             .30-.35
        FUNCTION - INDEX PARTITION      .54-.64
        JAVA RESOURCE - PACKAGE         1.54-1.69
        TABLE - VIEW                    3.44-7.35
        JAVA CLASS                      32.80
        SYNONYM                         40.01
36.
        PLAN_TABLE_OUTPUT
        ------------------------------------------------------------------------------
        SQL_ID 16yy3p8sstr28, child number 0
        -------------------------------------
        select /*+ gather_plan_statistics */ owner, object_name, object_type,
        object_id, status from obj_tab where object_type = 'PROCEDURE'

        Plan hash value: 2862749165

        ------------------------------------------------------------------------------
        | Id | Operation                   | Name         | E-Rows | A-Rows | Buffers |
        ------------------------------------------------------------------------------
        |  1 | TABLE ACCESS BY INDEX ROWID | OBJ_TAB      |  44497 |   2720 |    1237 |
        |* 2 |  INDEX RANGE SCAN           | OBJ_TYPE_IDX |  44497 |   2720 |     193 |
        ------------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - access("OBJECT_TYPE"='PROCEDURE')

    R = .06 seconds
37.
        PLAN_TABLE_OUTPUT
        ------------------------------------------------------------------------------
        SQL_ID 9u6ppkh5mhr8v, child number 0
        -------------------------------------
        select /*+ gather_plan_statistics */ owner, object_name, object_type,
        object_id, status from obj_tab where object_type = 'SYNONYM'

        Plan hash value: 2862749165

        ------------------------------------------------------------------------------
        | Id | Operation                   | Name         | E-Rows | A-Rows | Buffers |
        ------------------------------------------------------------------------------
        |  1 | TABLE ACCESS BY INDEX ROWID | OBJ_TAB      |  44497 |   640K |    104K |
        |* 2 |  INDEX RANGE SCAN           | OBJ_TYPE_IDX |  44497 |   640K |   44082 |
        ------------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - access("OBJECT_TYPE"='SYNONYM')

    R = 14.25 seconds
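
Both plans above show the same E-Rows value because, without a histogram, every object_type is assumed to be equally common. A rough Python sketch of that estimate follows. The number of distinct values is not shown on the slides; 36 is an assumption, chosen because it reproduces the E-Rows of 44497. Rounding up is also an assumption, and the SYNONYM row count is approximated from the 40.01 percent figure on slide 35.

```python
import math

NUM_ROWS = 1_601_874   # "# rows" for obj_tab on slide 35
ASSUMED_NDV = 36       # assumption: distinct object_type values (not on the slides)


def uniform_estimate(num_rows, num_distinct):
    """Rows expected for `col = value` when every value is assumed equally common."""
    return math.ceil(num_rows / num_distinct)   # rounding up is an assumption


estimate = uniform_estimate(NUM_ROWS, ASSUMED_NDV)

# Actual row counts: PROCEDURE from the A-Rows column, SYNONYM approximated
# from its 40.01% share of the table.
actual_rows = {"PROCEDURE": 2720, "SYNONYM": round(NUM_ROWS * 0.4001)}

for object_type, actual in actual_rows.items():
    print(f"{object_type}: estimated {estimate}, actual {actual}, "
          f"actual/estimate = {actual / estimate:.2f}")
```

The same estimate is far too high for PROCEDURE and far too low for SYNONYM, which is the skew problem a histogram is meant to expose.
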
38. 100% Statistics, FOR ALL COLUMNS SIZE AUTO

        PLAN_TABLE_OUTPUT
        ------------------------------------------------------------------------------
        SQL_ID 16yy3p8sstr28, child number 0
        -------------------------------------
        select /*+ gather_plan_statistics */ owner, object_name, object_type,
        object_id, status from obj_tab where object_type = 'PROCEDURE'

        Plan hash value: 2862749165

        ------------------------------------------------------------------------------
        | Id | Operation                   | Name         | E-Rows | A-Rows | Buffers |
        ------------------------------------------------------------------------------
        |  1 | TABLE ACCESS BY INDEX ROWID | OBJ_TAB      |   2720 |   2720 |    1237 |
        |* 2 |  INDEX RANGE SCAN           | OBJ_TYPE_IDX |   2720 |   2720 |     193 |
        ------------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - access("OBJECT_TYPE"='PROCEDURE')

    R = .07 seconds
39.
        PLAN_TABLE_OUTPUT
        ------------------------------------------------------------------------------
        SQL_ID 9u6ppkh5mhr8v, child number 0
        -------------------------------------
        select /*+ gather_plan_statistics */ owner, object_name, object_type,
        object_id, status from obj_tab where object_type = 'SYNONYM'

        Plan hash value: 2748991475

        ----------------------------------------------------------------
        | Id | Operation         | Name    | E-Rows | A-Rows | Buffers |
        ----------------------------------------------------------------
        |* 1 | TABLE ACCESS FULL | OBJ_TAB |   640K |   640K |   64263 |
        ----------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           1 - filter("OBJECT_TYPE"='SYNONYM')

    R = 3.36 seconds
40. Dynamic Sampling
41. optimizer_dynamic_sampling parameter; dynamic_sampling hint
42.
        SQL> create table ds_test as
             select mod(num, 100) c1,
                    mod(num, 100) c2,
                    mod(num, 75) c3,
                    mod(num, 30) c4
             from (select level num from dual
                   connect by level <= 10001);

        Table created.

        SQL> exec dbms_stats.gather_table_stats(user, 'ds_test', estimate_percent => null, method_opt => 'for all columns size 1');

        PL/SQL procedure successfully completed.
43.
        Statistic       Current value
        --------------- --------------
        # rows          10001
        Blocks          28
        Avg Row Len     11
        Sample Size     10001
        Monitoring      YES

        Column  NDV  Density  AvgLen  Histogram  LowVal  HighVal
        ------- ---  -------  ------  ---------  ------  -------
        C1      100  .010000  3       NONE (1)   0       99
        C2      100  .010000  3       NONE (1)   0       99
        C3      75   .013333  3       NONE (1)   0       74
        C4      30   .033333  3       NONE (1)   0       29
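
With no histogram (NONE), the Density values in the listing above match 1 / NDV for each column. A short Python check of that relationship, using only the numbers from slide 43 (the function name is hypothetical):

```python
NUM_ROWS = 10001                                       # "# rows" for ds_test
NDV = {"C1": 100, "C2": 100, "C3": 75, "C4": 30}       # NDV column from slide 43


def density(num_distinct):
    """Density of a column without a histogram: 1 / NDV."""
    return 1 / num_distinct


for column, num_distinct in NDV.items():
    d = density(num_distinct)
    # Expected rows for an equality predicate on this column alone.
    print(f"{column}: density={d:.6f}, rows per value ~ {NUM_ROWS * d:.0f}")
```

For C1 this gives roughly 100 rows per value, which is the Rows figure the optimizer reports for `c1 = 10` on the next slide.
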
44.
        SQL> set autotrace traceonly explain
        SQL> select count(*) from ds_test where c1 = 10;

        Execution Plan
        ----------------------------------------------------------
        Plan hash value: 3984367388

        ----------------------------------------------------------------------------
        | Id | Operation          | Name    | Rows | Bytes | Cost (%CPU)| Time     |
        ----------------------------------------------------------------------------
        |  0 | SELECT STATEMENT   |         |    1 |     3 |     8   (0)| 00:00:01 |
        |  1 |  SORT AGGREGATE    |         |    1 |     3 |            |          |
        |* 2 |   TABLE ACCESS FULL| DS_TEST |  100 |   300 |     8   (0)| 00:00:01 |
        ----------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - filter("C1"=10)
45.
        SQL> set autotrace traceonly explain
        SQL> select count(*) from ds_test where c1 = 10 and c2 = 10;

        Execution Plan
        ----------------------------------------------------------
        Plan hash value: 3984367388

        ----------------------------------------------------------------------------
        | Id | Operation          | Name    | Rows | Bytes | Cost (%CPU)| Time     |
        ----------------------------------------------------------------------------
        |  0 | SELECT STATEMENT   |         |    1 |     6 |     8   (0)| 00:00:01 |
        |  1 |  SORT AGGREGATE    |         |    1 |     6 |            |          |
        |* 2 |   TABLE ACCESS FULL| DS_TEST |    1 |     6 |     8   (0)| 00:00:01 |
        ----------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - filter("C1"=10 AND "C2"=10)
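
The estimate of 1 row for `c1 = 10 and c2 = 10` comes from multiplying the two selectivities as if the columns were independent, yet c1 and c2 are built from the same expression. The Python sketch below rebuilds the ds_test data in memory (a stand-in for the table, not a database query) and compares that estimate with the real count.

```python
# Rebuild ds_test as tuples (c1, c2, c3, c4), mirroring the CREATE TABLE on slide 42.
rows = [(n % 100, n % 100, n % 75, n % 30) for n in range(1, 10002)]

num_rows = len(rows)
ndv_c1 = len({r[0] for r in rows})   # distinct values in c1
ndv_c2 = len({r[1] for r in rows})   # distinct values in c2

# Independence assumption: multiply the single-column selectivities.
independent_estimate = num_rows * (1 / ndv_c1) * (1 / ndv_c2)

# What the predicate actually returns.
actual = sum(1 for c1, c2, _, _ in rows if c1 == 10 and c2 == 10)

print(f"estimate ~ {independent_estimate:.4f}, actual = {actual}")
```

Because c2 always equals c1, the second predicate filters nothing further, so the estimate is low by about a factor of 100.
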
46.
        SQL> set autotrace traceonly explain
        SQL> select /*+ dynamic_sampling (4) */ count(*)
             from ds_test where c1 = 10 and c2 = 10;

        Execution Plan
        ----------------------------------------------------------
        Plan hash value: 3984367388

        ----------------------------------------------------------------------------
        | Id | Operation          | Name    | Rows | Bytes | Cost (%CPU)| Time     |
        ----------------------------------------------------------------------------
        |  0 | SELECT STATEMENT   |         |    1 |     6 |     8   (0)| 00:00:01 |
        |  1 |  SORT AGGREGATE    |         |    1 |     6 |            |          |
        |* 2 |   TABLE ACCESS FULL| DS_TEST |  100 |   600 |     8   (0)| 00:00:01 |
        ----------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - filter("C1"=10 AND "C2"=10)

        Note
        -----
           - dynamic sampling used for this statement
47. Extended Statistics
48.
        SQL> select dbms_stats.create_extended_stats(ownname => user,
                      tabname => 'DS_TEST',
                      extension => '(c1, c2)') AS c1_c2_correlation
             from dual;

        C1_C2_CORRELATION
        -------------------------------------------------------------
        SYS_STUF3GLKIOP5F4B0BTTCFTMX0W

        SQL> exec dbms_stats.gather_table_stats(user, 'ds_test');

        PL/SQL procedure successfully completed.
49.
        SQL> set autotrace traceonly explain
        SQL> select count(*) from ds_test where c1 = 10 and c2 = 10;

        Execution Plan
        ----------------------------------------------------------
        Plan hash value: 3984367388

        ----------------------------------------------------------------------------
        | Id | Operation          | Name    | Rows | Bytes | Cost (%CPU)| Time     |
        ----------------------------------------------------------------------------
        |  0 | SELECT STATEMENT   |         |    1 |     6 |     9   (0)| 00:00:01 |
        |  1 |  SORT AGGREGATE    |         |    1 |     6 |            |          |
        |* 2 |   TABLE ACCESS FULL| DS_TEST |  100 |   600 |     9   (0)| 00:00:01 |
        ----------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - filter("C1"=10 AND "C2"=10)
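
One way to see why the column group brings the estimate back to 100 is to count distinct (c1, c2) combinations rather than distinct values per column. The sketch below models the column group as a set of pairs in Python; how the database stores extended statistics internally is not shown on the slides, so treat this purely as an illustration of the arithmetic.

```python
# (c1, c2) pairs as generated by the CREATE TABLE on slide 42.
pairs = [(n % 100, n % 100) for n in range(1, 10002)]

num_rows = len(pairs)
ndv_column_group = len(set(pairs))   # distinct (c1, c2) combinations

# Treat the pair as a single "column": selectivity = 1 / NDV of the group.
group_estimate = num_rows / ndv_column_group

print(f"distinct pairs = {ndv_column_group}, estimate ~ {group_estimate:.0f}")
```

There are only 100 distinct pairs, not 100 x 100, so the combined predicate is expected to return about 100 rows.
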
50. Understanding how the optimizer uses statistics helps you know when "special" statistics can help.
51. Query Transformation
52. SQL and the Optimizer
    [Diagram: Parsed SQL; logical optimizations (iterative query transformations); physical optimizations (access and join methods, join order); Cost Estimator with dictionary stats; Execution Plan]
53. Query Transformations
    • Goal is to enhance query performance by making more choices available to the optimizer
    • Creates a semantically equivalent statement that will produce the same results
    • Remove extraneous conditions
    • Add inferred conditions
    Think algebra…
54. Types of Transformations
    • Automatic
      - Always produce a faster plan
    • Heuristic-based (pre-10gR1)
      - Should produce a faster plan most of the time
    • Cost-based
      - Does not always produce a faster plan
55. So, if the optimizer will transform my SQL for me, why should I write it any differently?
56. Why You Should Refactor
    • You know your stuff best (or you should)
    • Always filter early
    • Defines your expectations
    • K.I.S.S.
    • The optimizer might not be able to
57. Common Transformations
    • FPD - filter push-down
    • PM - predicate move-around
    • SU - subquery unnesting
    • CVM - complex view merging
    • SPJ - select-project-join (simple view merging)
    • JF - join factorization
    Numerous additional transformations exist; see the 10053 trace.
58. SPJ - Simple View Merging
    [Slide shows one query being transformed into another; the SQL is not in the transcript.]
    Merged automatically as it is deemed "always better" for the optimizer to work with direct joins.
59. Complex View Merging
    [Slide shows one query being transformed into another; the SQL is not in the transcript.]
    "Complex" due to GROUP BY. CVM can also be done when using DISTINCT or outer join.
60. Filter Push-Down
    [Slide shows one query being transformed into another; the SQL is not in the transcript.]
    Purpose: To push outer query predicates into view to perform earlier filtering.
61. Predicate Move-Around
    [Slide shows one query being transformed into another; the SQL is not in the transcript.]
    Purpose: To move inexpensive predicates into view query blocks to perform earlier filtering. Can generate filter predicates based on transitivity or functional dependencies.
62. Join Factorization
    [Slide shows one query being transformed into another; the SQL is not in the transcript.]
    Combines branches of UNION / UNION ALL that join a common table in order to reduce # of accesses to that table.
63. Understanding how the optimizer transforms queries helps you write better SQL and understand execution plans.
64. Cardinality Feedback
65. What is Cardinality Feedback?
    • Automatically improve plans for repeatedly executed queries where the optimizer does not estimate cardinalities properly
    • Misestimates may be due to
      - Missing or inaccurate statistics
      - Complex predicates
      …and more
    • Estimates and actuals are compared
    • New plan generated using adjustment factors
    • Estimates lost if plan ages out of cache
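
The compare-and-adjust idea in the bullets above can be pictured with a toy Python model. Everything here is hypothetical: the function names, the operation label, and the 2x threshold are illustrative assumptions, not the optimizer's actual rules. The numbers reuse the ds_test case, where 1 row was estimated and 100 were returned.

```python
def adjustment_factors(estimated, actual, threshold=2.0):
    """Compare estimated and actual rows per operation and return a correction
    factor wherever they differ by more than `threshold` times (hypothetical rule)."""
    factors = {}
    for operation, est_rows in estimated.items():
        ratio = actual[operation] / max(est_rows, 1)
        if ratio >= threshold or ratio <= 1 / threshold:
            factors[operation] = ratio
    return factors


def apply_feedback(estimated, factors):
    """Scale each estimate by its factor; operations without a factor are unchanged."""
    return {op: est * factors.get(op, 1.0) for op, est in estimated.items()}


# Hypothetical first execution of the two-predicate ds_test query.
estimated = {"TABLE ACCESS FULL ds_test": 1}
actual = {"TABLE ACCESS FULL ds_test": 100}

factors = adjustment_factors(estimated, actual)
corrected = apply_feedback(estimated, factors)
print(factors, corrected)
```

In this model the factors live in an ordinary dictionary; the last bullet's point is that the real adjustments are similarly tied to the cached plan and disappear when it ages out.
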
67. Original plan
68. New plan
69. Estimates Differences

        Plan           Operation              Object                   Rows estimate
        -------------  ---------------------  -----------------------  -------------
        Original Plan                                                  23835
                       TABLE ACCESS FULL      VS_PRODUCT               67
                       INDEX FAST FULL SCAN   IDX_VS_ORDER_027         514K
                       INDEX FAST FULL SCAN   IDX_VS_ORDER_TRACK_017   2665K
                       INDEX FAST FULL SCAN   PK_VS_ORDER              8673K
        New Plan                                                       8371 ✔
                       INDEX FULL SCAN        IDX_VS_PRODUCT_002       84
                       INDEX SKIP SCAN        IDX_VS_ORDER_027         941K
                       INDEX SKIP SCAN        IDX_VS_ORDER_TRACK_017   2899K
                       INDEX FAST FULL SCAN   PK_VS_ORDER              10M
70. Cardinality feedback is an automated way the optimizer tries to "self-correct" and produce plans that perform optimally.