Managing Statistics for Optimal Query Performance
 

Half the battle of writing good SQL is in understanding how the Oracle query optimizer analyzes your code and applies statistics in order to derive the “best” execution plan. The other half of the battle is successfully applying that knowledge to the databases that you manage. The optimizer uses statistics as input to develop query execution plans, and so these statistics are the foundation of good plans. If the statistics supplied aren’t representative of your actual data, you can expect bad plans. However, if the statistics are representative of your data, then the optimizer will probably choose an optimal plan.

    Presentation Transcript

    • Managing Statistics for Optimal Query Performance. Karen Morton, [email_address]. OOW 2009, 2009 October 13, 1:00pm-2:00pm, Moscone South Room 305
    • Your speaker…
      • Karen Morton
        • Sr. Principal Database Engineer
        • Educator, DBA, developer, consultant, researcher, author, speaker, …
      • Come see me…
        • karenmorton.blogspot.com
        • An Oracle user group near you
    • “I accept no responsibility for statistics, which are a form of magic beyond my comprehension.” – Robertson Davies
    • Math or Magic ?
    • Pick any black card. Move UP or DOWN to the nearest red card. Move LEFT or RIGHT to the nearest black card. Move DIAGONALLY to the nearest red card. Move UP or DOWN to the nearest black card.
    • Your card is…
    • Math or Magic ?
    • SQL>desc deck
      Name     Null?     Type
      -------  --------  ------------
      SUIT     NOT NULL  VARCHAR2(10)
      CARD               VARCHAR2(10)
      COLOR              VARCHAR2(5)
      FACEVAL  NOT NULL  NUMBER(2)
    • Table: DECK
      Statistic        Current value
      ---------------  -------------------
      # rows           52
      Blocks           5
      Avg Row Len      20
      Degree           1
      Sample Size      52
      Column Name  NDV  Nulls  # Nulls  Density  Length  Low Value  High Value
      -----------  ---  -----  -------  -------  ------  ---------  ----------
      SUIT           4  N            0  .250000       8  Clubs      Spades
      CARD          13  Y            0  .076923       5  Ace        Two
      COLOR          2  Y            0  .500000       5  Black      Red
      FACEVAL       13  N            0  .076923       3  1          13
      Index Name      Col#  Column Name  Unique?  Height  Leaf Blks  Distinct Keys
      --------------  ----  -----------  -------  ------  ---------  -------------
      DECK_PK            1  SUIT         Y             1          1             52
                         2  FACEVAL
      DECK_CARD_IDX      1  CARD         N             1          1             13
      DECK_COLOR_IDX     1  COLOR        N             1          1              2
    • Cardinality: the estimated number of rows a query is expected to return. Cardinality = number of rows in table x predicate selectivity
    • select * from deck order by suit, faceval ; Cardinality 52 x 1 = 52
    • SQL>select * from deck order by suit, faceval ;
      52 rows selected.
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 3142028678
      ---------------------------------------------------------------------
      | Id | Operation                    | Name    | Rows | Bytes | Cost |
      ---------------------------------------------------------------------
      |  0 | SELECT STATEMENT             |         |   52 |  1040 |    2 |
      |  1 |  TABLE ACCESS BY INDEX ROWID | DECK    |   52 |  1040 |    2 |
      |  2 |   INDEX FULL SCAN            | DECK_PK |   52 |       |    1 |
      ---------------------------------------------------------------------
    • select * from deck where color = 'Black' ; Cardinality 52 x 1/2 = 26
    • SQL>select * from deck where color = 'Black' ;
      26 rows selected.
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 1366616955
      ----------------------------------------------------------------------------
      | Id | Operation                    | Name           | Rows | Bytes | Cost |
      ----------------------------------------------------------------------------
      |  0 | SELECT STATEMENT             |                |   26 |   520 |    2 |
      |  1 |  TABLE ACCESS BY INDEX ROWID | DECK           |   26 |   520 |    2 |
      |* 2 |   INDEX RANGE SCAN           | DECK_COLOR_IDX |   26 |       |    1 |
      ----------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - access("COLOR"='Black')
    • select * from deck where card = 'Ace' and suit = 'Spades' ; Cardinality 52 x 1/13 x 1/4 = 1
    • SQL>select *
        2  from deck
        3  where card = 'Ace'
        4  and suit = 'Spades' ;
      1 row selected.
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 2030372774
      ---------------------------------------------------------------------------
      | Id | Operation                    | Name          | Rows | Bytes | Cost |
      ---------------------------------------------------------------------------
      |  0 | SELECT STATEMENT             |               |    1 |    20 |    2 |
      |* 1 |  TABLE ACCESS BY INDEX ROWID | DECK          |    1 |    20 |    2 |
      |* 2 |   INDEX RANGE SCAN           | DECK_CARD_IDX |    4 |       |    1 |
      ---------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         1 - filter("SUIT"='Spades')
         2 - access("CARD"='Ace')
    • select * from deck where faceval > 10 ; Cardinality = 52 x (High Value - Predicate Value) / (High Value - Low Value) = 52 x (13 - 10) / (13 - 1) = 13
    • SQL>select *
        2  from deck
        3  where faceval > 10 ;
      12 rows selected.
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 1303963799
      --------------------------------------------------------
      | Id | Operation          | Name | Rows | Bytes | Cost |
      --------------------------------------------------------
      |  0 | SELECT STATEMENT   |      |   13 |   260 |    3 |
      |* 1 |  TABLE ACCESS FULL | DECK |   13 |   260 |    3 |
      --------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         1 - filter("FACEVAL">10)
    • select * from deck where card = 'Ace' ; Cardinality 52 x 1/13 = 4
    • SQL>select * from deck where card = :b1 ;
      4 rows selected.
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 2030372774
      ---------------------------------------------------------------------------
      | Id | Operation                    | Name          | Rows | Bytes | Cost |
      ---------------------------------------------------------------------------
      |  0 | SELECT STATEMENT             |               |    4 |    80 |    2 |
      |  1 |  TABLE ACCESS BY INDEX ROWID | DECK          |    4 |    80 |    2 |
      |* 2 |   INDEX RANGE SCAN           | DECK_CARD_IDX |    4 |       |    1 |
      ---------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - access("CARD"=:B1)
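The DECK estimates on these slides can be reproduced with a few lines of arithmetic. The sketch below is a Python illustration only, assuming the simple model stated on the slides (rows x selectivity, 1/NDV for equality, a linear slice of the low-to-high range for `>`); the function names are made up for this example and are not Oracle APIs.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with .5 going up (avoids banker's rounding)."""
    return math.floor(x + 0.5)


def cardinality(num_rows: int, *selectivities: float) -> int:
    """rows x product of predicate selectivities (predicates treated as independent)."""
    combined = 1.0
    for s in selectivities:
        combined *= s
    return max(1, round_half_up(num_rows * combined))


def equality_selectivity(ndv: int) -> float:
    """Equality predicate with no histogram: 1 / number of distinct values."""
    return 1.0 / ndv


def greater_than_selectivity(value: float, low: float, high: float) -> float:
    """col > value: (High Value - Predicate Value) / (High Value - Low Value)."""
    return (high - value) / (high - low)


# Statistics copied from the DECK slide.
NUM_ROWS = 52
NDV = {"SUIT": 4, "CARD": 13, "COLOR": 2, "FACEVAL": 13}

all_cards = cardinality(NUM_ROWS)                                   # no predicate
black = cardinality(NUM_ROWS, equality_selectivity(NDV["COLOR"]))
ace_of_spades = cardinality(
    NUM_ROWS, equality_selectivity(NDV["CARD"]), equality_selectivity(NDV["SUIT"])
)
face_gt_10 = cardinality(NUM_ROWS, greater_than_selectivity(10, low=1, high=13))
aces = cardinality(NUM_ROWS, equality_selectivity(NDV["CARD"]))

print(all_cards, black, ace_of_spades, face_gt_10, aces)
```

Note that the `faceval > 10` estimate comes out as 13 while the query returns 12 rows: the range formula treats the values as continuous, so a small mismatch like this is expected.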
    • Math or Magic ? Maybe it's a little bit of both!
    • What's the best method for collecting statistics?
    • It depends.
    • Statistics that don't reasonably describe your data
    • … lead to poor cardinality estimates
    • … which leads to poor access path selection
    • … which leads to poor join method selection
    • … which leads to poor join order selection
    • … which leads to poor SQL execution times.
    • Statistics matter!
    • Automatic vs Manual
    • Automatic Collections
    • Objects must change by at least 10%
    • Collection scheduled during nightly maintenance window
    • dbms_stats.gather_database_stats_job_proc
    • Prioritizes collection so that the objects most in need of updating are processed first
    • Most functional when data changes at a slow to moderate rate
    • Volatile tables and large bulk loads are good candidates for manual collection
    • Automatic doesn't mean accurate (for your data)
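The 10% rule can be pictured as a simple ratio check. The following Python sketch is an illustration under stated assumptions: it treats "change" as the sum of inserts, updates and deletes since the last collection, and the function name and the sample numbers are hypothetical, not taken from Oracle.

```python
def is_stale(num_rows: int, inserts: int, updates: int, deletes: int,
             stale_percent: float = 10.0) -> bool:
    """Return True when tracked changes reach stale_percent of the row count.

    Assumption for this sketch: change volume = inserts + updates + deletes
    since statistics were last gathered.
    """
    changes = inserts + updates + deletes
    if num_rows == 0:
        return changes > 0
    return 100.0 * changes / num_rows >= stale_percent


# Hypothetical table: 1,000,000 rows, then a bulk load of 50,000 new rows (5%).
after_bulk_load = is_stale(1_000_000, inserts=50_000, updates=0, deletes=0)

# Same table after heavier churn: 100,000 changes (10%).
after_heavy_churn = is_stale(1_000_000, inserts=60_000, updates=30_000, deletes=10_000)

print(after_bulk_load, after_heavy_churn)
```

In the first case the table is not flagged, even though the newly loaded rows may all sit beyond the recorded high value of a date or sequence column. That is the kind of situation where a manual collection after the load may serve better than waiting for the threshold.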
    • Automatic Collection Defaults
    • SQL>exec dbms_stats.gather_table_stats (ownname=>?, tabname=>?) ;
      partname          NULL
      cascade           DBMS_STATS.AUTO_CASCADE
      estimate_percent  DBMS_STATS.AUTO_SAMPLE_SIZE
      stattab           NULL
      block_sample      FALSE
      statid            NULL
      method_opt        FOR ALL COLUMNS SIZE AUTO
      statown           NULL
      degree            1 or value based on number of CPUs and initialization parameters
      force             FALSE
      granularity       AUTO (value is based on partitioning type)
      no_invalidate     DBMS_STATS.AUTO_INVALIDATE
    • cascade => AUTO_CASCADE: Allow Oracle to determine whether or not to gather index statistics
    • estimate_percent => AUTO_SAMPLE_SIZE: Allow Oracle to determine sample size
    • method_opt => FOR ALL COLUMNS SIZE AUTO: Allow Oracle to determine when to gather histogram statistics (SYS.COL_USAGE$)
    • no_invalidate => AUTO_INVALIDATE: Allow Oracle to determine when to invalidate dependent cursors
    • Goal Collect statistics that are "good enough" to meet most needs most of the time.
    • Say you were standing with one foot in the oven and one foot in an ice bucket. According to the percentage people, you would be perfectly comfortable. – Bobby Bragan
    • Manual Collections
    • dbms_stats.gather_*_stats (* = database, schema, table, index, etc.)
    • Is it common for your users to get slammed with performance problems shortly after statistics are updated?
    • Does performance decline before a 10% data change occurs?
    • Do low and high values for a column change significantly between automatic collections?
    • Does your application performance seem "sensitive" to changing user counts as well as data volume changes?
    • If you answered "Yes" to one or more of these questions...
    • your application's unique needs may be best served with manual collection.
    • Test. Test. Test.
    • Dynamic Sampling
    • optimizer_dynamic_sampling parameter; dynamic_sampling hint
    • SQL>create table depend_test as
        2  select mod(num, 100) c1,
        3         mod(num, 100) c2,
        4         mod(num, 75) c3,
        5         mod(num, 30) c4
        6  from (select level num from dual
        7        connect by level <= 10001);
      Table created.
      SQL>exec dbms_stats.gather_table_stats( user, 'depend_test', estimate_percent => null, method_opt => 'for all columns size 1');
      PL/SQL procedure successfully completed.
    • Statistic        Current value
      ---------------  --------------
      # rows           10001
      Blocks           28
      Avg Row Len      11
      Sample Size      10001
      Monitoring       YES
      Column  NDV  Density  AvgLen  Histogram  LowVal  HighVal
      ------  ---  -------  ------  ---------  ------  -------
      C1      100  .010000       3  NONE (1)   0       99
      C2      100  .010000       3  NONE (1)   0       99
      C3       75  .013333       3  NONE (1)   0       74
      C4       30  .033333       3  NONE (1)   0       29
    • SQL>set autotrace traceonly explain
      SQL>select count(*) from depend_test where c1 = 10;
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 3984367388
      ---------------------------------------------------------------------------------
      | Id | Operation           | Name        | Rows | Bytes | Cost (%CPU)| Time     |
      ---------------------------------------------------------------------------------
      |  0 | SELECT STATEMENT    |             |    1 |     3 |     8   (0)| 00:00:01 |
      |  1 |  SORT AGGREGATE     |             |    1 |     3 |            |          |
      |* 2 |   TABLE ACCESS FULL | DEPEND_TEST |  100 |   300 |     8   (0)| 00:00:01 |
      ---------------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - filter("C1"=10)
    • SQL>set autotrace traceonly explain
      SQL>select count(*) from depend_test where c1 = 10 and c2 = 10;
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 3984367388
      ---------------------------------------------------------------------------------
      | Id | Operation           | Name        | Rows | Bytes | Cost (%CPU)| Time     |
      ---------------------------------------------------------------------------------
      |  0 | SELECT STATEMENT    |             |    1 |     6 |     8   (0)| 00:00:01 |
      |  1 |  SORT AGGREGATE     |             |    1 |     6 |            |          |
      |* 2 |   TABLE ACCESS FULL | DEPEND_TEST |    1 |     6 |     8   (0)| 00:00:01 |
      ---------------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - filter("C1"=10 AND "C2"=10)
    • SQL>set autotrace traceonly explain
      SQL>select /*+ dynamic_sampling (4) */ count(*)
        2  from depend_test where c1 = 10 and c2 = 10;
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 3984367388
      ---------------------------------------------------------------------------------
      | Id | Operation           | Name        | Rows | Bytes | Cost (%CPU)| Time     |
      ---------------------------------------------------------------------------------
      |  0 | SELECT STATEMENT    |             |    1 |     6 |     8   (0)| 00:00:01 |
      |  1 |  SORT AGGREGATE     |             |    1 |     6 |            |          |
      |* 2 |   TABLE ACCESS FULL | DEPEND_TEST |  100 |   600 |     8   (0)| 00:00:01 |
      ---------------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - filter("C1"=10 AND "C2"=10)
      Note
      -----
         - dynamic sampling used for this statement
    • 11g Extended Statistics
    • SQL> select dbms_stats.create_extended_stats(ownname=>user,
        2         tabname => 'DEPEND_TEST',
        3         extension => '(c1, c2)' ) AS c1_c2_correlation
        4  from dual ;
      C1_C2_CORRELATION
      -------------------------------------------------------------
      SYS_STUF3GLKIOP5F4B0BTTCFTMX0W
      SQL> exec dbms_stats.gather_table_stats( user, 'depend_test');
      PL/SQL procedure successfully completed.
    • SQL> set autotrace traceonly explain
      SQL> select count(*) from depend_test where c1 = 10 and c2 = 10;
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 3984367388
      ---------------------------------------------------------------------------------
      | Id | Operation           | Name        | Rows | Bytes | Cost (%CPU)| Time     |
      ---------------------------------------------------------------------------------
      |  0 | SELECT STATEMENT    |             |    1 |     6 |     9   (0)| 00:00:01 |
      |  1 |  SORT AGGREGATE     |             |    1 |     6 |            |          |
      |* 2 |   TABLE ACCESS FULL | DEPEND_TEST |  100 |   600 |     9   (0)| 00:00:01 |
      ---------------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - filter("C1"=10 AND "C2"=10)
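Why the estimate moves from 1 to 100 can be checked outside the database. This Python sketch rebuilds the same DEPEND_TEST data (numbers 1 to 10001 with the same MOD expressions) and compares three figures. It assumes the simple 1/NDV model for equality predicates and that a column group behaves like a single column with its own NDV; it is an illustration, not the optimizer's actual code path.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with .5 going up."""
    return math.floor(x + 0.5)


# Same data as DEPEND_TEST: level 1..10001, c1 = c2 = mod(num, 100).
rows = [(n % 100, n % 100, n % 75, n % 30) for n in range(1, 10002)]
num_rows = len(rows)

ndv_c1 = len({r[0] for r in rows})
ndv_c2 = len({r[1] for r in rows})

# Independence assumption: selectivities of c1 and c2 are simply multiplied.
independent_estimate = max(1, round_half_up(num_rows * (1 / ndv_c1) * (1 / ndv_c2)))

# Column-group view: treat (c1, c2) as one virtual column with its own NDV.
ndv_c1_c2 = len({(r[0], r[1]) for r in rows})
column_group_estimate = max(1, round_half_up(num_rows / ndv_c1_c2))

# What the query really returns.
actual = sum(1 for r in rows if r[0] == 10 and r[1] == 10)

print(independent_estimate, column_group_estimate, actual)
```

Because c1 and c2 always hold the same value, the pair has only 100 distinct combinations rather than 100 x 100, which is exactly the information the single-column statistics cannot express.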
    • Setting Statistics
    • dbms_stats: set_column_stats, set_index_stats, set_table_stats
    • It's OK. Really.
    • Why guess (i.e. gather stats) when you know?
    • Common Performance Problems
    • 3 areas where non-representative statistics cause problems
    • 1 Data Skew
    • The optimizer assumes uniform distribution of column values.
    • Color column - uniform distribution
    • Color column – skewed distribution
    • Data skew must be identified with a histogram.
    • Table: obj_tab
      Statistic        Current value
      ---------------  --------------
      # rows           1601874
      Blocks           22321
      Avg Row Len      94
      Sample Size      1601874
      Monitoring       YES
      Column: object_type (has 36 distinct values)
      OBJECT_TYPE                      PCT_TOTAL
      -------------------------------  ---------
      WINDOW GROUP - PROGRAM           .00-.02
      EVALUATION CONTEXT - XML SCHEMA  .03-.05
      OPERATOR - PROCEDURE             .11-.17
      LIBRARY - TYPE BODY              .30-.35
      FUNCTION - INDEX PARTITION       .54-.64
      JAVA RESOURCE - PACKAGE          1.54-1.69
      TABLE - VIEW                     3.44-7.35
      JAVA CLASS                       32.80
      SYNONYM                          40.01
      Statistics: 100%, FOR ALL COLUMNS SIZE 1
    • PLAN_TABLE_OUTPUT
      --------------------------------------------------------------------------------
      SQL_ID 16yy3p8sstr28, child number 0
      -------------------------------------
      select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id, status from obj_tab where object_type = 'PROCEDURE'
      Plan hash value: 2862749165
      --------------------------------------------------------------------------------
      | Id | Operation                    | Name         | E-Rows | A-Rows | Buffers |
      --------------------------------------------------------------------------------
      |  1 | TABLE ACCESS BY INDEX ROWID  | OBJ_TAB      |  44497 |   2720 |    1237 |
      |* 2 |  INDEX RANGE SCAN            | OBJ_TYPE_IDX |  44497 |   2720 |     193 |
      --------------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - access("OBJECT_TYPE"='PROCEDURE')
      R = .06 seconds
      E-Rows = 1/36 x 1,601,874
    • PLAN_TABLE_OUTPUT
      --------------------------------------------------------------------------------
      SQL_ID 9u6ppkh5mhr8v, child number 0
      -------------------------------------
      select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id, status from obj_tab where object_type = 'SYNONYM'
      Plan hash value: 2862749165
      --------------------------------------------------------------------------------
      | Id | Operation                    | Name         | E-Rows | A-Rows | Buffers |
      --------------------------------------------------------------------------------
      |  1 | TABLE ACCESS BY INDEX ROWID  | OBJ_TAB      |  44497 |   640K |    104K |
      |* 2 |  INDEX RANGE SCAN            | OBJ_TYPE_IDX |  44497 |   640K |   44082 |
      --------------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - access("OBJECT_TYPE"='SYNONYM')
      R = 14.25 seconds
      E-Rows = 1/36 x 1,601,874
    • Re-collect statistics: 100%, FOR ALL COLUMNS SIZE AUTO
    • PLAN_TABLE_OUTPUT
      --------------------------------------------------------------------------------
      SQL_ID 16yy3p8sstr28, child number 0
      -------------------------------------
      select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id, status from obj_tab where object_type = 'PROCEDURE'
      Plan hash value: 2862749165
      --------------------------------------------------------------------------------
      | Id | Operation                    | Name         | E-Rows | A-Rows | Buffers |
      --------------------------------------------------------------------------------
      |  1 | TABLE ACCESS BY INDEX ROWID  | OBJ_TAB      |   2720 |   2720 |    1237 |
      |* 2 |  INDEX RANGE SCAN            | OBJ_TYPE_IDX |   2720 |   2720 |     193 |
      --------------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - access("OBJECT_TYPE"='PROCEDURE')
      R = .07 seconds
      E-Rows = histogram x 1,601,874
    • PLAN_TABLE_OUTPUT
      --------------------------------------------------------------------------------
      SQL_ID 9u6ppkh5mhr8v, child number 0
      -------------------------------------
      select /*+ gather_plan_statistics */ owner, object_name, object_type, object_id, status from obj_tab where object_type = 'SYNONYM'
      Plan hash value: 2748991475
      ----------------------------------------------------------------
      | Id | Operation         | Name    | E-Rows | A-Rows | Buffers |
      ----------------------------------------------------------------
      |* 1 | TABLE ACCESS FULL | OBJ_TAB |   640K |   640K |   64263 |
      ----------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         1 - filter("OBJECT_TYPE"='SYNONYM')
      R = 3.36 seconds (vs 14.25 seconds)
      E-Rows = histogram x 1,601,874
    • Histograms are important for more reasons than just helping determine the access method.
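The gap between the two sets of plans comes down to which number feeds the estimate. As a rough Python sketch: without a histogram every value gets rows / NDV, while a frequency histogram is modelled here as a plain value-to-count map. The PROCEDURE count (2720) is the A-Rows figure from the slides; the SYNONYM count is derived from the 40.01% shown in the distribution table, so it is approximate. The helper names are invented for this example.

```python
import math

NUM_ROWS = 1_601_874   # "# rows" for obj_tab
NDV = 36               # distinct object_type values


def uniform_estimate(num_rows: int, ndv: int) -> int:
    """No histogram: every value is assumed to occur rows / NDV times."""
    return math.floor(num_rows / ndv + 0.5)


# Simplified stand-in for a frequency histogram: value -> observed row count.
# SYNONYM is approximated from the 40.01% figure on the slide.
frequency_histogram = {
    "PROCEDURE": 2720,
    "SYNONYM": math.floor(0.4001 * NUM_ROWS + 0.5),
}


def histogram_estimate(value: str) -> int:
    """With a histogram: use the recorded count, else fall back to uniform."""
    return frequency_histogram.get(value, uniform_estimate(NUM_ROWS, NDV))


flat = uniform_estimate(NUM_ROWS, NDV)
for value in ("PROCEDURE", "SYNONYM"):
    skewed = histogram_estimate(value)
    print(f"{value}: uniform={flat}, histogram={skewed}, ratio={flat / skewed:.2f}")
```

The uniform figure overestimates PROCEDURE by roughly 16x and underestimates SYNONYM by roughly 14x, which matches the direction of the E-Rows versus A-Rows gaps in the first pair of plans.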
    • 2 Bind Peeking
    • During hard parse, the optimizer "peeks" at the bind value and uses it to determine the execution plan.
    • But what if your data is skewed? (11g)
    • SQL> variable objtype varchar2(19)
      SQL> exec :objtype := 'PROCEDURE';
      PL/SQL procedure successfully completed.
      SQL> select /*+ gather_plan_statistics */ count(*) ct
        2  from big_tab
        3  where object_type = :objtype ;
      CT
      ---------------
      4416
      1 row selected.
      SQL> select * from table (dbms_xplan.display_cursor('211078a9adzak',0,'ALLSTATS LAST'));
    • PLAN_TABLE_OUTPUT
      ---------------------------------------
      SQL_ID 211078a9adzak, child number 0
      -------------------------------------
      select /*+ gather_plan_statistics */ count(*) ct from big_tab where object_type = :objtype
      Plan hash value: 154074842
      ------------------------------------------------------------------------
      | Id | Operation         | Name            | E-Rows | A-Rows | Buffers |
      ------------------------------------------------------------------------
      |  1 | SORT AGGREGATE    |                 |      1 |      1 |      16 |
      |* 2 |  INDEX RANGE SCAN | BIG_OBJTYPE_IDX |   4416 |   4416 |      16 |
      ------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - access("OBJECT_TYPE"=:OBJTYPE)
    • SQL> select child_number, executions, buffer_gets,
        2  is_bind_sensitive, is_bind_aware, is_shareable
        3  from v$sql where sql_id = '211078a9adzak' ;
      CHILD_NUMBER = 0
      EXECUTIONS = 1
      BUFFER_GETS = 16
      IS_BIND_SENSITIVE = N
      IS_BIND_AWARE = N
      IS_SHAREABLE = Y
    • SQL> variable objtype varchar2(19)
      SQL> exec :objtype := 'SYNONYM';
      PL/SQL procedure successfully completed.
      SQL> select /*+ gather_plan_statistics */ count(*) ct
        2  from big_tab
        3  where object_type = :objtype ;
      CT
      ----------------
      854176
      1 row selected.
      SQL> select * from table (dbms_xplan.display_cursor('211078a9adzak',0,'ALLSTATS LAST'));
    • PLAN_TABLE_OUTPUT
      ---------------------------------------
      SQL_ID 211078a9adzak, child number 0
      -------------------------------------
      select /*+ gather_plan_statistics */ count(*) ct from big_tab where object_type = :objtype
      Plan hash value: 154074842
      ------------------------------------------------------------------------
      | Id | Operation         | Name            | E-Rows | A-Rows | Buffers |
      ------------------------------------------------------------------------
      |  1 | SORT AGGREGATE    |                 |      1 |      1 |    2263 |
      |* 2 |  INDEX RANGE SCAN | BIG_OBJTYPE_IDX |   4416 |   854K |    2263 |
      ------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - access("OBJECT_TYPE"=:OBJTYPE)
    • SQL> select child_number, executions, buffer_gets,
        2  is_bind_sensitive, is_bind_aware, is_shareable
        3  from v$sql where sql_id = '211078a9adzak' ;
      CHILD_NUMBER = 0
      EXECUTIONS = 2
      BUFFER_GETS = 2279 (2263 + 16)
      IS_BIND_SENSITIVE = Y
      IS_BIND_AWARE = N
      IS_SHAREABLE = Y
    • SQL> variable objtype varchar2(19)
      SQL> exec :objtype := 'SYNONYM';
      PL/SQL procedure successfully completed.
      PLAN_TABLE_OUTPUT
      ---------------------------------------
      SQL_ID 211078a9adzak, child number 1
      -------------------------------------
      select /*+ gather_plan_statistics */ count(*) ct from big_tab where object_type = :objtype
      Plan hash value: 1315022418
      ----------------------------------------------------------------------------
      | Id | Operation             | Name            | E-Rows | A-Rows | Buffers |
      ----------------------------------------------------------------------------
      |  1 | SORT AGGREGATE        |                 |      1 |      1 |    6016 |
      |* 2 |  INDEX FAST FULL SCAN | BIG_OBJTYPE_IDX |   854K |   854K |    6016 |
      ----------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - access("OBJECT_TYPE"=:OBJTYPE)
    • SQL> select child_number, executions, buffer_gets,
        2  is_bind_sensitive, is_bind_aware, is_shareable
        3  from v$sql where sql_id = '211078a9adzak' ;
      CHILD_NUMBER = 0
      EXECUTIONS = 2
      BUFFER_GETS = 2279
      IS_BIND_SENSITIVE = Y
      IS_BIND_AWARE = N
      IS_SHAREABLE = N
      CHILD_NUMBER = 1
      EXECUTIONS = 1
      BUFFER_GETS = 6016
      IS_BIND_SENSITIVE = Y
      IS_BIND_AWARE = Y
      IS_SHAREABLE = Y
    • SQL> variable objtype varchar2(19)
      SQL> exec :objtype := 'PROCEDURE';
      PL/SQL procedure successfully completed.
      PLAN_TABLE_OUTPUT
      ---------------------------------------
      SQL_ID 211078a9adzak, child number 2
      -------------------------------------
      select /*+ gather_plan_statistics */ count(*) ct from big_tab where object_type = :objtype
      Plan hash value: 154074842
      ------------------------------------------------------------------------
      | Id | Operation         | Name            | E-Rows | A-Rows | Buffers |
      ------------------------------------------------------------------------
      |  1 | SORT AGGREGATE    |                 |      1 |      1 |      16 |
      |* 2 |  INDEX RANGE SCAN | BIG_OBJTYPE_IDX |   4416 |   4416 |      16 |
      ------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - access("OBJECT_TYPE"=:OBJTYPE)
    • SQL> select child_number, executions, buffer_gets,
        2  is_bind_sensitive, is_bind_aware, is_shareable
        3  from v$sql where sql_id = '211078a9adzak' ;
      CHILD_NUMBER = 0
      EXECUTIONS = 2
      BUFFER_GETS = 2279
      IS_BIND_SENSITIVE = Y
      IS_BIND_AWARE = N
      IS_SHAREABLE = N
      CHILD_NUMBER = 1
      EXECUTIONS = 1
      BUFFER_GETS = 6016
      IS_BIND_SENSITIVE = Y
      IS_BIND_AWARE = Y
      IS_SHAREABLE = Y
      CHILD_NUMBER = 2
      EXECUTIONS = 1
      BUFFER_GETS = 16
      IS_BIND_SENSITIVE = Y
      IS_BIND_AWARE = Y
      IS_SHAREABLE = Y
    • 10g will create only 1 plan. 11g will create plans as needed to cover data skew.
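The difference between "one plan" and "plans as needed" can be mimicked with a toy plan cache. The Python sketch below is a deliberately simplified model and not Oracle's algorithm: the row counts come from the BIG_TAB slides, but `RANGE_SCAN_LIMIT`, the two cache classes and the plan-selection rule are hypothetical. It also skips the step visible in the slides where 11g first runs the inherited plan once before creating a new child cursor.

```python
# Row counts per bind value, taken from the BIG_TAB examples above.
ROW_COUNTS = {"PROCEDURE": 4416, "SYNONYM": 854176}

# Hypothetical cut-off: below it a range scan looks cheaper, above it a full scan.
RANGE_SCAN_LIMIT = 100_000


def choose_plan(estimated_rows: int) -> str:
    """Pick an access path from the estimated row count (toy rule)."""
    if estimated_rows < RANGE_SCAN_LIMIT:
        return "INDEX RANGE SCAN"
    return "INDEX FAST FULL SCAN"


class PeekOnceCache:
    """Plan is fixed by the first peeked bind value and reused for every execution."""

    def __init__(self) -> None:
        self.plan = None

    def execute(self, bind_value: str) -> str:
        if self.plan is None:
            self.plan = choose_plan(ROW_COUNTS[bind_value])
        return self.plan


class BindAwareCache:
    """Keeps one plan per selectivity class, so skewed values can get their own plan."""

    def __init__(self) -> None:
        self.plans = {}

    def execute(self, bind_value: str) -> str:
        selective = ROW_COUNTS[bind_value] < RANGE_SCAN_LIMIT
        if selective not in self.plans:
            self.plans[selective] = choose_plan(ROW_COUNTS[bind_value])
        return self.plans[selective]


binds = ["PROCEDURE", "SYNONYM", "PROCEDURE"]
peek_once = PeekOnceCache()
bind_aware = BindAwareCache()
peek_once_plans = [peek_once.execute(b) for b in binds]
bind_aware_plans = [bind_aware.execute(b) for b in binds]
print(peek_once_plans)
print(bind_aware_plans)
```

In the peek-once model the SYNONYM execution inherits the range scan chosen for PROCEDURE, which mirrors the 2263-buffer execution with E-Rows 4416 and A-Rows 854K shown earlier.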
    • Handling bind peeking is more of a coding issue than a statistics issue.
    • 3 Incorrect High and Low Values
    • To derive the cardinality estimate for range predicates, the optimizer uses the low and high value statistics.
    • Table: hi_lo_t
      Statistic        Current value
      ---------------  ---------------------
      # rows           100000
      Blocks           180
      Avg Row Len      7
      Sample Size      100000
      Monitoring       YES
      Column  NDV     Nulls  Density  AvgLen  Histogram  LowVal  HighVal
      ------  ------  -----  -------  ------  ---------  ------  -------
      A       100000  N      .000010       5  NONE (1)   10      100009
      B           10  Y      .100000       3  NONE (1)   9       18
      Statistics: 100%, FOR ALL COLUMNS SIZE 1
    • select count(a) from hi_lo_t where b < 11 ; Cardinality = 100000 rows x (Predicate value - Low value) / (High value - Low value) = 100000 x (11 - 9) / (18 - 9) = 100000 x .22222 = 22222
    • select count(a) from hi_lo_t where b < 11
      Plan hash value: 3307858660
      -----------------------------------------------------------------
      | Id | Operation          | Name    | E-Rows | A-Rows | Buffers |
      -----------------------------------------------------------------
      |  1 | SORT AGGREGATE     |         |      1 |      1 |     184 |
      |* 2 |  TABLE ACCESS FULL | HI_LO_T |  22222 |  20000 |     184 |
      -----------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - filter("B"<11)
    • select count(a) from hi_lo_t where b < 4 ; Cardinality = 100000 rows x .10 x (1 + (Predicate value - Low value) / (High value - Low value)) = 100000 x .10 x (1 + (4 - 9) / (18 - 9)) = 100000 x .04444 = 4444
    • select count(a) from hi_lo_t where b < 4
      Plan hash value: 3307858660
      -----------------------------------------------------------------
      | Id | Operation          | Name    | E-Rows | A-Rows | Buffers |
      -----------------------------------------------------------------
      |  1 | SORT AGGREGATE     |         |      1 |      1 |     184 |
      |* 2 |  TABLE ACCESS FULL | HI_LO_T |   4444 |      0 |     184 |
      -----------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - filter("B"<4)
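Both range estimates can be reproduced from the HI_LO_T column statistics. The Python sketch below follows the two formulas on the slides for a `col < value` predicate. It assumes the `.10` factor in the out-of-range case is the column density of B (.100000 in the statistics table), and the function name is invented for this example.

```python
import math


def less_than_cardinality(num_rows: int, value: float, low: float, high: float,
                          density: float) -> int:
    """Estimate rows for `col < value` using low/high value statistics.

    In range:    selectivity = (value - low) / (high - low)
    Below range: selectivity = density x (1 + (value - low) / (high - low)),
                 i.e. the estimate decays the further the value lies below
                 the recorded low value (assumption: .10 is the column density).
    """
    if value > high:
        selectivity = 1.0                       # every known value qualifies
    elif value >= low:
        selectivity = (value - low) / (high - low)
    else:
        selectivity = max(0.0, density * (1 + (value - low) / (high - low)))
    return math.floor(num_rows * selectivity + 0.5)


# Column B of HI_LO_T: low 9, high 18, density .10, 100000 rows.
in_range = less_than_cardinality(100_000, 11, low=9, high=18, density=0.10)
below_range = less_than_cardinality(100_000, 4, low=9, high=18, density=0.10)
print(in_range, below_range)
```

The second figure is the interesting one: the recorded low value is 9, so no row can satisfy `b < 4`, yet the estimate is still in the thousands. When low and high values are stale the same arithmetic is applied to the wrong range, and the error can grow much larger.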
    • METHOD_OPT=> 'FOR ALL INDEXED COLUMNS' Be cautious of using this!
    • If column is not indexed, no statistics are collected.
    • select count(a) from hi_lo_t where b = 12
      Plan hash value: 3307858660
      -----------------------------------------------------------------
      | Id | Operation          | Name    | E-Rows | A-Rows | Buffers |
      -----------------------------------------------------------------
      |  1 | SORT AGGREGATE     |         |      1 |      1 |     184 |
      |* 2 |  TABLE ACCESS FULL | HI_LO_T |   1000 |  10000 |     184 |
      -----------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         2 - filter("B"=12)
      Without column statistics, a default selectivity is used: the plan estimates 1000 of 100000 rows (1%), a tenth of the 10000 rows actually returned.
    • Result: Cardinality estimates that are orders of magnitude "off"
    • Conclusion
    • Why guess when you can know.
    • Thoroughly test and document your statistics collection strategy.
    • Check default options, particularly when upgrading. Things change.
    • Regularly check statistics and compare to previous collections for any anomalies (10.2.0.4 and above: dbms_stats.diff_table_stats_*).
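Comparing collections amounts to diffing two snapshots and flagging anything that moved more than a chosen percentage. The Python sketch below shows the idea only; it does not call or reproduce the `dbms_stats` diff functions, and the snapshot dictionaries, statistic names and 10% threshold are hypothetical values chosen for the example.

```python
def diff_stats(previous: dict, current: dict, pct_threshold: float = 10.0) -> dict:
    """Return {statistic: (old, new, pct_change)} for changes >= pct_threshold."""
    flagged = {}
    for name in sorted(previous.keys() & current.keys()):
        old, new = previous[name], current[name]
        if old == new:
            continue
        pct = abs(new - old) / abs(old) * 100 if old else float("inf")
        if pct >= pct_threshold:
            flagged[name] = (old, new, round(pct, 1))
    return flagged


# Hypothetical snapshots of one table's statistics from two collections.
last_week = {"num_rows": 100_000, "blocks": 180, "b_low_value": 9, "b_high_value": 18}
today = {"num_rows": 104_000, "blocks": 186, "b_low_value": 9, "b_high_value": 27}

anomalies = diff_stats(last_week, today)
print(anomalies)
```

Here the row count grew by only 4%, so the table would not look stale by volume, but the high value of the column moved by 50%. A jump like that is worth investigating because range-predicate estimates depend directly on the low and high values.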
    • Don't ignore your data.
    • There is no single strategy that works best for everyone.
    • Statistics must reasonably represent your actual data.
    • Understanding basic optimizer statistics computations is key.
    • The more you know, the more likely you are to succeed.
    • Thank You!
    • Questions & Answers
    • Magic
    • Math