CBO Basics: Cardinality
  • Picture from Jonathan Lewis’s book "Cost-Based Oracle Fundamentals"
  • Transcript

    • 1. CBO Basics: Cardinality
      Sidney Chen@zha-dba
    • 2. Agenda
      Cardinality
      Cardinality under various predicates
      Case study: incorrect high value
    • 3. Cardinality
      The estimated number of rows a query is expected to return:
      cardinality = number of rows in table × predicate selectivity
    • 4. Imagine 1,200 people attend the weekly DBA meeting.
      Their birthdays fall in 12 months and they come from 10 cities, with the data evenly distributed.
      create table audience as
      select
      mod(rownum-1,12) + 1 month_no,
      mod(rownum-1,10) + 1 city_no
      from
      all_objects
      where
      rownum <= 1200;
    • 5. sid@CS10G> select month_no, count(*) from audience group by month_no order by month_no;
      MONTH_NO COUNT(*)
      ---------- ----------
      1 100
      2 100
      3 100
      4 100
      5 100
      6 100
      7 100
      8 100
      9 100
      10 100
      11 100
      12 100
      12 rows selected.
    • 6. sid@CS10G> select city_no, count(*) from audience group by city_no order by city_no;
      CITY_NO COUNT(*)
      ---------- ----------
      1 120
      2 120
      3 120
      4 120
      5 120
      6 120
      7 120
      8 120
      9 120
      10 120
      10 rows selected.
    • 7. sid@CS10G> @ind sid.audience;
      OWNER TABLE_NAME     BLOCKS   NUM_ROWS AVG_ROW_LEN
      ----- ---------- ---------- ---------- -----------
      SID   AUDIENCE            5       1200           6
      sid@CS10G> @desc sid.audience;
      Column Name  NUM_DISTINCT    DENSITY NUM_BUCKETS Low  High
      ------------ ------------ ---------- ----------- ---- ----
      MONTH_NO               12 .083333333           1 1    12
      CITY_NO                10         .1           1 1    10
    • 8. Critical info
      NDK: number of distinct keys (the NUM_DISTINCT column)
      Density = 1/NDK (0.1 = 1/10, 0.083333333 = 1/12)
      NUM_BUCKETS = 1: no histogram has been gathered
      NUM_BUCKETS > 1: a histogram has been gathered
    • 9. select month_no from audience where month_no=12;
      Cardinality
      1200 * (1/12) = 100
    • 10. sid@CS10G> select month_no from audience where month_no=12;
      100 rows selected.
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 2423062965
      -------------------------------------------------------------------
      | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
      -------------------------------------------------------------------
      | 0 | SELECT STATEMENT | | 100 | 300 | 3 (0)|
      |* 1 | TABLE ACCESS FULL| AUDIENCE | 100 | 300 | 3 (0)|
      -------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
      1 - filter("MONTH_NO"=12)
    • 11. select month_no from audience where city_no=1;
      Cardinality
      1200 * (1/10) = 120
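The two equality estimates above follow the same arithmetic, which can be sketched outside the database. This is a minimal illustration, assuming no histogram on the column so that selectivity is simply 1/NDK; the function name `equality_cardinality` is made up for this example and is not part of Oracle.

```python
from fractions import Fraction

def equality_cardinality(num_rows: int, num_distinct: int) -> int:
    """Estimated rows for `col = constant` when the column has no histogram."""
    selectivity = Fraction(1, num_distinct)  # density = 1/NDK
    return round(num_rows * selectivity)

print(equality_cardinality(1200, 12))  # month_no = 12 -> 100
print(equality_cardinality(1200, 10))  # city_no = 1   -> 120
```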
    • 12. sid@CS10G> select month_no from audience where city_no=1;
      120 rows selected.
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 2423062965
      -------------------------------------------------------------------
      | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
      -------------------------------------------------------------------
      | 0 | SELECT STATEMENT | | 120 | 720 | 3 (0)|
      |* 1 | TABLE ACCESS FULL| AUDIENCE | 120 | 720 | 3 (0)|
      -------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
      1 - filter("CITY_NO"=1)
    • 13. select month_no from audience where month_no > 9;
      Cardinality
      1200 * ( (12-9)/(12-1) ) = 327
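The estimate of 327 is higher than the 300 rows that actually match, because the formula treats MONTH_NO as a continuous range from low to high rather than as 12 discrete values. The sketch below rebuilds the column in memory and compares the two numbers; the helper name `greater_than_cardinality` is hypothetical, and rounding to the nearest integer is an assumption that happens to match the plan output here.

```python
from fractions import Fraction

# Rebuild the MONTH_NO column of the demo table: values 1..12, 100 rows each.
month_no = [(n % 12) + 1 for n in range(1200)]

def greater_than_cardinality(num_rows, low, high, value):
    """Estimate for `col > value` with value inside [low, high], no histogram."""
    selectivity = Fraction(high - value, high - low)
    return round(num_rows * selectivity)

estimate = greater_than_cardinality(len(month_no), low=1, high=12, value=9)
actual = sum(1 for m in month_no if m > 9)
print(estimate, actual)  # 327 300
```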
    • 14. sid@CS10G> select month_no from audience where month_no > 9;
      300 rows selected.
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 2423062965
      -------------------------------------------------------------------
      | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
      -------------------------------------------------------------------
      | 0 | SELECT STATEMENT | | 327 | 981 | 3 (0)|
      |* 1 | TABLE ACCESS FULL| AUDIENCE | 327 | 981 | 3 (0)|
      -------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
      1 - filter("MONTH_NO">9)
    • 15. Equality: If Out of range
      explain plan set statement_id = '12' for select month_no from audience where month_no = 12;
      explain plan set statement_id = '13' for select month_no from audience where month_no = 13;
      explain plan set statement_id = '14' for select month_no from audience where month_no = 14;
      explain plan set statement_id = '15' for select month_no from audience where month_no = 15;
      explain plan set statement_id = '16' for select month_no from audience where month_no = 16;
      explain plan set statement_id = '17' for select month_no from audience where month_no = 17;
      explain plan set statement_id = '18' for select month_no from audience where month_no = 18;
      explain plan set statement_id = '19' for select month_no from audience where month_no = 19;
      explain plan set statement_id = '20' for select month_no from audience where month_no = 20;
    • 16. sid@CS10G> select statement_id, cardinality from plan_table where id=1 order by statement_id;
      STATEMENT_ID CARDINALITY
      ------------ -----------
      12 100
      13 91
      14 82
      15 73
      16 64
      17 55
      18 45
      19 36
      20 27
      9 rows selected.
    • 17. Range: If Out of range
      explain plan set statement_id = '12' for select month_no from audience where month_no > 12;
      explain plan set statement_id = '13' for select month_no from audience where month_no > 13;
      explain plan set statement_id = '14' for select month_no from audience where month_no > 14;
      explain plan set statement_id = '15' for select month_no from audience where month_no > 15;
      explain plan set statement_id = '16' for select month_no from audience where month_no > 16;
      explain plan set statement_id = '17' for select month_no from audience where month_no > 17;
      explain plan set statement_id = '18' for select month_no from audience where month_no > 18;
      explain plan set statement_id = '19' for select month_no from audience where month_no > 19;
      explain plan set statement_id = '20' for select month_no from audience where month_no > 20;
    • 18. sid@CS10G> select statement_id, cardinality from plan_table where id=1 order by statement_id;
      STATEMENT_ID CARDINALITY
      ------------ -----------
      12 100
      13 91
      14 82
      15 73
      16 64
      17 55
      18 45
      19 36
      20 27
      9 rows selected.
    • 19. The farther a value lies outside the low/high range, the less likely the optimizer thinks you are to find data
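The numbers in the out-of-range tables are consistent with a linear decay: the 1/NDK density is scaled down by how far the value overshoots the high value, measured as a fraction of the (high - low) range. The sketch below is a reconstruction from the slide output, not Oracle's code; the function name is hypothetical and only the "above high" side is handled.

```python
from fractions import Fraction

def out_of_range_cardinality(num_rows, num_distinct, low, high, value):
    """Equality estimate with linear decay once `value` is above `high`."""
    density = Fraction(1, num_distinct)
    if value > high:
        overshoot = Fraction(value - high, high - low)
        density *= max(Fraction(0), 1 - overshoot)  # never below zero
    return round(num_rows * density)

# Reproduce the estimates for month_no = 12 .. 20.
for value in range(12, 21):
    print(value, out_of_range_cardinality(1200, 12, 1, 12, value))
```

For value 16 the scaled density is (1/12) × (7/11) ≈ 0.05303, giving 63.64 rows before rounding.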
    • 20. Bad news
      If a predicate uses a sequence-based or time-based column (such as a last-modified date, last_mod_dt) and the statistics have not been kept up to date,
      the estimated cardinality drops as time passes, for both equality and range predicates on that column.
    • 21. sid@CS10G> select month_no from audience where month_no > 16;
      no rows selected
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 2423062965
      -------------------------------------+-----------------------------------+
      | Id | Operation | Name | Rows | Bytes | Cost | Time |
      -------------------------------------+-----------------------------------+
      | 0 | SELECT STATEMENT | | | | 3 | |
      | 1 | TABLE ACCESS FULL | AUDIENCE| 64 | 192 | 3 | 00:00:01 |
      -------------------------------------+-----------------------------------+
      Predicate Information:
      ----------------------
      1 - filter("MONTH_NO">16)
    • 22. Using 10053 event to confirm
      sid@CS10G> @53on
      alter session set events '10053 trace name context forever, level 1';
      Session altered.
      sid@CS10G> explain plan for select month_no from audience where month_no > 16;
      Explained.
      sid@CS10G> @53off
      sid@CS10G> @tracefile
      TRACEFILE
      ----------------------------------------------------------------------------------------------------
      /home/u02/app/oracle/product/11.1.0/db_1/admin/cs10g/udump/cs10g_ora_24947.trc
    • 23. SINGLE TABLE ACCESS PATH
      -----------------------------------------
      BEGIN Single Table Cardinality Estimation
      -----------------------------------------
      Column (#1): MONTH_NO(NUMBER)
      AvgLen: 3.00 NDV: 12 Nulls: 0 Density: 0.083333 Min: 1 Max: 12
      Using prorated density: 0.05303 of col #1 as selectivity of out-of-range value pred
      Table: AUDIENCE Alias: AUDIENCE
      Card: Original: 1200 Rounded: 64 Computed: 63.64 Non Adjusted: 63.64
      -----------------------------------------
      END Single Table Cardinality Estimation
      -----------------------------------------
    • 24. Between: If Out of range
      explain plan set statement_id = '12' for select month_no from audience where month_no between 13 and 15;
      explain plan set statement_id = '14' for select month_no from audience where month_no between 14 and 16;
      explain plan set statement_id = '15' for select month_no from audience where month_no between 15 and 17;
      explain plan set statement_id = '16' for select month_no from audience where month_no between 13 and 20;
      explain plan set statement_id = '17' for select month_no from audience where month_no between 14 and 21;
      explain plan set statement_id = '18' for select month_no from audience where month_no between 15 and 22;
      explain plan set statement_id = '19' for select month_no from audience where month_no between 16 and 23;
    • 25. sid@CS10G> select statement_id, cardinality from plan_table where id=1 order by statement_id;
      STATEMENT_ID CARDINALITY
      ------------ -----------
      12 100
      13 100
      14 100
      15 100
      16 100
      17 100
      18 100
      19 100
      20 100
      9 rows selected.
    • 26. Case: incorrect low/high value
    • 27. SELECT count('1') RECCOUNT
      FROM Test_ILM_INTERACTION t0
      JOIN Test_ILM_INTERACTION_TYPE t1
        ON t0.INTERACTION_TYP = t1.INTERACTION_TYP
      JOIN Test_ILM_INTERACTION_REF t3
        ON t0.interaction_uuid = t3.interaction_uuid
      WHERE t1.IS_VIEWABLE = 1
      AND ((t0.DOMAIN_NME = 'DOMAIN_A') OR
           (t0.DOMAIN_NME = 'DOMAIN_B' AND
            t0.APPLICATION_NME = 'APPLICATION_C'))
      AND (t3.REF_CDE = 'BK_NUMBER' AND t3.REF_KEY_VALUE = '2389301444')
      AND t0.INTERACTION_DT BETWEEN
          TO_DATE('01-06-2011 16:00:00', 'DD-MM-YYYY HH24:MI:SS')
          AND
          TO_DATE('16-06-2011 15:59:59', 'DD-MM-YYYY HH24:MI:SS')
    • 28. ---------------------------------------------------------------------+----------------
      | Id | Operation | Name | Rows | Cost |
      ---------------------------------------------------------------------+----------------
      | 0 | SELECT STATEMENT | | | 13 |
      | 1 | SORT AGGREGATE | | 1 | |
      | 2 | NESTED LOOPS | | 1 | 13 |
      | 3 | NESTED LOOPS | | 1 | 12 |
      | 4 | INDEX RANGE SCAN | Test_ILM_INTERACTION_IDX3 | 4 | 4 |
      | 5 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_REF_PK| 1 | 2 |
      | 6 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 |
      | 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 |
      ---------------------------------------------------------------------+----------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
      4 - access("T0"."INTERACTION_DT">=TO_DATE('2011-06-01 05:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
      "T0"."INTERACTION_DT"<=TO_DATE('2011-06-16 04:59:59', 'yyyy-mm-dd hh24:mi:ss'))
      filter("T0"."DOMAIN_NME"='DOMAIN_A' OR "T0"."APPLICATION_NME"='APPLICATION_C' AND
      "T0"."DOMAIN_NME"='DOMAIN_B')
      5 - access("T0"."INTERACTION_UUID"="T3"."INTERACTION_UUID" AND "T3"."REF_CDE"='BK_NUMBER' AND
      "T3"."REF_KEY_VALUE"='2389301444')
      6 - filter("T1"."IS_VIEWABLE"=1)
      7 - access("T0"."INTERACTION_TYP"="T1"."INTERACTION_TYP")
    • 29. sys@CS2PRD> @desc Testowner.Test_ilm_interaction
      Column Name        NUM_DISTINCT    DENSITY Low                  High
      ------------------ ------------ ---------- -------------------- --------------------
      INTERACTION_DT          7583898 1.3186E-07 2010-05-07 23:45:47  2011-05-31 15:31:35
      sys@CS2PRD> @ind Testowner.Test_ilm_interaction
      TABLE_NAME              INDEX_NAME                  POS# COLUMN_NAME
      ----------------------- --------------------------- ---- ----------------
      Test_ILM_INTERACTION    Test_ILM_INTERACTION_IDX3      1 INTERACTION_DT
                                                             2 COMPANY_ID
                                                             3 APPLICATION_NME
                                                             4 DOMAIN_NME
                                                             5 INTERACTION_TYP
                                                             6 INTERACTION_UUID
    • 30. select to_date(2455714,'J') max_date from dual;
      MAX_DATE
      -------------------
      2011-06-01 00:00:00
      10053 trace
      Column (#4): INTERACTION_DT(DATE)
      AvgLen: 8.00 NDV: 7583898 Nulls: 0 Density: 1.3186e-07 Min: 2455325 Max: 2455714
      SINGLE TABLE ACCESS PATH
      Using prorated density: 1.3151e-07 of col#4 as selectivity of out-of-range value pred
      Table: Test_ILM_INTERACTION Alias: T0
      Card: Original: 53138344 Rounded: 4 Computed: 3.75 Non Adjusted: 3.75
      Access Path: index (IndexOnly)
      Index: Test_ILM_INTERACTION_IDX3
      resc_io: 4.00 resc_cpu: 29886
      ix_sel: 1.3151e-07 ix_sel_with_filters: 7.0628e-08
      Cost: 4.00 Resp: 4.00 Degree: 1
      ****** trying bitmap/domain indexes ******
      Best:: AccessPath: IndexRange Index: Test_ILM_INTERACTION_IDX3
      Cost: 4.00 Degree: 1 Resp: 4.00 Card: 3.75 Bytes: 0
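The Min and Max values the trace prints for a DATE column are Julian day numbers, which is why the slide converts 2455714 with the 'J' format. The same conversion can be sketched in Python for readers without a database at hand. The offset constant relies on the Julian day number of 0001-01-01 being 1721426; `julian_to_date` is a hypothetical helper name.

```python
from datetime import date

# Julian day number 1721426 is 0001-01-01 (proleptic Gregorian),
# which Python numbers as ordinal 1, so the two scales differ by 1721425.
JDN_OFFSET = 1721425

def julian_to_date(jdn: int) -> date:
    """Convert an integer Julian day number to a calendar date."""
    return date.fromordinal(jdn - JDN_OFFSET)

print(julian_to_date(2455325))  # Min in the trace -> 2010-05-08
print(julian_to_date(2455714))  # Max in the trace -> 2011-06-01
```

The trace appears to show the stored low and high values (2010-05-07 23:45:47 and 2011-05-31 15:31:35) rounded to the nearest whole day, which would explain why each prints one day later than the date part in the column statistics.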
    • 31. Solution: dbms_stats.set_column_stats
      m_high := ADD_MONTHS(m_high, 1);
      m_val_array := DBMS_STATS.DATEARRAY(m_low, m_high);
      dbms_stats.prepare_column_values(
        srec     => m_statrec,
        datevals => m_val_array
      );
      dbms_stats.set_column_stats(
        ownname       => NULL,
        tabname       => '&&TABLE_NAME',
        colname       => '&&COLUMN_NAME',
        distcnt       => m_distcnt,
        density       => m_density,
        nullcnt       => m_nullcnt,
        srec          => m_statrec,
        avgclen       => m_avgclen,
        no_invalidate => false
      );
      Please see hack_stats_3.sql for the full script.
    • 32. sys@CS2PRD> @desc Testowner.Test_ilm_interaction
      Column Name NUM_DISTINCT DENSITY Low High
      ------------------ ------------ ---------- -------------------- --------------------
      INTERACTION_DT 7583898 1.3186E-07 2010-05-07 23:45:47 2011-06-30 15:31:35
    • 33. ---------------------------------------------------------------------------------------------
      | Id | Operation | Name | Rows | Cost (%CPU)|
      ---------------------------------------------------------------------------------------------
      | 0 | SELECT STATEMENT | | 1 | 14 (0)|
      | 1 | SORT AGGREGATE | | 1 | |
      | 2 | NESTED LOOPS | | 1 | 14 (0)|
      | 3 | NESTED LOOPS | | 1 | 13 (0)|
      | 4 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION_REF| 1 | 10 (0)|
      |* 5 | INDEX RANGE SCAN | Test_ILM_INTERACTION_REF_IDX1 | 9 | 4 (0)|
      |* 6 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION| 1 | 3 (0)|
      |* 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_PK| 1 | 2 (0)|
      |* 8 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 (0)|
      |* 9 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 (0)|
      ---------------------------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
      5 - access("T3"."REF_CDE"='BK_NUMBER' AND "T3"."REF_KEY_VALUE"='2389301444')
      6 - filter("T0"."INTERACTION_DT">=TO_DATE('2011-06-01 16:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
      ("T0"."DOMAIN_NME"='DOMAIN_A' OR "T0"."APPLICATION_NME"='APPLICATION_C' AND
      "T0"."DOMAIN_NME"='DOMAIN_B') AND "T0"."INTERACTION_DT"<=TO_DATE('2011-06-16 15:59:59', 'yyyy-mm-dd
      hh24:mi:ss'))
      7 - access("T0"."INTERACTION_UUID"="T3"."INTERACTION_UUID")
      8 - filter("T1"."IS_VIEWABLE"=1)
      9 - access("T0"."INTERACTION_TYP"="T1"."INTERACTION_TYP")
    • 34. select to_date(2455744,'J') max_date from dual;
      MAX_DATE
      -------------------
      2011-07-01 00:00:00
      10053 trace
      ***************************************
      SINGLE TABLE ACCESS PATH
      Column (#9): DOMAIN_NME(VARCHAR2)
      AvgLen: 12.00 NDV: 10 Nulls: 53692 Density: 9.3545e-09
      Histogram: Freq #Bkts: 10 UncompBkts: 5973 EndPtVals: 10
      Column (#2): APPLICATION_NME(VARCHAR2)
      AvgLen: 17.00 NDV: 30 Nulls: 0 Density: 9.3451e-09
      Histogram: Freq #Bkts: 30 UncompBkts: 5979 EndPtVals: 30
      Column (#4): INTERACTION_DT(DATE)
      AvgLen: 8.00 NDV: 7583898 Nulls: 0 Density: 1.3186e-07 Min: 2455325 Max: 2455744
      Table: Test_ILM_INTERACTION Alias: T0
      Card: Original: 53138344 Rounded: 1024006 Computed: 1024006.31 Non Adjusted: 1024006.31
      Access Path: index (IndexOnly)
      Index: Test_ILM_INTERACTION_IDX3
      resc_io: 28694.00 resc_cpu: 574312799
      ix_sel: 0.035829 ix_sel_with_filters: 0.019271
      Cost: 28722.69 Resp: 28722.69 Degree: 1
      ****** trying bitmap/domain indexes ******
      Best:: AccessPath: IndexRange Index: Test_ILM_INTERACTION_IDX3
      Cost: 28722.69 Degree: 1 Resp: 28722.69 Card: 1024006.31 Bytes: 0
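With the high value moved forward, the date range in the query falls inside the known low/high range, and the ix_sel in this trace can be approximated with the usual closed-range formula: requested width divided by column width, plus 2/NDV for the two closed endpoints. The sketch below makes two assumptions that are not stated on the slides: that this is the formula in play, and that the adjusted high value is 2011-06-30 15:31:35, which is what ADD_MONTHS returns for 2011-05-31 15:31:35. The function name is hypothetical.

```python
from datetime import datetime

def between_selectivity(low, high, range_start, range_end, num_distinct):
    """Closed-range estimate: requested width / column width + 2/NDV."""
    width = (range_end - range_start).total_seconds()
    span = (high - low).total_seconds()
    return width / span + 2 / num_distinct

sel = between_selectivity(
    low=datetime(2010, 5, 7, 23, 45, 47),
    high=datetime(2011, 6, 30, 15, 31, 35),  # assumed adjusted high value
    range_start=datetime(2011, 6, 1, 16, 0, 0),
    range_end=datetime(2011, 6, 16, 15, 59, 59),
    num_distinct=7583898,
)
print(f"{sel:.6f}")  # compare with ix_sel: 0.035829 in the trace
```

Under those assumptions the result agrees with the traced ix_sel to six decimal places, compared with 1.3151e-07 when the same range was treated as out of range.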
      ***************************************
    • 35. Performance after tuning
    • 36. The functional scripts
    • 37. Thanks. Q&A
    • 38. ---------------------------------------------------------------------+----------------
      | Id | Operation | Name | Rows | Cost |
      ---------------------------------------------------------------------+----------------
      | 0 | SELECT STATEMENT | | | 13 |
      | 1 | SORT AGGREGATE | | 1 | |
      | 2 | NESTED LOOPS | | 1 | 13 |
      | 3 | NESTED LOOPS | | 1 | 12 |
      | 4 | INDEX RANGE SCAN | Test_ILM_INTERACTION_IDX3 | 4 | 4 |
      | 5 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_REF_PK| 1 | 2 |
      | 6 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 |
      | 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 |
      ---------------------------------------------------------------------+----------------
      ---------------------------------------------------------------------------------------------
      | Id | Operation | Name | Rows | Cost (%CPU)|
      ---------------------------------------------------------------------------------------------
      | 0 | SELECT STATEMENT | | 1 | 14 (0)|
      | 1 | SORT AGGREGATE | | 1 | |
      | 2 | NESTED LOOPS | | 1 | 14 (0)|
      | 3 | NESTED LOOPS | | 1 | 13 (0)|
      | 4 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION_REF| 1 | 10 (0)|
      |* 5 | INDEX RANGE SCAN | Test_ILM_INTERACTION_REF_IDX1 | 9 | 4 (0)|
      |* 6 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION| 1 | 3 (0)|
      |* 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_PK| 1 | 2 (0)|
      |* 8 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 (0)|
      |* 9 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 (0)|
      ---------------------------------------------------------------------------------------------
    • 39. B-tree Access Cost
      cost =
      blevel +
      ceiling(leaf_blocks * effective index selectivity) +
      ceiling(clustering_factor * effective table selectivity)
      =
      blevel +
      ceiling(leaf_blocks * ix_sel)+
      ceiling(clustering_factor *ix_sel_with_filter)
    • 40. Index Stats::
      Index: Test_ILM_INTERACTION_IDX3 Col#: 4 10 2 9 7 1
      LVLS: 3 #LB: 800771 #DK: 51629818 LB/K: 1.00 DB/K: 1.00 CLUF: 27495746.00
      Access Path: index (IndexOnly)
      Index: Test_ILM_INTERACTION_IDX3
      resc_io: 4.00 resc_cpu: 29886
      ix_sel: 1.3151e-07 ix_sel_with_filters: 7.0734e-08
      Cost: 4.00 Resp: 4.00 Degree: 1
      sys@CS10G> select 3 + ceil(800771 * 0.00000013151) cost from dual;
      COST
      ----------
      4
      Index only, no need to access table
    • 41. Index Stats::
      Index: Test_ILM_INTERACTION_REF_IDX1 Col#: 2 3
      LVLS: 3 #LB: 872428 #DK: 10567436 LB/K: 1.00 DB/K: 5.00 CLUF: 60057635.00
      Access Path: index (AllEqRange)
      Index: Test_ILM_INTERACTION_REF_IDX1
      resc_io: 10.00 resc_cpu: 74724
      ix_sel: 9.4630e-08 ix_sel_with_filters: 9.4630e-08
      Cost: 10.00 Resp: 10.00 Degree: 1
      sys@CS10G> select 3 + ceil(872428 * 0.00000009463) + ceil(60057635 * 0.00000009463) cost from dual;
      COST
      ----------
      10
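The B-tree cost formula and the two checks above can be collected into one small function. This is a sketch of the arithmetic only, using LVLS from the trace as blevel just as the slides do; `btree_cost` is a hypothetical name, and CPU costing is ignored.

```python
import math

def btree_cost(blevel, leaf_blocks, ix_sel,
               clustering_factor=0, ix_sel_with_filters=0.0):
    """blevel + ceil(leaf_blocks * ix_sel) + ceil(CLUF * ix_sel_with_filters)."""
    index_part = blevel + math.ceil(leaf_blocks * ix_sel)
    table_part = math.ceil(clustering_factor * ix_sel_with_filters)
    return index_part + table_part

# Index-only scan of Test_ILM_INTERACTION_IDX3: no table visit, so CLUF stays 0.
print(btree_cost(3, 800771, 1.3151e-07))                        # 4
# Test_ILM_INTERACTION_REF_IDX1 followed by the table access by rowid.
print(btree_cost(3, 872428, 9.4630e-08, 60057635, 9.4630e-08))  # 10
# The same index-only scan once the high value is corrected (ix_sel 0.035829).
print(btree_cost(3, 800771, 0.035829))                          # 28694
```

The last figure matches resc_io: 28694.00 in the trace taken after the statistics were adjusted, which is why the optimizer stops driving the join from that index.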