CBO Basics: Cardinality
 

  • Picture from Jonathan Lewis’s book


  • CBO Basics: Cardinality
    Sidney Chen@zha-dba
  • Agenda
    Cardinality
    Cardinality under various predicates
    Case study: incorrect high value
  • Cardinality
    The estimated number of rows a query is expected to return.
    Cardinality = number of rows in table x predicate selectivity
  • Imagine that 1,200 people attend the weekly DBA meeting.
    Their birthdays are spread over the 12 months and they come from 10 cities, with the data evenly distributed.
    create table audience as
    select
    mod(rownum-1,12) + 1 month_no,
    mod(rownum-1,10) + 1 city_no
    from
    all_objects
    where
    rownum <= 1200;
  • sid@CS10G> select month_no, count(*) from audience group by month_no order by month_no;
    MONTH_NO COUNT(*)
    ---------- ----------
    1 100
    2 100
    3 100
    4 100
    5 100
    6 100
    7 100
    8 100
    9 100
    10 100
    11 100
    12 100
    12 rows selected.
  • sid@CS10G> select city_no, count(*) from audience group by city_no order by city_no;
    CITY_NO COUNT(*)
    ---------- ----------
    1 120
    2 120
    3 120
    4 120
    5 120
    6 120
    7 120
    8 120
    9 120
    10 120
    10 rows selected.
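The two counts above follow directly from the MOD arithmetic in the CREATE TABLE statement. A minimal Python sketch of the same distribution (the helper name `build_audience` is an illustration and not part of the original deck):

```python
from collections import Counter

def build_audience(n_rows=1200):
    """Mimic mod(rownum-1,12)+1 and mod(rownum-1,10)+1 for rownum 1..n_rows."""
    return [((r - 1) % 12 + 1, (r - 1) % 10 + 1) for r in range(1, n_rows + 1)]

rows = build_audience()
month_counts = Counter(month for month, _ in rows)
city_counts = Counter(city for _, city in rows)

print(sorted(month_counts.items()))  # each month_no occurs 100 times
print(sorted(city_counts.items()))   # each city_no occurs 120 times
```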
  • sid@CS10G> @ind sid.audience;
    OWNER TABLE_NAME     BLOCKS   NUM_ROWS AVG_ROW_LEN
    ----- ---------- ---------- ---------- -----------
    SID   AUDIENCE            5       1200           6
    sid@CS10G> @desc sid.audience;
    Column Name  NUM_DISTINCT    DENSITY NUM_BUCKETS Low  High
    ------------ ------------ ---------- ----------- ---- ----
    MONTH_NO               12 .083333333           1 1    12
    CITY_NO                10         .1           1 1    10
  • Critical info
    NDK: number of distinct keys (the NUM_DISTINCT column)
    Density = 1/NDK (0.1 = 1/10, 0.083333333 = 1/12)
    NUM_BUCKETS = 1: no histogram has been gathered
    NUM_BUCKETS > 1: a histogram has been gathered
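As a rough sketch of how these statistics turn into an estimate for an equality predicate, assuming no histogram and no NULLs (the function names are illustrative, not Oracle APIs):

```python
def density(num_distinct):
    """Density without a histogram: 1 / number of distinct keys."""
    return 1.0 / num_distinct

def equality_cardinality(num_rows, num_distinct):
    """Estimated rows for `col = constant` when the value is inside the low/high range."""
    return round(num_rows * density(num_distinct))

month_estimate = equality_cardinality(1200, 12)  # month_no = <value>
city_estimate = equality_cardinality(1200, 10)   # city_no = <value>
print(month_estimate, city_estimate)  # 100 120
```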
  • select month_no from audience where month_no=12;
    Cardinality
    1200 * (1/12) = 100
  • sid@CS10G> select month_no from audience where month_no=12;
    100 rows selected.
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2423062965
    -------------------------------------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
    -------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 100 | 300 | 3 (0)|
    |* 1 | TABLE ACCESS FULL| AUDIENCE | 100 | 300 | 3 (0)|
    -------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    1 - filter("MONTH_NO"=12)
  • select month_no from audience where city_no=1;
    Cardinality
    1200 * (1/10) = 120
  • sid@CS10G> select month_no from audience where city_no=1;
    120 rows selected.
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2423062965
    -------------------------------------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
    -------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 120 | 720 | 3 (0)|
    |* 1 | TABLE ACCESS FULL| AUDIENCE | 120 | 720 | 3 (0)|
    -------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    1 - filter("CITY_NO"=1)
  • select month_no from audience where month_no > 9;
    Cardinality
    1200 * ( (12-9)/(12-1) ) = 327
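The range arithmetic on this slide can be sketched as follows. This assumes the open-ended form `col > limit` with the limit inside the low/high range; the function name is made up for the example. It also counts the real rows for comparison: the estimate treats the column as a continuous range of width 11, while the data holds 12 discrete values, so the two numbers differ.

```python
def open_range_selectivity(limit, low, high):
    """Selectivity of `col > limit` for low <= limit <= high, assuming uniform data."""
    return (high - limit) / (high - low)

estimate = round(1200 * open_range_selectivity(9, 1, 12))

# Actual row count in the audience table for month_no > 9
actual = sum(1 for r in range(1, 1201) if (r - 1) % 12 + 1 > 9)

print(estimate, actual)  # 327 300
```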
  • sid@CS10G> select month_no from audience where month_no > 9;
    300 rows selected.
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2423062965
    -------------------------------------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
    -------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 327 | 981 | 3 (0)|
    |* 1 | TABLE ACCESS FULL| AUDIENCE | 327 | 981 | 3 (0)|
    -------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    1 - filter("MONTH_NO">9)
  • Equality: out-of-range values
    explain plan set statement_id = '12' for select month_no from audience where month_no = 12;
    explain plan set statement_id = '13' for select month_no from audience where month_no = 13;
    explain plan set statement_id = '14' for select month_no from audience where month_no = 14;
    explain plan set statement_id = '15' for select month_no from audience where month_no = 15;
    explain plan set statement_id = '16' for select month_no from audience where month_no = 16;
    explain plan set statement_id = '17' for select month_no from audience where month_no = 17;
    explain plan set statement_id = '18' for select month_no from audience where month_no = 18;
    explain plan set statement_id = '19' for select month_no from audience where month_no = 19;
    explain plan set statement_id = '20' for select month_no from audience where month_no = 20;
  • sid@CS10G> select statement_id, cardinality from plan_table where id=1 order by statement_id;
    STATEMENT_ID CARDINALITY
    ------------ -----------
    12 100
    13 91
    14 82
    15 73
    16 64
    17 55
    18 45
    19 36
    20 27
    9 rows selected.
  • Range: out-of-range values
    explain plan set statement_id = '12' for select month_no from audience where month_no > 12;
    explain plan set statement_id = '13' for select month_no from audience where month_no > 13;
    explain plan set statement_id = '14' for select month_no from audience where month_no > 14;
    explain plan set statement_id = '15' for select month_no from audience where month_no > 15;
    explain plan set statement_id = '16' for select month_no from audience where month_no > 16;
    explain plan set statement_id = '17' for select month_no from audience where month_no > 17;
    explain plan set statement_id = '18' for select month_no from audience where month_no > 18;
    explain plan set statement_id = '19' for select month_no from audience where month_no > 19;
    explain plan set statement_id = '20' for select month_no from audience where month_no > 20;
  • sid@CS10G> select statement_id, cardinality from plan_table where id=1 order by statement_id;
    STATEMENT_ID CARDINALITY
    ------------ -----------
    12 100
    13 91
    14 82
    15 73
    16 64
    17 55
    18 45
    19 36
    20 27
    9 rows selected.
  • The farther a value lies outside the low/high range, the less likely you are to find data
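The cardinalities listed for month_no values 12 to 20 are consistent with a linear decay: the density is scaled down by the distance beyond the high value, measured as a fraction of the low/high range. The sketch below is an inference from those numbers, not a statement of Oracle's internal code; the function name and the clamp at zero are assumptions.

```python
def out_of_range_cardinality(value, num_rows=1200, num_distinct=12, low=1, high=12):
    """Estimate rows for a predicate value outside [low, high] using a prorated density."""
    density = 1.0 / num_distinct
    distance = max(0, value - high, low - value)  # 0 when the value is in range
    prorated = density * max(0.0, 1.0 - distance / (high - low))
    return round(num_rows * prorated)

estimates = {v: out_of_range_cardinality(v) for v in range(12, 21)}
print(estimates)
# {12: 100, 13: 91, 14: 82, 15: 73, 16: 64, 17: 55, 18: 45, 19: 36, 20: 27}
```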
  • Bad news
    If a predicate uses a sequence-based or time-based column (such as a last-modified date, last_mod_dt) and you haven’t been keeping the statistics up to date,
    the estimated cardinality for equality and range predicates on that column will drop as time passes.
  • sid@CS10G> select month_no from audience where month_no > 16;
    no rows selected
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2423062965
    -------------------------------------+-----------------------------------+
    | Id | Operation | Name | Rows | Bytes | Cost | Time |
    -------------------------------------+-----------------------------------+
    | 0 | SELECT STATEMENT | | | | 3 | |
    | 1 | TABLE ACCESS FULL | AUDIENCE| 64 | 192 | 3 | 00:00:01 |
    -------------------------------------+-----------------------------------+
    Predicate Information:
    ----------------------
    1 - filter("MONTH_NO">16)
  • Using the 10053 event to confirm
    sid@CS10G> @53on
    alter session set events '10053 trace name context forever, level 1';
    Session altered.
    sid@CS10G> explain plan for select month_no from audience where month_no > 16;
    Explained.
    sid@CS10G> @53off
    sid@CS10G> @tracefile
    TRACEFILE
    ----------------------------------------------------------------------------------------------------
    /home/u02/app/oracle/product/11.1.0/db_1/admin/cs10g/udump/cs10g_ora_24947.trc
  • SINGLE TABLE ACCESS PATH
    -----------------------------------------
    BEGIN Single Table Cardinality Estimation
    -----------------------------------------
    Column (#1): MONTH_NO(NUMBER)
    AvgLen: 3.00 NDV: 12 Nulls: 0 Density: 0.083333 Min: 1 Max: 12
    Using prorated density: 0.05303 of col #1 as selectivity of out-of-range value pred
    Table: AUDIENCE Alias: AUDIENCE
    Card: Original: 1200 Rounded: 64 Computed: 63.64 Non Adjusted: 63.64
    -----------------------------------------
    END Single Table Cardinality Estimation
    -----------------------------------------
  • Between: out-of-range values
    explain plan set statement_id = '12' for select month_no from audience where month_no between 13 and 15;
    explain plan set statement_id = '14' for select month_no from audience where month_no between 14 and 16;
    explain plan set statement_id = '15' for select month_no from audience where month_no between 15 and 17;
    explain plan set statement_id = '16' for select month_no from audience where month_no between 13 and 20;
    explain plan set statement_id = '17' for select month_no from audience where month_no between 14 and 21;
    explain plan set statement_id = '18' for select month_no from audience where month_no between 15 and 22;
    explain plan set statement_id = '19' for select month_no from audience where month_no between 16 and 23;
  • sid@CS10G> select statement_id, cardinality from plan_table where id=1 order by statement_id;
    STATEMENT_ID CARDINALITY
    ------------ -----------
    12 100
    13 100
    14 100
    15 100
    16 100
    17 100
    18 100
    19 100
    20 100
    9 rows selected.
  • Case: incorrect low/high value
  • SELECT count('1') RECCOUNT
    FROM Test_ILM_INTERACTION t0
    JOIN Test_ILM_INTERACTION_TYPE t1 ON t0.INTERACTION_TYP = t1.INTERACTION_TYP
    JOIN Test_ILM_INTERACTION_REF t3 ON t0.interaction_uuid = t3.interaction_uuid
    WHERE t1.IS_VIEWABLE = 1
    AND ((t0.DOMAIN_NME = 'DOMAIN_A') or
    (T0.DOMAIN_NME = 'DOMAIN_B' AND
    T0.APPLICATION_NME = 'APPLICATION_C'))
    AND (t3.REF_CDE = 'BK_NUMBER' AND t3.REF_KEY_VALUE = '2389301444')
    AND t0.INTERACTION_DT BETWEEN
    TO_DATE('01-06-2011 16:00:00', 'DD-MM-YYYY HH24:MI:SS')
    AND
    TO_DATE('16-06-2011 15:59:59', 'DD-MM-YYYY HH24:MI:SS')
  • ---------------------------------------------------------------------+----------------
    | Id | Operation | Name | Rows | Cost |
    ---------------------------------------------------------------------+----------------
    | 0 | SELECT STATEMENT | | | 13 |
    | 1 | SORT AGGREGATE | | 1 | |
    | 2 | NESTED LOOPS | | 1 | 13 |
    | 3 | NESTED LOOPS | | 1 | 12 |
    | 4 | INDEX RANGE SCAN | Test_ILM_INTERACTION_IDX3 | 4 | 4 |
    | 5 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_REF_PK| 1 | 2 |
    | 6 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 |
    | 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 |
    ---------------------------------------------------------------------+----------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    4 - access("T0"."INTERACTION_DT">=TO_DATE('2011-06-01 05:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
    "T0"."INTERACTION_DT"<=TO_DATE('2011-06-16 04:59:59', 'yyyy-mm-dd hh24:mi:ss'))
    filter("T0"."DOMAIN_NME"='DOMAIN_A' OR "T0"."APPLICATION_NME"='APPLICATION_C' AND
    "T0"."DOMAIN_NME"='DOMAIN_B')
    5 - access("T0"."INTERACTION_UUID"="T3"."INTERACTION_UUID" AND "T3"."REF_CDE"='BK_NUMBER' AND
    "T3"."REF_KEY_VALUE"='2389301444')
    6 - filter("T1"."IS_VIEWABLE"=1)
    7 - access("T0"."INTERACTION_TYP"="T1"."INTERACTION_TYP")
  • sys@CS2PRD> @desc Testowner.Test_ilm_interaction
    Column Name NUM_DISTINCT DENSITY Low High
    ------------------ ------------ ---------- -------------------- --------------------
    INTERACTION_DT 7583898 1.3186E-07 2010-05-07 23:45:47 2011-05-31 15:31:35
    sys@CS2PRD> @ind Testowner.Test_ilm_interaction
    TABLE_NAME INDEX_NAME POS# COLUMN_NAME
    ----------------------- --------------------------- ---- ----------------
    Test_ILM_INTERACTION Test_ILM_INTERACTION_IDX3 1 INTERACTION_DT
    2 COMPANY_ID
    3 APPLICATION_NME
    4 DOMAIN_NME
    5 INTERACTION_TYP
    6 INTERACTION_UUID
  • select to_date(2455714,'J') max_date from dual;
    MAX_DATE
    -------------------
    2011-06-01 00:00:00
    10053 trace
    Column (#4): INTERACTION_DT(DATE)
    AvgLen: 8.00 NDV: 7583898 Nulls: 0 Density: 1.3186e-07 Min: 2455325 Max: 2455714
    SINGLE TABLE ACCESS PATH
    Using prorated density: 1.3151e-07 of col#4 as selectivity of out-of-range value pred
    Table: Test_ILM_INTERACTION Alias: T0
    Card: Original: 53138344 Rounded: 4 Computed: 3.75 Non Adjusted: 3.75
    Access Path: index (IndexOnly)
    Index: Test_ILM_INTERACTION_IDX3
    resc_io: 4.00 resc_cpu: 29886
    ix_sel: 1.3151e-07 ix_sel_with_filters: 7.0628e-08
    Cost: 4.00 Resp: 4.00 Degree: 1
    ****** trying bitmap/domain indexes ******
    Best:: AccessPath: IndexRange Index: Test_ILM_INTERACTION_IDX3
    Cost: 4.00 Degree: 1 Resp: 4.00 Card: 3.75 Bytes: 0
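The prorated density in this trace can be reproduced from the column statistics with the same linear-decay idea. The sketch below assumes the distance is measured from the recorded high value to the lower bound of the BETWEEN range. That assumption is an inference that happens to match the trace, not documented behaviour.

```python
from datetime import datetime

low = datetime(2010, 5, 7, 23, 45, 47)     # Low from the column statistics
high = datetime(2011, 5, 31, 15, 31, 35)   # High from the column statistics (stale)
range_start = datetime(2011, 6, 1, 16, 0, 0)  # lower bound of the BETWEEN predicate
density = 1.3186e-07
num_rows = 53138344

distance = (range_start - high).total_seconds()
width = (high - low).total_seconds()
prorated = density * (1 - distance / width)

# ix_sel_with_filters as printed in the trace, applied to the table row count
card = num_rows * 7.0628e-08

print(f"{prorated:.4e}")  # 1.3151e-07
print(round(card, 2))     # 3.75
```

With roughly 53 million rows in the table, an estimate of 4 rows is what makes the index range scan on INTERACTION_DT look so cheap.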
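  • Solution: dbms_stats.set_column_stats
    -- Push the recorded high value forward by one month.
    m_high := ADD_MONTHS(m_high, 1);
    m_val_array := DBMS_STATS.DATEARRAY(m_low, m_high);
    -- Convert the new low/high pair into the internal statistics record.
    dbms_stats.prepare_column_values(
        srec     => m_statrec,
        datevals => m_val_array
    );
    -- Write the column statistics back with the new high value.
    dbms_stats.set_column_stats(
        ownname       => NULL,            -- current schema
        tabname       => '&&TABLE_NAME',
        colname       => '&&COLUMN_NAME',
        distcnt       => m_distcnt,
        density       => m_density,
        nullcnt       => m_nullcnt,
        srec          => m_statrec,
        avgclen       => m_avgclen,
        no_invalidate => false            -- invalidate dependent cursors immediately
    );
    Please see hack_stats_3.sql for the full script.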
  • sys@CS2PRD> @desc Testowner.Test_ilm_interaction
    Column Name NUM_DISTINCT DENSITY Low High
    ------------------ ------------ ---------- -------------------- --------------------
    INTERACTION_DT 7583898 1.3186E-07 2010-05-07 23:45:47 2011-06-30 15:31:35
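The new high value is the old one pushed forward by ADD_MONTHS(m_high, 1). Because June has 30 days, a start date of 2011-05-31 lands on 2011-06-30. The Python sketch below imitates Oracle's last-day-of-month rule for ADD_MONTHS; the helper is an illustration, not an Oracle API.

```python
import calendar
from datetime import datetime

def add_months(d, months):
    """Imitate ADD_MONTHS: a last-day-of-month input maps to the last day of the
    target month, and a day that does not exist there is clipped to the last day."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    source_last_day = calendar.monthrange(d.year, d.month)[1]
    day = last_day if d.day == source_last_day else min(d.day, last_day)
    return d.replace(year=year, month=month, day=day)

new_high = add_months(datetime(2011, 5, 31, 15, 31, 35), 1)
print(new_high)  # 2011-06-30 15:31:35
```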
  • ---------------------------------------------------------------------------------------------
    | Id | Operation | Name | Rows | Cost (%CPU)|
    ---------------------------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 1 | 14 (0)|
    | 1 | SORT AGGREGATE | | 1 | |
    | 2 | NESTED LOOPS | | 1 | 14 (0)|
    | 3 | NESTED LOOPS | | 1 | 13 (0)|
    | 4 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION_REF| 1 | 10 (0)|
    |* 5 | INDEX RANGE SCAN | Test_ILM_INTERACTION_REF_IDX1 | 9 | 4 (0)|
    |* 6 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION| 1 | 3 (0)|
    |* 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_PK| 1 | 2 (0)|
    |* 8 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 (0)|
    |* 9 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 (0)|
    ---------------------------------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    5 - access("T3"."REF_CDE"='BK_NUMBER' AND "T3"."REF_KEY_VALUE"='2389301444')
    6 - filter("T0"."INTERACTION_DT">=TO_DATE('2011-06-01 16:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
    ("T0"."DOMAIN_NME"='DOMAIN_A' OR "T0"."APPLICATION_NME"='APPLICATION_C' AND
    "T0"."DOMAIN_NME"='DOMAIN_B') AND "T0"."INTERACTION_DT"<=TO_DATE('2011-06-16 15:59:59', 'yyyy-mm-dd
    hh24:mi:ss'))
    7 - access("T0"."INTERACTION_UUID"="T3"."INTERACTION_UUID")
    8 - filter("T1"."IS_VIEWABLE"=1)
    9 - access("T0"."INTERACTION_TYP"="T1"."INTERACTION_TYP")
  • select to_date(2455744,'J') max_date from dual;
    MAX_DATE
    -------------------
    2011-07-01 00:00:00
    10053 trace
    ***************************************
    SINGLE TABLE ACCESS PATH
    Column (#9): DOMAIN_NME(VARCHAR2)
    AvgLen: 12.00 NDV: 10 Nulls: 53692 Density: 9.3545e-09
    Histogram: Freq #Bkts: 10 UncompBkts: 5973 EndPtVals: 10
    Column (#2): APPLICATION_NME(VARCHAR2)
    AvgLen: 17.00 NDV: 30 Nulls: 0 Density: 9.3451e-09
    Histogram: Freq #Bkts: 30 UncompBkts: 5979 EndPtVals: 30
    Column (#4): INTERACTION_DT(DATE)
    AvgLen: 8.00 NDV: 7583898 Nulls: 0 Density: 1.3186e-07 Min: 2455325 Max: 2455744
    Table: Test_ILM_INTERACTION Alias: T0
    Card: Original: 53138344 Rounded: 1024006 Computed: 1024006.31 Non Adjusted: 1024006.31
    Access Path: index (IndexOnly)
    Index: Test_ILM_INTERACTION_IDX3
    resc_io: 28694.00 resc_cpu: 574312799
    ix_sel: 0.035829 ix_sel_with_filters: 0.019271
    Cost: 28722.69 Resp: 28722.69 Degree: 1
    ****** trying bitmap/domain indexes ******
    Best:: AccessPath: IndexRange Index: Test_ILM_INTERACTION_IDX3
    Cost: 28722.69 Degree: 1 Resp: 28722.69 Card: 1024006.31 Bytes: 0
    ***************************************
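Once the date range falls inside the low/high values, the index selectivity in this trace can be reproduced with the usual closed-range formula: range width divided by (high - low), plus 1/NDV for each of the two closed ends. The sketch assumes the adjusted high value is 2011-06-30 15:31:35, one month after the original high.

```python
from datetime import datetime

low = datetime(2010, 5, 7, 23, 45, 47)
high = datetime(2011, 6, 30, 15, 31, 35)   # assumed adjusted high value
start = datetime(2011, 6, 1, 16, 0, 0)     # BETWEEN lower bound
end = datetime(2011, 6, 16, 15, 59, 59)    # BETWEEN upper bound
num_distinct = 7583898
num_rows = 53138344

range_sel = ((end - start).total_seconds() / (high - low).total_seconds()
             + 2 / num_distinct)

# ix_sel_with_filters as printed in the trace, applied to the table row count
card = num_rows * 0.019271

print(f"{range_sel:.6f}")  # 0.035829
print(round(card))         # about 1.02 million rows, close to the trace's 1024006.31
```

The small gap between the computed cardinality and the traced value comes from using the rounded selectivity printed in the trace.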
  • Performance after tuning
  • The functional scripts
  • Thanks. Q&A
  • ---------------------------------------------------------------------+----------------
    | Id | Operation | Name | Rows | Cost |
    ---------------------------------------------------------------------+----------------
    | 0 | SELECT STATEMENT | | | 13 |
    | 1 | SORT AGGREGATE | | 1 | |
    | 2 | NESTED LOOPS | | 1 | 13 |
    | 3 | NESTED LOOPS | | 1 | 12 |
    | 4 | INDEX RANGE SCAN | Test_ILM_INTERACTION_IDX3 | 4 | 4 |
    | 5 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_REF_PK| 1 | 2 |
    | 6 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 |
    | 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 |
    ---------------------------------------------------------------------+----------------
    ---------------------------------------------------------------------------------------------
    | Id | Operation | Name | Rows | Cost (%CPU)|
    ---------------------------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 1 | 14 (0)|
    | 1 | SORT AGGREGATE | | 1 | |
    | 2 | NESTED LOOPS | | 1 | 14 (0)|
    | 3 | NESTED LOOPS | | 1 | 13 (0)|
    | 4 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION_REF| 1 | 10 (0)|
    |* 5 | INDEX RANGE SCAN | Test_ILM_INTERACTION_REF_IDX1 | 9 | 4 (0)|
    |* 6 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION| 1 | 3 (0)|
    |* 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_PK| 1 | 2 (0)|
    |* 8 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 (0)|
    |* 9 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 (0)|
    ---------------------------------------------------------------------------------------------
  • B-tree Access Cost
    cost =
    blevel +
    ceiling(leaf_blocks * effective index selectivity) +
    ceiling(clustering_factor * effective table selectivity)
    =
    blevel +
    ceiling(leaf_blocks * ix_sel) +
    ceiling(clustering_factor * ix_sel_with_filters)
  • Index Stats::
    Index: Test_ILM_INTERACTION_IDX3 Col#: 4 10 2 9 7 1
    LVLS: 3 #LB: 800771 #DK: 51629818 LB/K: 1.00 DB/K: 1.00 CLUF: 27495746.00
    Access Path: index (IndexOnly)
    Index: Test_ILM_INTERACTION_IDX3
    resc_io: 4.00 resc_cpu: 29886
    ix_sel: 1.3151e-07 ix_sel_with_filters: 7.0734e-08
    Cost: 4.00 Resp: 4.00 Degree: 1
    sys@CS10G> select 3 + ceil(800771 * 0.00000013151) cost from dual;
    COST
    ----------
    4
    Index-only access: there is no need to visit the table, so the clustering_factor term drops out
  • Index Stats::
    Index: Test_ILM_INTERACTION_REF_IDX1 Col#: 2 3
    LVLS: 3 #LB: 872428 #DK: 10567436 LB/K: 1.00 DB/K: 5.00 CLUF: 60057635.00
    Access Path: index (AllEqRange)
    Index: Test_ILM_INTERACTION_REF_IDX1
    resc_io: 10.00 resc_cpu: 74724
    ix_sel: 9.4630e-08 ix_sel_with_filters: 9.4630e-08
    Cost: 10.00 Resp: 10.00 Degree: 1
    sys@CS10G> select 3 + ceil(872428 * 0.00000009463) + ceil(60057635 * 0.00000009463) cost from dual;
    COST
    ----------
    10
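The B-tree access cost formula and the two checks above can be folded into one small function. This is a sketch of the I/O cost arithmetic only (the CPU component is ignored), with an `index_only` flag added for illustration to skip the table-visit term.

```python
import math

def btree_cost(blevel, leaf_blocks, ix_sel,
               clustering_factor=0, ix_sel_with_filters=0.0, index_only=False):
    """blevel + ceil(leaf_blocks * ix_sel) + ceil(clustering_factor * ix_sel_with_filters)."""
    cost = blevel + math.ceil(leaf_blocks * ix_sel)
    if not index_only:
        # Table visits are driven by the clustering factor
        cost += math.ceil(clustering_factor * ix_sel_with_filters)
    return cost

# Test_ILM_INTERACTION_IDX3: index-only range scan
idx3_cost = btree_cost(3, 800771, 1.3151e-07, index_only=True)

# Test_ILM_INTERACTION_REF_IDX1: index range scan plus table access
ref_idx1_cost = btree_cost(3, 872428, 9.4630e-08, 60057635, 9.4630e-08)

print(idx3_cost, ref_idx1_cost)  # 4 10
```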