# CBO Basics: Cardinality

• Picture from Jonathan Lewis's book "Cost-Based Oracle Fundamentals"
### Transcript

• 1. CBO Basics: Cardinality
Sidney Chen@zha-dba
• 2. Agenda
Cardinality
Cardinality under various predicates
Case study: incorrect high value
• 3. Cardinality
The estimated number of rows a query is expected to return.
Cardinality = number of rows in table x predicate selectivity
• 4. Imagine that 1200 people attend the weekly DBA meeting.
Their birthdays are spread across 12 months and they come from 10 cities; assume the data is evenly distributed.
create table audience as
select
mod(rownum-1,12) + 1 month_no,
mod(rownum-1,10) + 1 city_no
from
all_objects
where
rownum <= 1200;
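The row-generation arithmetic can be checked outside the database. This is a minimal Python sketch, assuming `rownum` simply counts from 1 to 1200 (the `all_objects` view only serves as a row source); the variable names are illustrative.

```python
from collections import Counter

# rownum runs 1..1200, mirroring "where rownum <= 1200"
rows = [((n - 1) % 12 + 1, (n - 1) % 10 + 1) for n in range(1, 1201)]

month_counts = Counter(month_no for month_no, _ in rows)
city_counts = Counter(city_no for _, city_no in rows)

print(sorted(month_counts.items()))  # 12 months, 100 rows each
print(sorted(city_counts.items()))   # 10 cities, 120 rows each
```

Because 1200 is a multiple of both 12 and 10, every month and every city gets exactly the same number of rows, which is what makes the later estimates easy to verify by hand.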
• 5. sid@CS10G> select month_no, count(*) from audience group by month_no order by month_no;
MONTH_NO COUNT(*)
---------- ----------
1 100
2 100
3 100
4 100
5 100
6 100
7 100
8 100
9 100
10 100
11 100
12 100
12 rows selected.
• 6. sid@CS10G> select city_no, count(*) from audience group by city_no order by city_no;
CITY_NO COUNT(*)
---------- ----------
1 120
2 120
3 120
4 120
5 120
6 120
7 120
8 120
9 120
10 120
10 rows selected.
• 7. sid@CS10G> @ind sid.audience;
OWNER TABLE_NAME BLOCKS NUM_ROWS AVG_ROW_LEN
----- ---------- ---------- ---------- -----------
SID AUDIENCE 5 1200 6
sid@CS10G> @desc sid.audience;
Column Name NUM_DISTINCT DENSITY NUM_BUCKETS Low High
------------ ------------ ---------- ----------- ---- ----
MONTH_NO 12 .083333333 1 1 12
CITY_NO 10 .1 1 1 10
• 8. Critical info
NDK: number of distinct keys
Density = 1/NDK (0.1 = 1/10, 0.083333333 = 1/12)
NUM_BUCKETS = 1: no histogram has been gathered
NUM_BUCKETS > 1: a histogram has been gathered
• 9. select month_no from audience where month_no=12;
Cardinality
1200 * (1/12) = 100
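The equality estimate can be reproduced from the column statistics alone. This is a small Python sketch, assuming no histogram (NUM_BUCKETS = 1) and a value inside the low/high range; the function name is illustrative and not part of any Oracle API.

```python
def equality_cardinality(num_rows: int, num_distinct: int) -> int:
    """Estimated rows for `col = value` with no histogram and an in-range value."""
    density = 1 / num_distinct  # matches DENSITY when NUM_BUCKETS = 1
    return round(num_rows * density)

month_card = equality_cardinality(1200, 12)  # MONTH_NO: NUM_DISTINCT = 12
city_card = equality_cardinality(1200, 10)   # CITY_NO:  NUM_DISTINCT = 10
print(month_card, city_card)  # 100 120
```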
• 10. sid@CS10G> select month_no from audience where month_no=12;
100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2423062965
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 300 | 3 (0)|
|* 1 | TABLE ACCESS FULL| AUDIENCE | 100 | 300 | 3 (0)|
-------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("MONTH_NO"=12)
• 11. select month_no from audience where city_no=1;
Cardinality
1200 * (1/10) = 120
• 12. sid@CS10G> select month_no from audience where city_no=1;
120 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2423062965
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 120 | 720 | 3 (0)|
|* 1 | TABLE ACCESS FULL| AUDIENCE | 120 | 720 | 3 (0)|
-------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CITY_NO"=1)
• 13. select month_no from audience where month_no > 9;
Cardinality
1200 * ( (12-9)/(12-1) ) = 327
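The range estimate follows from treating the column as a continuous span between its low and high values. This is a minimal Python sketch of the arithmetic on this slide, assuming an open-ended `>` predicate with an in-range value; the function name is illustrative.

```python
def greater_than_selectivity(value, low, high):
    """Selectivity of `col > value`, treating the column as continuous between low and high."""
    return (high - value) / (high - low)

num_rows = 1200
sel = greater_than_selectivity(9, low=1, high=12)  # (12 - 9) / (12 - 1) = 3/11
estimate = round(num_rows * sel)
actual = 3 * 100  # months 10, 11 and 12 hold 100 rows each

print(estimate, actual)  # 327 300
```

The estimate (327) differs from the real count (300) because the column holds only 12 discrete values, while the formula divides by the width of the range (11) rather than by the number of distinct values.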
• 14. sid@CS10G> select month_no from audience where month_no > 9;
300 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2423062965
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 327 | 981 | 3 (0)|
|* 1 | TABLE ACCESS FULL| AUDIENCE | 327 | 981 | 3 (0)|
-------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("MONTH_NO">9)
• 15. Equality: out-of-range values
explain plan set statement_id = '12' for select month_no from audience where month_no = 12;
explain plan set statement_id = '13' for select month_no from audience where month_no = 13;
explain plan set statement_id = '14' for select month_no from audience where month_no = 14;
explain plan set statement_id = '15' for select month_no from audience where month_no = 15;
explain plan set statement_id = '16' for select month_no from audience where month_no = 16;
explain plan set statement_id = '17' for select month_no from audience where month_no = 17;
explain plan set statement_id = '18' for select month_no from audience where month_no = 18;
explain plan set statement_id = '19' for select month_no from audience where month_no = 19;
explain plan set statement_id = '20' for select month_no from audience where month_no = 20;
• 16. sid@CS10G> select statement_id, cardinality from plan_table where id=1 order by statement_id;
STATEMENT_ID CARDINALITY
------------ -----------
12 100
13 91
14 82
15 73
16 64
17 55
18 45
19 36
20 27
9 rows selected.
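These cardinalities are consistent with a linear decay of the in-range estimate: the density is scaled down by the distance of the value beyond the high value, measured as a fraction of the (high - low) range. The Python sketch below is inferred from the numbers on the slide and is not a statement of Oracle's internal algorithm; it only handles values above the high value, and the function name is an assumption.

```python
def out_of_range_equality_cardinality(num_rows, num_distinct, low, high, value):
    """Estimate for `col = value` when value lies at or above `high` (linear decay)."""
    density = 1 / num_distinct
    overshoot = max(0, value - high) / (high - low)  # fraction of the range beyond high
    prorated_density = density * max(0.0, 1 - overshoot)
    return num_rows * prorated_density

estimates = {
    v: round(out_of_range_equality_cardinality(1200, 12, 1, 12, v))
    for v in range(12, 21)
}
print(estimates)
```

Under this model the estimate reaches zero once the value is a full range width (11 here) beyond the high value, that is, at month_no = 23.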
• 17. Range: out-of-range values
explain plan set statement_id = '12' for select month_no from audience where month_no > 12;
explain plan set statement_id = '13' for select month_no from audience where month_no > 13;
explain plan set statement_id = '14' for select month_no from audience where month_no > 14;
explain plan set statement_id = '15' for select month_no from audience where month_no > 15;
explain plan set statement_id = '16' for select month_no from audience where month_no > 16;
explain plan set statement_id = '17' for select month_no from audience where month_no > 17;
explain plan set statement_id = '18' for select month_no from audience where month_no > 18;
explain plan set statement_id = '19' for select month_no from audience where month_no > 19;
explain plan set statement_id = '20' for select month_no from audience where month_no > 20;
• 18. sid@CS10G> select statement_id, cardinality from plan_table where id=1 order by statement_id;
STATEMENT_ID CARDINALITY
------------ -----------
12 100
13 91
14 82
15 73
16 64
17 55
18 45
19 36
20 27
9 rows selected.
• 19. The farther a value lies outside the low/high range, the fewer rows the optimizer expects to find.
If a predicate uses a sequence-based or time-based column (such as a last-modified date, last_mod_dt) and the statistics have not been kept up to date,
the estimated cardinality drops as time passes, for both equality and range predicates on that column.
• 21. sid@CS10G> select month_no from audience where month_no > 16;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2423062965
-------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-------------------------------------+-----------------------------------+
| 0 | SELECT STATEMENT | | | | 3 | |
| 1 | TABLE ACCESS FULL | AUDIENCE| 64 | 192 | 3 | 00:00:01 |
-------------------------------------+-----------------------------------+
Predicate Information:
----------------------
1 - filter("MONTH_NO">16)
• 22. Using the 10053 event to confirm
sid@CS10G> @53on
alter session set events '10053 trace name context forever, level 1';
Session altered.
sid@CS10G> explain plan for select month_no from audience where month_no > 16;
Explained.
sid@CS10G> @53off
sid@CS10G> @tracefile
TRACEFILE
----------------------------------------------------------------------------------------------------
• 23. SINGLE TABLE ACCESS PATH
-----------------------------------------
BEGIN Single Table Cardinality Estimation
-----------------------------------------
Column (#1): MONTH_NO(NUMBER)
AvgLen: 3.00 NDV: 12 Nulls: 0 Density: 0.083333 Min: 1 Max: 12
Using prorated density: 0.05303 of col #1 as selectivity of out-of-range value pred
Table: AUDIENCE Alias: AUDIENCE
Card: Original: 1200 Rounded: 64 Computed: 63.64 Non Adjusted: 63.64
-----------------------------------------
END Single Table Cardinality Estimation
-----------------------------------------
• 24. Between: out-of-range values
explain plan set statement_id = '12' for select month_no from audience where month_no between 13 and 15;
explain plan set statement_id = '14' for select month_no from audience where month_no between 14 and 16;
explain plan set statement_id = '15' for select month_no from audience where month_no between 15 and 17;
explain plan set statement_id = '16' for select month_no from audience where month_no between 13 and 20;
explain plan set statement_id = '17' for select month_no from audience where month_no between 14 and 21;
explain plan set statement_id = '18' for select month_no from audience where month_no between 15 and 22;
explain plan set statement_id = '19' for select month_no from audience where month_no between 16 and 23;
• 25. sid@CS10G> select statement_id, cardinality from plan_table where id=1 order by statement_id;
STATEMENT_ID CARDINALITY
------------ -----------
12 100
13 100
14 100
15 100
16 100
17 100
18 100
19 100
20 100
9 rows selected.
• 26. Case: incorrect low/high value
• 27. SELECT count('1') RECCOUNT
FROM Test_ILM_INTERACTION t0
JOIN Test_ILM_INTERACTION_TYPE t1 ON t0.INTERACTION_TYP =
t1.INTERACTION_TYP
JOIN Test_ILM_INTERACTION_REF t3 ON t0.interaction_uuid =
t3.interaction_uuid
WHERE t1.IS_VIEWABLE = 1
AND ((t0.DOMAIN_NME = 'DOMAIN_A') or
(T0.DOMAIN_NME = 'DOMAIN_B' AND
T0.APPLICATION_NME = 'APPLICATION_C'))
AND (t3.REF_CDE = 'BK_NUMBER' AND t3.REF_KEY_VALUE = '2389301444')
AND t0.INTERACTION_DT BETWEEN
TO_DATE('01-06-2011 16:00:00', 'DD-MM-YYYY HH24:MI:SS')
AND
TO_DATE('16-06-2011 15:59:59', 'DD-MM-YYYY HH24:MI:SS')
• 28. ---------------------------------------------------------------------+----------------
| Id | Operation | Name | Rows | Cost |
---------------------------------------------------------------------+----------------
| 0 | SELECT STATEMENT | | | 13 |
| 1 | SORT AGGREGATE | | 1 | |
| 2 | NESTED LOOPS | | 1 | 13 |
| 3 | NESTED LOOPS | | 1 | 12 |
| 4 | INDEX RANGE SCAN | Test_ILM_INTERACTION_IDX3 | 4 | 4 |
| 5 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_REF_PK| 1 | 2 |
| 6 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 |
| 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 |
---------------------------------------------------------------------+----------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T0"."INTERACTION_DT">=TO_DATE('2011-06-01 05:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
"T0"."INTERACTION_DT"<=TO_DATE('2011-06-16 04:59:59', 'yyyy-mm-dd hh24:mi:ss'))
filter("T0"."DOMAIN_NME"='DOMAIN_A' OR "T0"."APPLICATION_NME"='APPLICATION_C' AND
"T0"."DOMAIN_NME"='DOMAIN_B')
5 - access("T0"."INTERACTION_UUID"="T3"."INTERACTION_UUID" AND "T3"."REF_CDE"='BK_NUMBER' AND
"T3"."REF_KEY_VALUE"='2389301444')
6 - filter("T1"."IS_VIEWABLE"=1)
7 - access("T0"."INTERACTION_TYP"="T1"."INTERACTION_TYP")
• 29. sys@CS2PRD> @desc Testowner.Test_ilm_interaction
Column Name NUM_DISTINCT DENSITY Low High
------------------ ------------ ---------- -------------------- --------------------
INTERACTION_DT 7583898 1.3186E-07 2010-05-07 23:45:47 2011-05-31 15:31:35
sys@CS2PRD> @ind Testowner.Test_ilm_interaction
TABLE_NAME INDEX_NAME POS# COLUMN_NAME
----------------------- --------------------------- ---- ----------------
Test_ILM_INTERACTION Test_ILM_INTERACTION_IDX3 1 INTERACTION_DT
2 COMPANY_ID
3 APPLICATION_NME
4 DOMAIN_NME
5 INTERACTION_TYP
6 INTERACTION_UUID
• 30. select to_date(2455714,'J') max_date from dual;
MAX_DATE
-------------------
2011-06-01 00:00:00
10053 trace
Column (#4): INTERACTION_DT(DATE)
AvgLen: 8.00 NDV: 7583898 Nulls: 0 Density: 1.3186e-07 Min: 2455325 Max: 2455714
SINGLE TABLE ACCESS PATH
Using prorated density: 1.3151e-07 of col#4 as selectivity of out-of-range value pred
Table: Test_ILM_INTERACTION Alias: T0
Card: Original: 53138344 Rounded: 4 Computed: 3.75 Non Adjusted: 3.75
Access Path: index (IndexOnly)
Index: Test_ILM_INTERACTION_IDX3
resc_io: 4.00 resc_cpu: 29886
ix_sel: 1.3151e-07 ix_sel_with_filters: 7.0628e-08
Cost: 4.00 Resp: 4.00 Degree: 1
****** trying bitmap/domain indexes ******
Best:: AccessPath: IndexRange Index: Test_ILM_INTERACTION_IDX3
Cost: 4.00 Degree: 1 Resp: 4.00 Card: 3.75 Bytes: 0
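The Min and Max values in the trace are Julian day numbers, which is why the slide converts 2455714 with the 'J' format. The same conversion can be sketched in Python; the offset constant below is derived from Python's ordinal calendar and the helper name is illustrative.

```python
from datetime import date

# Offset between a Julian day number and Python's proleptic Gregorian ordinal
# (date.fromordinal(1) is 0001-01-01, whose Julian day number is 1721426).
JULIAN_TO_ORDINAL = 1721425

def julian_day_to_date(julian_day: int) -> date:
    """Convert an integer Julian day number to a calendar date."""
    return date.fromordinal(julian_day - JULIAN_TO_ORDINAL)

trace_min = julian_day_to_date(2455325)
trace_max = julian_day_to_date(2455714)
print(trace_min, trace_max)  # 2010-05-08 2011-06-01
```

The query's date window starts on 2011-06-01 at 16:00, which is later than the recorded high value, so the whole predicate is out of range and the optimizer falls back to the prorated density shown in the trace.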
• 31. Solution: dbms_stats.set_column_stats
m_val_array := DBMS_STATS.DATEARRAY(m_low, m_high);
dbms_stats.prepare_column_values(
srec => m_statrec,
datevals => m_val_array
);
dbms_stats.set_column_stats(
ownname => NULL,
tabname => '&&TABLE_NAME',
colname => '&&COLUMN_NAME',
distcnt => m_distcnt,
density => m_density,
nullcnt => m_nullcnt,
srec => m_statrec,
avgclen => m_avgclen,
no_invalidate => false
);
See hack_stats_3.sql for the full script.
• 32. sys@CS2PRD> @desc Testowner.Test_ilm_interaction
Column Name NUM_DISTINCT DENSITY Low High
------------------ ------------ ---------- -------------------- --------------------
INTERACTION_DT 7583898 1.3186E-07 2010-05-07 23:45:47 2011-06-30 15:31:35
• 33. ---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)|
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 14 (0)|
| 1 | SORT AGGREGATE | | 1 | |
| 2 | NESTED LOOPS | | 1 | 14 (0)|
| 3 | NESTED LOOPS | | 1 | 13 (0)|
| 4 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION_REF| 1 | 10 (0)|
|* 5 | INDEX RANGE SCAN | Test_ILM_INTERACTION_REF_IDX1 | 9 | 4 (0)|
|* 6 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION| 1 | 3 (0)|
|* 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_PK| 1 | 2 (0)|
|* 8 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 (0)|
|* 9 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 (0)|
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("T3"."REF_CDE"='BK_NUMBER' AND "T3"."REF_KEY_VALUE"='2389301444')
6 - filter("T0"."INTERACTION_DT">=TO_DATE('2011-06-01 16:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
("T0"."DOMAIN_NME"='DOMAIN_A' OR "T0"."APPLICATION_NME"='APPLICATION_C' AND
"T0"."DOMAIN_NME"='DOMAIN_B') AND "T0"."INTERACTION_DT"<=TO_DATE('2011-06-16 15:59:59', 'yyyy-mm-dd
hh24:mi:ss'))
7 - access("T0"."INTERACTION_UUID"="T3"."INTERACTION_UUID")
8 - filter("T1"."IS_VIEWABLE"=1)
9 - access("T0"."INTERACTION_TYP"="T1"."INTERACTION_TYP")
• 34. select to_date(2455744,'J') max_date from dual;
MAX_DATE
-------------------
2011-07-01 00:00:00
10053 trace
***************************************
SINGLE TABLE ACCESS PATH
Column (#9): DOMAIN_NME(VARCHAR2)
AvgLen: 12.00 NDV: 10 Nulls: 53692 Density: 9.3545e-09
Histogram: Freq #Bkts: 10 UncompBkts: 5973 EndPtVals: 10
Column (#2): APPLICATION_NME(VARCHAR2)
AvgLen: 17.00 NDV: 30 Nulls: 0 Density: 9.3451e-09
Histogram: Freq #Bkts: 30 UncompBkts: 5979 EndPtVals: 30
Column (#4): INTERACTION_DT(DATE)
AvgLen: 8.00 NDV: 7583898 Nulls: 0 Density: 1.3186e-07 Min: 2455325 Max: 2455744
Table: Test_ILM_INTERACTION Alias: T0
Card: Original: 53138344 Rounded: 1024006 Computed: 1024006.31 Non Adjusted: 1024006.31
Access Path: index (IndexOnly)
Index: Test_ILM_INTERACTION_IDX3
resc_io: 28694.00 resc_cpu: 574312799
ix_sel: 0.035829 ix_sel_with_filters: 0.019271
Cost: 28722.69 Resp: 28722.69 Degree: 1
****** trying bitmap/domain indexes ******
Best:: AccessPath: IndexRange Index: Test_ILM_INTERACTION_IDX3
Cost: 28722.69 Degree: 1 Resp: 28722.69 Card: 1024006.31 Bytes: 0
***************************************
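The figures in the two traces are consistent with the computed cardinality being the original row count multiplied by `ix_sel_with_filters`. The Python sketch below only replays that multiplication with the values printed in the traces; it is a check on the numbers, not a description of how the optimizer derives the selectivity.

```python
NUM_ROWS = 53138344  # "Card: Original" in both traces

def estimated_rows(num_rows: int, selectivity: float) -> float:
    """Rows expected after applying a selectivity to the table's row count."""
    return num_rows * selectivity

before = estimated_rows(NUM_ROWS, 7.0628e-08)  # ix_sel_with_filters with the stale high value
after = estimated_rows(NUM_ROWS, 0.019271)     # ix_sel_with_filters with the corrected high value

print(f"before: {before:.2f}")
print(f"after:  {after:.0f}")
print(f"ratio:  {after / before:.0f}x")
```

The first result matches the trace's 3.75. The second lands within a fraction of a percent of the trace's 1024006.31; the small gap comes from the selectivity being printed to only a few significant digits. The difference of several orders of magnitude between the two estimates is what moves the optimizer away from driving the join with Test_ILM_INTERACTION_IDX3.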
• 35. Performance after tuning
• 36. The functional scripts
• 37. Thanks Q&A
• 38. ---------------------------------------------------------------------+----------------
| Id | Operation | Name | Rows | Cost |
---------------------------------------------------------------------+----------------
| 0 | SELECT STATEMENT | | | 13 |
| 1 | SORT AGGREGATE | | 1 | |
| 2 | NESTED LOOPS | | 1 | 13 |
| 3 | NESTED LOOPS | | 1 | 12 |
| 4 | INDEX RANGE SCAN | Test_ILM_INTERACTION_IDX3 | 4 | 4 |
| 5 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_REF_PK| 1 | 2 |
| 6 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 |
| 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 |
---------------------------------------------------------------------+----------------
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)|
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 14 (0)|
| 1 | SORT AGGREGATE | | 1 | |
| 2 | NESTED LOOPS | | 1 | 14 (0)|
| 3 | NESTED LOOPS | | 1 | 13 (0)|
| 4 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION_REF| 1 | 10 (0)|
|* 5 | INDEX RANGE SCAN | Test_ILM_INTERACTION_REF_IDX1 | 9 | 4 (0)|
|* 6 | TABLE ACCESS BY INDEX ROWID| Test_ILM_INTERACTION| 1 | 3 (0)|
|* 7 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_PK| 1 | 2 (0)|
|* 8 | TABLE ACCESS BY INDEX ROWID | Test_ILM_INTERACTION_TYPE| 1 | 1 (0)|
|* 9 | INDEX UNIQUE SCAN | Test_ILM_INTERACTION_TYPE_PK| 1 | 0 (0)|
---------------------------------------------------------------------------------------------
• 39. B-tree Access Cost
cost =
blevel +
ceiling(leaf_blocks * effective index selectivity) +
ceiling(clustering_factor * effective table selectivity)
=
blevel +
ceiling(leaf_blocks * ix_sel) +
ceiling(clustering_factor * ix_sel_with_filters)
• 40. Index Stats::
Index: Test_ILM_INTERACTION_IDX3 Col#: 4 10 2 9 7 1
LVLS: 3 #LB: 800771 #DK: 51629818 LB/K: 1.00 DB/K: 1.00 CLUF: 27495746.00
Access Path: index (IndexOnly)
Index: Test_ILM_INTERACTION_IDX3
resc_io: 4.00 resc_cpu: 29886
ix_sel: 1.3151e-07 ix_sel_with_filters: 7.0734e-08
Cost: 4.00 Resp: 4.00 Degree: 1
sys@CS10G> select 3 + ceil(800771 * 0.00000013151) cost from dual;
COST
----------
4
Index only, no need to access table
• 41. Index Stats::
Index: Test_ILM_INTERACTION_REF_IDX1 Col#: 2 3
LVLS: 3 #LB: 872428 #DK: 10567436 LB/K: 1.00 DB/K: 5.00 CLUF: 60057635.00
Access Path: index (AllEqRange)
Index: Test_ILM_INTERACTION_REF_IDX1
resc_io: 10.00 resc_cpu: 74724
ix_sel: 9.4630e-08 ix_sel_with_filters: 9.4630e-08
Cost: 10.00 Resp: 10.00 Degree: 1
sys@CS10G> select 3 + ceil(872428 * 0.00000009463) + ceil(60057635 * 0.00000009463) cost from dual;
COST
----------
10
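Both worked checks above apply the same B-tree cost formula, which can be captured in one small function. This is a Python sketch of the formula from slide 39 using the values printed in the two traces; the function and parameter names are illustrative, and the `index_only` flag is an assumption that models the "no need to access table" case by dropping the table-visit term.

```python
import math

def btree_access_cost(blevel, leaf_blocks, clustering_factor,
                      ix_sel, ix_sel_with_filters, index_only=False):
    """I/O cost from the slide's formula; the table-visit term is skipped for index-only access."""
    cost = blevel + math.ceil(leaf_blocks * ix_sel)
    if not index_only:
        cost += math.ceil(clustering_factor * ix_sel_with_filters)
    return cost

# Test_ILM_INTERACTION_IDX3: LVLS 3, #LB 800771, CLUF 27495746, index-only access
idx3_cost = btree_access_cost(3, 800771, 27495746, 1.3151e-07, 7.0734e-08, index_only=True)

# Test_ILM_INTERACTION_REF_IDX1: LVLS 3, #LB 872428, CLUF 60057635
ref_idx1_cost = btree_access_cost(3, 872428, 60057635, 9.4630e-08, 9.4630e-08)

print(idx3_cost, ref_idx1_cost)  # 4 10
```

Because each term is rounded up, even a tiny selectivity contributes at least one leaf block, which is why the stale-statistics plan still costs 4 rather than 3.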