This document discusses an issue where the same SQL statement uses two different execution plans on a PRODUCTION database. The SQL performs an update on the TBL_XXX table. One plan uses a full table scan while the other uses an index range scan on the primary key index TBL_XXX_PK. Statistics show the primary key index has a very high clustering factor, indicating that the data is scattered across blocks, likely due to row migration from updates. A test case confirms around 200 rows span multiple blocks. Adjusting an optimizer parameter reduces the cost of the index plan, but a better solution is needed to address the underlying data distribution issue causing row migration.
Checking clustering factor to detect row migration
Today, while checking the PRODUCTION DB, I saw 1 SQL_ID with 2 different execution plans as below:

SQL Plan

Plan hash value: 561816070

-------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows | Bytes | Cost |
-------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |         |      |       |   78 |
|   1 |  UPDATE            | TBL_XXX |      |       |      |
|   2 |   TABLE ACCESS FULL| TBL_XXX |  236 | 14632 |   78 |
-------------------------------------------------------------------------

Plan hash value: 2757418408

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows | Bytes | Cost |
--------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT             |            |      |       |    1 |
|   1 |  UPDATE                      | TBL_XXX    |      |       |      |
|   2 |   TABLE ACCESS BY INDEX ROWID| TBL_XXX    |    1 |    72 |    1 |
|   3 |    INDEX RANGE SCAN          | TBL_XXX_PK |    1 |       |    1 |
--------------------------------------------------------------------------------------
I/O Comparison for SELECT Statement

SQL> select /*+ full(TBL_XXX) */ * from TBL_XXX where site_id=234;

8996 rows selected.

Elapsed: 00:00:00.51

Execution Plan
----------------------------------------------------------
Plan hash value: 1988479474

------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows | Bytes | Cost |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         | 8463 |  520K |  493 |
|*  1 |  TABLE ACCESS FULL| TBL_XXX | 8463 |  520K |  493 |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("SITE_ID"=234)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       8668  consistent gets
          0  physical reads
          0  redo size
     542896  bytes sent via SQL*Net to client
       7081  bytes received via SQL*Net from client
        601  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       8996  rows processed
SQL> select /*+ index(TBL_XXX,TBL_XXX_pk) */ * from TBL_XXX where site_id=234;
8996 rows selected.
Elapsed: 00:00:01.38
Execution Plan
----------------------------------------------------------
Plan hash value: 2363445147

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows | Bytes | Cost |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            | 8463 |  520K |  499 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TBL_XXX    | 8463 |  520K |  499 |
|*  2 |   INDEX RANGE SCAN          | TBL_XXX_PK | 8463 |       |    5 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("SITE_ID"=234)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       6179  consistent gets
         67  physical reads
          0  redo size
     534539  bytes sent via SQL*Net to client
       7081  bytes received via SQL*Net from client
        601  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       8996  rows processed
Looking at the average execution time, this is fine (still in milliseconds), and it doesn't execute too frequently (232 times every 15 minutes).
The first plan (2757418408) uses an Index Range Scan on the PK while the other (561816070) uses a FTS.
Checking the data dictionary, I found that the FTS is used for the schemas whose tables have statistics, while the Index Range Scan is used for the tables with no statistics on them.
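Something like the following against V$SQL is a quick way to see which parsing schema compiled which plan (a sketch; the sql_id '063hrvz3s17vk' is inferred from the SQL profile name used later in this note, so treat it as an assumption):

-- One row per child cursor; PARSING_SCHEMA_NAME shows which account
-- compiled each plan, PLAN_HASH_VALUE shows which plan it got.
select child_number,
       plan_hash_value,
       parsing_schema_name,
       executions,
       buffer_gets
  from v$sql
 where sql_id = '063hrvz3s17vk'
 order by child_number;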
Table and Index Statistics

SQL> select owner,clustering_factor,NUM_ROWS,LEAF_BLOCKS,DISTINCT_KEYS,last_analyzed from
     dba_indexes where index_name='TBL_XXX_1IX';

OWNER      CLUSTERING_FACTOR   NUM_ROWS LEAF_BLOCKS DISTINCT_KEYS LAST_ANAL
---------- ----------------- ---------- ----------- ------------- ---------
SCHEMA1                21490     592461        5606        592461 11-MAR-13
SCHEMA2
SCHEMA3
SCHEMA4
SCHEMA5
SCHEMA6               105009    3179805       26048       3179805 08-NOV-12
SCHEMA7
SCHEMA8
SCHEMA9
SCHEMA10
SCHEMA11                8853     343400         887        343400 15-SEP-11

11 rows selected.
SQL> select owner,clustering_factor,NUM_ROWS,LEAF_BLOCKS,DISTINCT_KEYS,last_analyzed from
     dba_indexes where index_name='TBL_XXX_PK';

OWNER      CLUSTERING_FACTOR   NUM_ROWS LEAF_BLOCKS DISTINCT_KEYS LAST_ANAL
---------- ----------------- ---------- ----------- ------------- ---------
SCHEMA1               335430     575315        2986        575315 11-MAR-13
SCHEMA2                53205     118276         502        118276 30-JUN-10
SCHEMA3
SCHEMA4
SCHEMA5
SCHEMA6              1783417    3093008       13264       3093008 08-NOV-12
SCHEMA7                53205     118276         502        118276 30-JUN-10
SCHEMA8

8 rows selected.
SQL> select OWNER,BLOCKS,NUM_ROWS,LAST_ANALYZED from dba_tables where table_name='TBL_XXX';

OWNER          BLOCKS   NUM_ROWS LAST_ANAL
---------- ---------- ---------- ---------
SCHEMA10
SCHEMA9
SCHEMA8
SCHEMA7
SCHEMA6          1258     235533 30-JUN-10
SCHEMA5         27445    3122013 08-NOV-12
SCHEMA4
SCHEMA3
SCHEMA2
SCHEMA2
SCHEMA1          8065     575315 11-MAR-13
I tried to solve this issue by creating a SQL Profile (using my normal template) for this SQL_ID to force the Index Range Scan, but it doesn't work for the SCHEMA1 account.
Based on the data distribution below, an index scan should be much more efficient: it will scan approx. 1.5% of the total population (roughly 8,990 of 575,315 rows), since there are 64 distinct values for SITE_ID with "evenly" distributed data.
I have also updated the statistics, but it still doesn't work.
The question is WHY???
Data Distribution

SQL> select SITE_ID,count(*) from schema1.TBL_XXX group by SITE_ID;

   SITE_ID   COUNT(*)
---------- ----------
       100       8987
       201       8993
       202       8993
       203       8993
       204       8992
...cut here to save the space...
       257       8987
       258       8987
       289       8987
       292       8987
       295       8987
      1000          1

65 rows selected.
SQL Profile

begin
  dbms_sqltune.import_sql_profile(
    name => 'tbl_xxx_063hrvz3s17vk',
    sql_text =>
      'UPDATE TBL_XXX SET EXP_DATE = :closingTime, SYS_UPDATE_DATE =
      sysdate,OPERATOR_ID = :OPERATOR_ID,APPLICATION_ID =
      :APPLICATION_ID,DL_SERVICE_CODE = :DL_SERVICE_CODE,DL_UPDATE_STAMP =
      :DL_UPDATE_STAMP WHERE EXP_DATE IS NULL AND SITE_ID = :siteID',
    profile => sqlprof_attr(
      'INDEX_RS_ASC(@"UPD$1" "TBL_XXX"@"UPD$1"',
      ' ("TBL_XXX"."SITE_ID" "TBL_XXX"."INTERVAL_ID"))',
      'OUTLINE_LEAF(@"UPD$1")',
      'IGNORE_OPTIM_EMBEDDED_HINTS'),
    force_match => TRUE
  );
end;
/
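To double-check that the profile landed and is enabled, a quick look at DBA_SQL_PROFILES (a sketch):

-- STATUS should be ENABLED and FORCE_MATCHING should be YES
-- (force_match => TRUE was used above)
select name, status, force_matching, created
  from dba_sql_profiles
 where name = 'tbl_xxx_063hrvz3s17vk';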
SQL> conn schema1/passwd
Connected.
SQL> explain plan for
2 UPDATE TBL_XXX SET EXP_DATE = :closingTime, SYS_UPDATE_DATE =
3 sysdate,OPERATOR_ID = :OPERATOR_ID,APPLICATION_ID =
4 :APPLICATION_ID,DL_SERVICE_CODE = :DL_SERVICE_CODE,DL_UPDATE_STAMP =
5 :DL_UPDATE_STAMP WHERE EXP_DATE IS NULL AND SITE_ID = :siteID;
Explained.
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
Plan hash value: 561816070

-------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows | Bytes | Cost |
-------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |         |    1 |    37 |  493 |
|   1 |  UPDATE            | TBL_XXX |      |       |      |
|*  2 |   TABLE ACCESS FULL| TBL_XXX |    1 |    37 |  493 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("EXP_DATE" IS NULL AND "SITE_ID"=TO_NUMBER(:SITEID))

Note
-----
   - cpu costing is off (consider enabling it)
   - SQL profile "tbl_xxx_063hrvz3s17vk" used for this statement
Going back to the table and index statistics, I was curious about the value of clustering_factor, and it gave me some clues.
It is number time (if you subscribe to the "Baby First TV" channel, you will be familiar with that song :) )
OWNER      INDEX_NAME   CLUSTERING_FACTOR   NUM_ROWS LEAF_BLOCKS DISTINCT_KEYS LAST_ANAL
---------- ------------ ----------------- ---------- ----------- ------------- ---------
SCHEMA1    TBL_XXX_1IX              21490     592461        5606        592461 11-MAR-13
SCHEMA1    TBL_XXX_PK              335430     575315        2986        575315 11-MAR-13

OWNER      TABLE_NAME       BLOCKS   NUM_ROWS LAST_ANAL
---------- ------------ ---------- ---------- ---------
SCHEMA1    TBL_XXX            8065     575315 11-MAR-13
There are 8,065 blocks in TBL_XXX holding 575,315 rows (approx. 71 rows per block).
If we take a look at the index statistics, the clustering_factor of TBL_XXX_PK is 335,430, which is close to the number of rows rather than the number of blocks.
P.S. For the clustering_factor topic, I have explained it (in very small pieces) in my "Introduction to Oracle Optimizer" PPT, or better, google it.
What I can say after seeing those statistics is that the data in the table is scattered, from the TBL_XXX_PK point of view.
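One way to see that scatter directly is to count how many distinct blocks hold the rows for each SITE_ID, for example with DBMS_ROWID (a sketch, run as SCHEMA1). At roughly 71 rows per block, well-clustered data would pack the ~8,990 rows of one SITE_ID into about 130 blocks; a figure in the thousands means the rows are smeared across the table:

-- Counts the distinct data blocks occupied by each SITE_ID's rows
select site_id,
       count(*) as total_rows,
       count(distinct dbms_rowid.rowid_block_number(rowid)) as distinct_blocks
  from schema1.tbl_xxx
 group by site_id
 order by site_id;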
The side effect is that Oracle thinks it will require more I/O to get the data when we use index access compared to a FTS (you can confirm that by looking at the output of the explain plans below: the cost of the FTS [493] is smaller than the cost of the Index Range Scan [530]). The word thinks here refers to the Oracle cost-based optimizer calculation. You can check how the COST is calculated for a FTS and an Index Range Scan on the web (googling) or in my above PPT.
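As a rough sanity check, the commonly cited I/O-based cost formula for an index range scan (cost ≈ blevel + ceil(selectivity * leaf_blocks) + ceil(selectivity * clustering_factor), scaled by optimizer_index_cost_adj / 100) reproduces the numbers here fairly well. The blevel of 2 and the flat 1/64 selectivity for SITE_ID below are my assumptions, not values taken from the listings:

sel  = 1/64 ≈ 0.015625                          (64 "evenly" distributed SITE_IDs)
cost ≈ blevel + ceil(sel * LEAF_BLOCKS) + ceil(sel * CLUSTERING_FACTOR)
     ≈ 2 + ceil(0.015625 * 2986) + ceil(0.015625 * 335430)
     ≈ 2 + 47 + 5242 = 5291
     with optimizer_index_cost_adj = 10:  5291 * 10 / 100 ≈ 529  (close to the 530 below)

The clustering_factor term dominates everything else, which is exactly why the FTS cost of 493 wins.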
Explain Plan Output

SQL> explain plan for
  2  UPDATE /*+ index(TBL_XXX, TBL_XXX_PK) */ TBL_XXX
  3  SET EXP_DATE = :closingTime, SYS_UPDATE_DATE =
  4  sysdate,OPERATOR_ID = :OPERATOR_ID,APPLICATION_ID =
  5  :APPLICATION_ID,DL_SERVICE_CODE = :DL_SERVICE_CODE,DL_UPDATE_STAMP =
  6  :DL_UPDATE_STAMP WHERE EXP_DATE IS NULL AND SITE_ID = :siteID;

Explained.

Elapsed: 00:00:00.05
SQL> select * from table(dbms_xplan.display(null,null,'ADVANCED'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
Plan hash value: 2757418408

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows | Bytes | Cost |
--------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT             |            |    1 |    37 |  530 |
|   1 |  UPDATE                      | TBL_XXX    |      |       |      |
|*  2 |   TABLE ACCESS BY INDEX ROWID| TBL_XXX    |    1 |    37 |  530 |
|*  3 |    INDEX RANGE SCAN          | TBL_XXX_PK | 8989 |       |    5 |
--------------------------------------------------------------------------------------
SQL> explain plan for
  2  UPDATE TBL_XXX
  3  SET EXP_DATE = :closingTime, SYS_UPDATE_DATE =
  4  sysdate,OPERATOR_ID = :OPERATOR_ID,APPLICATION_ID =
  5  :APPLICATION_ID,DL_SERVICE_CODE = :DL_SERVICE_CODE,DL_UPDATE_STAMP =
  6  :DL_UPDATE_STAMP WHERE EXP_DATE IS NULL AND SITE_ID = :siteID;

Explained.

Elapsed: 00:00:00.04

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
Plan hash value: 561816070

-------------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------
|   0 | UPDATE STATEMENT    |         |     1 |    37 |   493 |
|   1 |  UPDATE             | TBL_XXX |       |       |       |
|*  2 |   TABLE ACCESS FULL | TBL_XXX |     1 |    37 |   493 |
-------------------------------------------------------------------------
As a temporary solution, I have recreated the SQL profile with the additional information shown below (setting the optimizer_index_cost_adj parameter to half of its previous value, which was 10), to tell Oracle that index access is half as expensive as before.
The cost is thereby reduced from 530 (see the explain plan above) to 265.
Simple math: (5 / 10) * 530 = 265
Where:
5 is the new value of optimizer_index_cost_adj
10 is the previous value of optimizer_index_cost_adj
530 is the previous COST of the Index Range Scan
exec dbms_sqltune.drop_sql_profile('tbl_xxx_063hrvz3s17vk');

begin
  dbms_sqltune.import_sql_profile(
    name     => 'tbl_xxx_063hrvz3s17vk',
    sql_text =>
      'UPDATE TBL_XXX SET EXP_DATE = :closingTime, SYS_UPDATE_DATE =
       sysdate,OPERATOR_ID = :OPERATOR_ID,APPLICATION_ID =
       :APPLICATION_ID,DL_SERVICE_CODE = :DL_SERVICE_CODE,DL_UPDATE_STAMP =
       :DL_UPDATE_STAMP WHERE EXP_DATE IS NULL AND SITE_ID = :siteID',
    profile  => sqlprof_attr(
      'INDEX_RS_ASC(@"UPD$1" "TBL_XXX"@"UPD$1" ("TBL_XXX"."SITE_ID" "TBL_XXX"."INTERVAL_ID"))',
      'OUTLINE_LEAF(@"UPD$1")',
      'OPT_PARAM(''optimizer_index_cost_adj'' 5)',
      'IGNORE_OPTIM_EMBEDDED_HINTS')
  );
end;
/
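If the import succeeds, the profile should be visible in the data dictionary; a quick sanity check against the standard DBA_SQL_PROFILES view could look like this:

select name, status, created
from   dba_sql_profiles
where  name = 'tbl_xxx_063hrvz3s17vk';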
SQL> conn schema1/passwd

SQL> explain plan for
  2  UPDATE TBL_XXX
  3  SET EXP_DATE = :closingTime, SYS_UPDATE_DATE =
  4  sysdate,OPERATOR_ID = :OPERATOR_ID,APPLICATION_ID =
  5  :APPLICATION_ID,DL_SERVICE_CODE = :DL_SERVICE_CODE,DL_UPDATE_STAMP =
  6  :DL_UPDATE_STAMP WHERE EXP_DATE IS NULL AND SITE_ID = :siteID;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
Plan hash value: 2757418408

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost  |
--------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT             |            |     1 |    37 |   265 |
|   1 |  UPDATE                      | TBL_XXX    |       |       |       |
|*  2 |   TABLE ACCESS BY INDEX ROWID| TBL_XXX    |     1 |    37 |   265 |
|*  3 |    INDEX RANGE SCAN          | TBL_XXX_PK |  8989 |       |     3 |
--------------------------------------------------------------------------------------
BUT, setting optimizer_index_cost_adj to a very low value (as I did), e.g. 5, is not "a proper way" if we don't have any idea about the data distribution. Of course we could set the value to 6 or 7 or any value below 10 just to reduce the calculated COST of the Index Scan, but that only papers over the problem.
So I will continue my investigation by focusing on the huge value of the clustering factor.
When I first saw that huge clustering factor, there was only one thing on my mind: this is a common issue with application design (table structure) and we might be hitting a "ROW MIGRATION" issue.
How come a Primary Key has such a huge clustering factor? It should be lower (close to the number of blocks) if we design the table "properly" (pctfree, initrans, etc).
This table has the "default" value of PCTFREE (10), and it seems the application behaves like this (I am guessing here, so it needs to be checked with the application team):
- The initial insert sets a value for only "several" columns (leaving NULL in most of the remaining ones)
- A later update then sets the "previously empty" columns to some value
With the above behavior, a lot of "short and incomplete" rows will sit in every single block, and once the UPDATE comes, the block no longer has enough free space to hold the grown row, so the row gets migrated into another block.
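This pattern is easy to reproduce on a scratch table. The sketch below is illustrative only (the table name, column size and row count are made up) and mimics the suspected insert-then-fill behavior; after the UPDATE, the CHAIN_CNT statistic reports the chained/migrated rows:

create table mig_demo (id number, pad varchar2(900)) pctfree 10;

insert into mig_demo (id)                        -- "short" rows, pad left NULL
select level from dual connect by level <= 5000;
commit;

update mig_demo set pad = rpad('x', 900, 'x');   -- rows grow beyond the block's free space
commit;

analyze table mig_demo compute statistics;       -- populates CHAIN_CNT
select chain_cnt from user_tables where table_name = 'MIG_DEMO';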
Let the "NUMBERS" confirm it with the test case below. The purpose of this case is to check the value of 2 session statistics: "table fetch by rowid" and "table fetch continued row".
1. First I check the initial value of those 2 statistics
2. Then I do a FTS on TBL_XXX and see only a "small" increment in "table fetch continued row" (a full scan does not follow migrated-row pointers, so this probably indicates a few genuinely chained rows)
3. Finally, I force an Index Range Scan on TBL_XXX_PK and "table fetch continued row" jumps from 14 to 217
It confirms that there are, at least, approx. 200 rows spanning more than 1 block (migrated).
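Another way to get the same confirmation (not used in the test case below) is Oracle's chained-rows report, which assumes the CHAINED_ROWS table has already been created with the standard utlchain.sql script:

-- @?/rdbms/admin/utlchain.sql creates the CHAINED_ROWS table
analyze table TBL_XXX list chained rows into CHAINED_ROWS;

select count(*) from CHAINED_ROWS where table_name = 'TBL_XXX';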
The Test Case

Initial Value

SQL> select name,class,value
  2  from v$sesstat a,v$statname b
  3  where a.STATISTIC#=b.STATISTIC# and name like 'table fetch%'
  4  and sid=9868;

NAME                                                CLASS      VALUE
---------------------------------------------- ---------- ----------
table fetch by rowid                                   64         64
table fetch continued row                              64          0
Full Table Scan

SQL> select * from TBL_XXX where site_id=100;

9129 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1988479474

------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |  9751 |   599K|   493 |
|*  1 |  TABLE ACCESS FULL| TBL_XXX |  9751 |   599K|   493 |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SITE_ID"=100)

Statistics
----------------------------------------------------------
        267  recursive calls
          0  db block gets
       8736  consistent gets
       4744  physical reads
          0  redo size
     541965  bytes sent via SQL*Net to client
       7180  bytes received via SQL*Net from client
        610  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       9129  rows processed
SQL> select name,class,value
  2  from v$sesstat a,v$statname b
  3  where a.STATISTIC#=b.STATISTIC# and name like 'table fetch%'
  4  and sid=9868;

NAME                                                CLASS      VALUE
---------------------------------------------- ---------- ----------
table fetch by rowid                                   64       1014
table fetch continued row                              64         14
Index Range Scan

SQL> select /*+ index(TBL_XXX, TBL_XXX_pk) */ * from TBL_XXX where site_id=100;

9129 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2363445147

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |  9751 |   599K|   574 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TBL_XXX    |  9751 |   599K|   574 |
|*  2 |   INDEX RANGE SCAN          | TBL_XXX_PK |  9751 |       |     6 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SITE_ID"=100)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       6278  consistent gets
        591  physical reads
          0  redo size
     533312  bytes sent via SQL*Net to client
       7180  bytes received via SQL*Net from client
        610  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       9129  rows processed
SQL> select name,class,value
  2  from v$sesstat a,v$statname b
  3  where a.STATISTIC#=b.STATISTIC# and name like 'table fetch%'
  4  and sid=9868;

NAME                                                CLASS      VALUE
---------------------------------------------- ---------- ----------
table fetch by rowid                                   64      10143
table fetch continued row                              64        217
The Oracle documentation explains these 2 statistics: "table fetch by rowid" counts rows fetched by rowid (typically via index access), while "table fetch continued row" counts fetches where Oracle had to follow a chained or migrated row piece into another block.
The sensible solution (aligned with the above documentation):
1. Alter or reorganize the table specifying a bigger value for PCTFREE (this gives rows more room to grow because of the UPDATE statements)
2. Create a backup of TBL_XXX, truncate the original table and reinsert the data (ORDER BY SITE_ID) so that the data is physically ordered by SITE_ID; the clustering factor of the Primary Key will then be reduced
3. Rebuild the indexes
If we only move/rebuild the table/index without changing PCTFREE, there is a chance this issue will happen again in the future. A minimal sketch of the move/rebuild variant follows this paragraph.
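For completeness, a sketch of options 1 and 3 combined (the PCTFREE of 30 is illustrative only and should be derived from the expected row growth; note that a plain MOVE removes migrated rows but does not reorder the data, so option 2 is still needed to lower the clustering factor):

alter table TBL_XXX pctfree 30;    -- applies to newly formatted blocks
alter table TBL_XXX move;          -- rewrites every row, eliminating migration
alter index TBL_XXX_PK  rebuild;   -- indexes are left UNUSABLE by the move
alter index TBL_XXX_1IX rebuild;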
In the below Final Result, we can see the good results of this tuning:
- I/O is decreased from 6K to 2K consistent gets
- No row migration (from the value of the "table fetch continued row" statistic)
- Without any SQL Hint or SQL Profile, it goes for the Index Range Scan

Proposed Permanent Solution

The size of PCTFREE should be calculated based on the block size and the average row length (with all columns filled), so that we get a good value for average rows per block; a rough sizing query is sketched right after this paragraph. The "PCTFREE 10" in the test below is the default value and is used for testing only.
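A rough way to gauge the sizing inputs is sketched below; the 8192-byte block size and the 120-byte "fully filled" row length are made-up assumptions, since the real grown row length has to come from the application team:

select avg_row_len,                                   -- current average row length
       floor(8192 * (1 - 10/100) / avg_row_len) as rows_per_block_now,
       floor(8192 * (1 - 10/100) / 120)         as rows_per_block_grown
from   user_tables
where  table_name = 'TBL_XXX';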
SQL> create table dbc_TBL_XXX pctfree 10 as select * from schema1.TBL_XXX order by site_id;

Table created.

SQL> create index idx_dbc_TBL_XXX on dbc_TBL_XXX(site_id, interval_id);

Index created.

SQL> exec dbms_stats.gather_table_stats(USER, 'DBC_TBL_XXX');

PL/SQL procedure successfully completed.

SQL> select clustering_factor, last_analyzed from user_indexes where
  2  index_name='IDX_DBC_TBL_XXX';
CLUSTERING_FACTOR LAST_ANAL
----------------- ---------
           107754 17-MAR-13

SQL> select blocks,num_rows,last_analyzed from user_tables where table_name='DBC_TBL_XXX';

    BLOCKS   NUM_ROWS LAST_ANAL
---------- ---------- ---------
      5579     572435 17-MAR-13

The clustering factor has dropped from 335,430 to 107,754, and the same data now fits in 5,579 blocks instead of 8,065.
SQL> select * from dbc_TBL_XXX where site_id=100;

8941 rows selected.

Execution Plan
------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost  |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |  8944 |   550K|   172 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DBC_TBL_XXX     |  8944 |   550K|   172 |
|*  2 |   INDEX RANGE SCAN          | IDX_DBC_TBL_XXX |  8944 |       |     3 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SITE_ID"=100)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       2764  consistent gets
          3  physical reads
          0  redo size
     495600  bytes sent via SQL*Net to client
       7048  bytes received via SQL*Net from client
        598  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       8941  rows processed
Final Result

SQL> select distinct sid from v$mystat;

       SID
----------
     12359

SQL> set pages 200 lines 200
SQL> col name for a35
SQL> select name,class,value
  2  from v$sesstat a,v$statname b
  3  where a.STATISTIC#=b.STATISTIC# and name like 'table fetch%'
  4  and sid=12359;

NAME                                     CLASS      VALUE
----------------------------------- ---------- ----------
table fetch by rowid                        64         15
table fetch continued row                   64          0
SQL> select * from dbc_TBL_XXX where site_id=100;

8941 rows selected.

Execution Plan
----------------------------------------------------------

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost  |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |  8944 |   550K|   172 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DBC_TBL_XXX     |  8944 |   550K|   172 |
|*  2 |   INDEX RANGE SCAN          | IDX_DBC_TBL_XXX |  8944 |       |     3 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SITE_ID"=100)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2764  consistent gets
          0  physical reads
          0  redo size
     495600  bytes sent via SQL*Net to client
       7049  bytes received via SQL*Net from client
        598  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       8941  rows processed
SQL> select name,class,value
  2  from v$sesstat a,v$statname b
  3  where a.STATISTIC#=b.STATISTIC# and name like 'table fetch%'
  4  and sid=12359;

NAME                                     CLASS      VALUE
----------------------------------- ---------- ----------
table fetch by rowid                        64       8961
table fetch continued row                   64          0