Database statistics are not limited to tables, columns, and indexes. PL/SQL functions also have associated statistics: costs (CPU, I/O, network), selectivity, and cardinality (for functions that return collections). Their default values only loosely reflect reality, yet Oracle's cost-based optimizer always uses them to build execution plans. This session uses real-life examples to illustrate how properly managed PL/SQL statistics can significantly improve execution plans. It also demonstrates that Oracle's extensible optimizer is flexible enough to support packaged functions.
The Hidden Face of Cost-Based Optimizer: PL/SQL Specific Statistics
1. 1 of 44
The Hidden Face
of the Cost Based Optimizer:
PL/SQL-Specific Statistics
[UGF2781]
Michael Rosenblum
Dulcian, Inc.
www.dulcian.com
2. 2 of 44
Who Am I? – “Misha”
Oracle ACE
Co-author of 3 books
PL/SQL for Dummies
Expert PL/SQL Practices
Oracle PL/SQL Performance Tuning Tips & Techniques
(Rosenblum & Dorsey, Oracle Press, July 2014)
Won ODTUG 2009 Speaker of the Year
Known for:
SQL and PL/SQL tuning
Complex functionality
Code generators
Repository-based development
3. 3 of 44
Did you know that…?
User-defined functions have a number of statistics
associated with them.
These statistics impact decisions made by the Cost Based
Optimizer (CBO).
Default values of these statistics are …well…less than
adequate.
… but you can adjust them manually!
4. 4 of 44
Defaults
Hardware resources
CPU cost – 3000 [CPU instructions]
I/O cost – 0 [data blocks to be read/written]
Network cost – 0 [data blocks to be read/written]
Cardinality – 8168 [rows]
Selectivity – 1% [out of total set]
6. 6 of 44
Basic Case
Problem:
There are two functions in a SQL statement.
You want to tell the CBO that one of them is expensive.
Solution:
ASSOCIATE STATISTICS WITH FUNCTIONS f_light_tx
DEFAULT COST (0,0,0) /* CPU,IO,Network */; -- light
ASSOCIATE STATISTICS WITH FUNCTIONS f_heavy_tx
DEFAULT COST (99999,99999,99999); -- heavy
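To confirm that the associations took effect, the data dictionary can be queried. This is a minimal sketch, assuming both functions live in the current schema. USER_ASSOCIATIONS is the standard dictionary view; the DISASSOCIATE statement is left commented out and is only needed for cleanup.

```sql
-- List the default costs recorded for the two functions in the current schema.
SELECT object_name, object_type, def_cpu_cost, def_io_cost, def_net_cost
  FROM user_associations
 WHERE object_name IN ('F_LIGHT_TX', 'F_HEAVY_TX');

-- To return a function to the built-in defaults:
-- DISASSOCIATE STATISTICS FROM FUNCTIONS f_light_tx;
```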
7. 7 of 44
Impact
SQL> set autotrace on explain
SQL> SELECT count(*) FROM emp
2 WHERE f_heavy_tx(deptno) = 'A' OR f_light_tx(empno) = 'B';
COUNT(*)
----------
0
Execution Plan
-----------------------------------------------------
| Id | Operation | Name | Rows | Bytes |
---------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 |
| 1 | SORT AGGREGATE | | 1 | 7 |
|* 2 | TABLE ACCESS FULL| EMP | 1 | 7 |
---------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("F_LIGHT_TX"("EMPNO")='B' OR "F_HEAVY_TX"("DEPTNO")='A')
Change of execution
order
8. 8 of 44
Increasing Complexity
Problem:
Hardcoding high values is a cheat
… although, in some cases it may be enough.
Solution:
Get actual statistics
… if you can simulate a real world environment.
9. 9 of 44
Measuring Statistics – Sample Function
CREATE FUNCTION f_getdeptinfo_tx (i_deptno NUMBER)
RETURN VARCHAR2 IS
v_out_tx VARCHAR2(256);
BEGIN
SELECT dname
INTO v_out_tx
FROM scott.dept@remoteDB
WHERE deptno = i_deptno;
SELECT v_out_tx||':'||count(*)
INTO v_out_tx
FROM scott.emp
WHERE deptno = i_deptno;
RETURN v_out_tx;
END;
Remote call
10. 10 of 44
Measuring Statistics - Snapshot Before
SQL> SELECT f_getdeptinfo_tx (10) FROM DUAL;
SQL> SELECT a.name, b.value
2 FROM v$statname a, v$mystat b
3 WHERE a.statistic# = b.statistic#
4 AND name IN ('db block gets', -- current-mode block gets
5 'consistent gets', -- consistent-mode block gets
6 'CPU used by this session', -- CPU
7 'bytes sent via SQL*Net to dblink', -- DB-link
8 'bytes received via SQL*Net from dblink' -- DB-link
9 );
NAME VALUE
-------------------------------------- ----------
CPU used by this session 9
db block gets 12
consistent gets 226
bytes sent via SQL*Net to dblink 3459
bytes received via SQL*Net from dblink 4070
First call forces the parse;
ignore its cost
11. 11 of 44
Measuring Statistics - Snapshot After
SQL> SELECT f_getdeptinfo_tx (10) FROM DUAL;
SQL> SELECT a.name, b.value
2 FROM v$statname a, v$mystat b
3 WHERE a.statistic# = b.statistic#
4 AND name IN ('db block gets', -- current-mode block gets
5 'consistent gets', -- consistent-mode block gets
6 'CPU used by this session', -- CPU
7 'bytes sent via SQL*Net to dblink', -- DB-link
8 'bytes received via SQL*Net from dblink' -- DB-link
9 );
NAME VALUE
-------------------------------------- ----------
CPU used by this session 11 [was 9]
db block gets 12 [was 12]
consistent gets 232 [was 226]
bytes sent via SQL*Net to dblink 4113 [was 3459]
bytes received via SQL*Net from dblink 4603 [was 4070]
Real call
12. 12 of 44
Real Numbers
Difference:
CPU time = 2 hundredths of a second (0.02 s)
I/O = 6 blocks
Current-mode gets (db block gets) = 0 blocks
Consistent gets = 6 blocks
Network = 2 blocks
Sent via DBLink = 654 bytes ~ 1 block
Received via DBLink = 533 bytes ~ 1 block
Needs translation
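The byte-to-block rounding above can be reproduced in SQL. This is a sketch, assuming the session is allowed to read V$PARAMETER and that network traffic is expressed in database-block-sized units, as on this slide.

```sql
-- Round the DB-link byte deltas (654 sent, 533 received) up to whole blocks.
SELECT CEIL(654 / TO_NUMBER(value)) AS sent_blocks,
       CEIL(533 / TO_NUMBER(value)) AS received_blocks
  FROM v$parameter
 WHERE name = 'db_block_size';
```

For any supported block size (2 KB and up), both columns evaluate to 1, which gives the 2 network blocks.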
13. 13 of 44
CPU Time Format Conversion
Convert CPU time into CPU instructions:
SQL> DECLARE
2 v_units_nr NUMBER;
3 v_time_nr NUMBER:=0.02; -- time in seconds
4 BEGIN
5 v_units_nr:=
6 DBMS_ODCI.ESTIMATE_CPU_UNITS (v_time_nr)* 1000;
7 DBMS_OUTPUT.PUT_LINE
8 ('Instructions:'||round(v_units_nr));
9 END;
10 /
Instructions:18783086
Function output is in
thousands of instructions
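When a single call is too fast to register in hundredths of a second, the CPU time can be averaged over many calls. This is a sketch of that alternative, not the snapshot method shown earlier. The run count of 100 and the variable names are assumptions made for illustration; DBMS_UTILITY.GET_CPU_TIME returns hundredths of a second.

```sql
SET SERVEROUTPUT ON
DECLARE
  c_runs_nr  CONSTANT PLS_INTEGER := 100;  -- assumed sample size
  v_start_nr NUMBER;                       -- CPU time before the loop (1/100 s)
  v_secs_nr  NUMBER;                       -- average CPU seconds per call
  v_out_tx   VARCHAR2(256);
BEGIN
  v_out_tx := f_getdeptinfo_tx(10);        -- warm-up call, not measured

  v_start_nr := DBMS_UTILITY.GET_CPU_TIME;
  FOR i IN 1 .. c_runs_nr LOOP
    v_out_tx := f_getdeptinfo_tx(10);
  END LOOP;

  -- Convert hundredths of a second to seconds, then average per call.
  v_secs_nr := (DBMS_UTILITY.GET_CPU_TIME - v_start_nr) / 100 / c_runs_nr;

  DBMS_OUTPUT.PUT_LINE('Seconds per call: ' || v_secs_nr);
  -- ESTIMATE_CPU_UNITS returns thousands of instructions, hence * 1000.
  DBMS_OUTPUT.PUT_LINE('Instructions: ' ||
    ROUND(DBMS_ODCI.ESTIMATE_CPU_UNITS(v_secs_nr) * 1000));
END;
/
```

Calling the function from PL/SQL rather than from SQL skips the SQL-to-PL/SQL context switch, so the result is an approximation of the per-row cost inside a query.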
14. 14 of 44
Final Step
Associate real statistics:
ASSOCIATE STATISTICS WITH FUNCTIONS f_getDeptInfo_tx
DEFAULT COST (
18783086, -- CPU instructions
6, -- local IO
2); -- network
16. 16 of 44
Problem
Task:
Multiple functions in the same package.
Need to associate different statistics with each of them.
Problem:
No syntax to hard code statistics using the
PACKAGE.FUNCTION format.
… but you can hardcode statistics to the whole package (i.e. all
functions will share the same numbers)
… via: ASSOCIATE STATISTICS WITH PACKAGES <name>
Solution:
ODCI object type interface!
17. 17 of 44
Sample Package
CREATE PACKAGE perf_pkg IS
FUNCTION f_heavy_tx (i_deptno NUMBER) RETURN VARCHAR2;
FUNCTION f_light_tx (i_empno NUMBER) RETURN VARCHAR2;
FUNCTION f_medium_tx (i_name VARCHAR2) RETURN VARCHAR2;
END;
CREATE OR REPLACE PACKAGE BODY perf_pkg is
FUNCTION f_heavy_tx (i_deptno NUMBER) RETURN VARCHAR2 IS
BEGIN RETURN 'heavy:'||i_deptno; END;
FUNCTION f_light_tx (i_empno NUMBER) RETURN VARCHAR2 IS
BEGIN RETURN 'light:'||i_empno; END;
FUNCTION f_medium_tx (i_name VARCHAR2) RETURN VARCHAR2 IS
BEGIN RETURN initcap(i_name); END;
END;
18. 18 of 44
Key Discovery
ODCI interface:
Does not care about the names of function parameters, but does
care about their datatypes:
You need one ODCIStatsFunctionCost overload for every distinct combination of input datatypes.
In this case:
2 functions with NUMBER inputs
1 function with VARCHAR2 input
19. 19 of 44
Object Type (1)
CREATE OR REPLACE TYPE function_stat_oty AS OBJECT (
dummy_attribute NUMBER,
STATIC FUNCTION ODCIGetInterfaces (p_interfaces OUT sys.odciobjectlist)
RETURN NUMBER,
STATIC FUNCTION ODCIStatsFunctionCost
(p_func_info IN sys.odcifuncinfo,
p_cost OUT sys.odcicost,
p_args IN sys.odciargdesclist,
i_single_nr IN NUMBER,
p_env IN sys.odcienv) RETURN NUMBER,
STATIC FUNCTION ODCIStatsFunctionCost
(p_func_info IN sys.odcifuncinfo,
p_cost OUT sys.odcicost,
p_args IN sys.odciargdesclist,
i_single_tx IN varchar2,
p_env IN sys.odcienv) RETURN NUMBER
)
One function for each
datatype permutation
20. 20 of 44
Object Type (2)
CREATE OR REPLACE TYPE BODY function_stat_oty as
STATIC FUNCTION ODCIGetInterfaces
(p_interfaces OUT sys.odciobjectlist)
RETURN NUMBER IS
BEGIN
p_interfaces := sys.odciobjectlist(
sys.odciobject ('sys', 'odcistats2')
);
RETURN odciconst.success;
END odcigetinterfaces;
Required by ODCI
21. 21 of 44
Object Type (3)
STATIC FUNCTION ODCIStatsFunctionCost
(p_func_info IN sys.odcifuncinfo,
p_cost OUT sys.odcicost,
p_args IN sys.odciargdesclist,
i_single_nr IN NUMBER,
p_env IN sys.odcienv
) RETURN NUMBER IS
BEGIN
IF LOWER(p_func_info.methodname) LIKE '%heavy%' THEN
p_cost := sys.odcicost
(cpucost=>NULL,
iocost=>1000,
networkcost=>NULL,
indexcostinfo=>NULL);
END IF;
RETURN odciconst.success;
END;
p_func_info (sys.odcifuncinfo) is an object type containing:
- ObjectSchema
- ObjectName – name of package
or standalone function
- MethodName – name of
packaged function
- Flags
22. 22 of 44
Object Type (4)
STATIC FUNCTION ODCIStatsFunctionCost
(p_func_info IN sys.odcifuncinfo,
p_cost OUT sys.odcicost,
p_args IN sys.odciargdesclist,
i_single_tx IN VARCHAR2,
p_env IN sys.odcienv
) RETURN NUMBER IS
BEGIN
IF LOWER(p_func_info.methodname) LIKE '%medium%' THEN
p_cost := sys.odcicost(NULL, 10, NULL, NULL);
END IF;
RETURN odciconst.success;
END;
END;
Second permutation
23. 23 of 44
Test Case
SQL> ASSOCIATE STATISTICS WITH PACKAGES perf_pkg
2 USING function_stat_oty;
SQL> SET AUTOTRACE ON EXPLAIN
SQL> SELECT count(*) FROM emp
2 WHERE perf_pkg.f_heavy_tx(empno)='A'
3 OR perf_pkg.f_light_tx(deptno)='B'
4 OR perf_pkg.f_medium_tx(job)='C';
COUNT(*)
----------
0
Starting from HEAVY
24. 24 of 44
Impact
Execution Plan
----------------------------------------------------------
Plan hash value: 2083865914
----------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 15 | 13863 (0)|
| 1 | SORT AGGREGATE | | 1 | 15 | |
|* 2 | TABLE ACCESS FULL| EMP | 1 | 15 | 13863 (0)|
----------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("PERF_PKG"."F_LIGHT_TX"("DEPTNO")='B' OR
"PERF_PKG"."F_MEDIUM_TX"("JOB")='C' OR
"PERF_PKG"."F_HEAVY_TX"("EMPNO")='A')
… but CBO started
from LIGHT!
26. 26 of 44
Issue
Task:
Use Object Collections as a part of a SQL statement with the
TABLE clause.
Problem:
Oracle’s default cardinality of the collection causes the CBO
to make bad decisions.
27. 27 of 44
Test Case
-- create table
CREATE TABLE inlist_tab AS
SELECT object_id, created, object_type
FROM all_objects
WHERE object_id IS NOT NULL;
ALTER TABLE inlist_tab
ADD CONSTRAINT inlist_tab_pk PRIMARY KEY (object_id) USING INDEX;
BEGIN
dbms_stats.gather_table_stats(user,'INLIST_TAB');
END;
-- create object collection
CREATE TYPE id_tt IS TABLE OF NUMBER;
28. 28 of 44
Problem Illustration
SELECT /*+ gather_plan_statistics */ MAX(created)
FROM inlist_tab
WHERE object_id IN (SELECT t.column_value
FROM TABLE(id_tt(100,101)) t)
-- run DBMS_XPLAN.DISPLAY_CURSOR
-----------------------------------------------------------------------
|Id | Operation |Name |E-Rows|A-Rows
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | 1
| 1 | SORT AGGREGATE | | 1| 1
|*2 | HASH JOIN | | 8168| 2
| 3 | COLLECTION ITERATOR CONSTRUCTOR FETCH| | 8168| 2
| 4 | TABLE ACCESS FULL |INLIST_TAB| 29885| 89761
-----------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=VALUE(KOKBF$))
Only 2 objects…
… cause full-table scan!
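The E-Rows and A-Rows columns above come from DBMS_XPLAN.DISPLAY_CURSOR. This is a minimal sketch of the call, assuming it is issued in the same session immediately after the hinted statement and that the session has access to the underlying V$ views.

```sql
-- SERVEROUTPUT must be off; otherwise SQL*Plus runs its own DBMS_OUTPUT
-- call after each statement and that call becomes the "last" cursor.
SET SERVEROUTPUT OFF

-- Run the query with the gather_plan_statistics hint first, then:
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Passing NULL for the SQL ID and child number selects the last statement executed by the session; 'ALLSTATS LAST' shows estimated and actual row counts for its most recent execution.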
29. 29 of 44
Possible Options
Hints:
CARDINALITY hint – manual cardinality override
Pro: Simple
Con: Hardcoded
DYNAMIC_SAMPLING – let Oracle check the data
Pro: Avoid hard coding
Con: Involves extra SQL overhead, while PL/SQL already knows
how many objects are in the collection
30. 30 of 44
Impact of Hints
SELECT /*+ gather_plan_statistics */ MAX(created)
FROM inlist_tab
WHERE object_id IN (
SELECT /*+ dynamic_sampling (t 2) */ t.column_value
-- SELECT /*+ cardinality (t 2) */ t.column_value
FROM TABLE(id_tt(227011,227415)) t)
--------------------------------------------------------------------------
|Id|Operation |Name |E-Rows |A-Rows
--------------------------------------------------------------------------
| 0|SELECT STATEMENT | | | 1
| 1| SORT AGGREGATE | | 1 | 1
| 2| NESTED LOOPS | | | 2
| 3| NESTED LOOPS | | 2 | 2
| 4| COLLECTION ITERATOR CONSTRUCTOR FETCH| | 2 | 2
|*5| INDEX UNIQUE SCAN |INLIST_TAB_PK| 1 | 2
| 6| TABLE ACCESS BY INDEX ROWID |INLIST_TAB | 1 | 2
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("OBJECT_ID"=VALUE(KOKBF$))
Using index!
31. 31 of 44
+ Extra Option - ODCI
ODCI interface:
ODCIStatsTableFunction method
It can work only with functions
… so if you need to use a default constructor, you must create a
“transmitter” function that would have associated statistics:
CREATE OR REPLACE FUNCTION MyCard(i_tt id_tt)
RETURN id_tt IS
BEGIN
RETURN i_tt;
END;
32. 32 of 44
Object Type (1)
CREATE TYPE MyCard_OT AS OBJECT (
dummy_attribute NUMBER,
STATIC FUNCTION ODCIGetInterfaces
(p_interfaces OUT SYS.ODCIObjectList)
RETURN NUMBER,
STATIC FUNCTION ODCIStatsTableFunction (
p_function IN SYS.ODCIFuncInfo,
p_stats OUT SYS.ODCITabFuncStats,
p_args IN SYS.ODCIArgDescList,
i_tt IN id_tt)
RETURN NUMBER
);
Object collection as input
33. 33 of 44
Object Type (2)
CREATE TYPE BODY MyCard_OT AS
STATIC FUNCTION ODCIGetInterfaces ...
STATIC FUNCTION ODCIStatsTableFunction
(p_function IN SYS.ODCIFuncInfo,
p_stats OUT SYS.ODCITabFuncStats,
p_args IN SYS.ODCIArgDescList,
i_tt IN id_tt) RETURN NUMBER IS
BEGIN
p_stats := SYS.ODCITabFuncStats(i_tt.COUNT);
RETURN ODCIConst.success;
END ODCIStatsTableFunction;
END;
Set
statistics
34. 34 of 44
Impact of Statistics
ASSOCIATE STATISTICS WITH FUNCTIONS MyCard USING mycard_ot;
SELECT /*+ gather_plan_statistics*/ MAX(created)
FROM inlist_tab
WHERE object_id IN (SELECT t.column_value
FROM table(MyCard(id_tt(100,101))) t)
---------------------------------------------------------------------
|Id|Operation |Name |E-Rows|A-Rows
---------------------------------------------------------------------
| 0|SELECT STATEMENT | | | 1
| 1| SORT AGGREGATE | | 1| 1
| 2| NESTED LOOPS | | | 2
| 3| NESTED LOOPS | | 2| 2
| 4| COLLECTION ITERATOR PICKLER FETCH|MYCARD | 2| 2
|*5| INDEX UNIQUE SCAN |INLIST_TAB_PK| 1| 2
| 6| TABLE ACCESS BY INDEX ROWID |INLIST_TAB | 1| 2
---------------------------------------------------------------------
Predicate Information (identified by operation id):
--------------------------------------------------
5 - access("OBJECT_ID"=VALUE(KOKBF$))
Valid stats
Constructor is wrapped
36. 36 of 44
Issue
Default:
1% ~ if you compare a function to a literal, the CBO assumes that only
1 row in 100 satisfies the condition.
Problem:
If your function's results are heavily skewed, the CBO's predicate analysis will
generate bad execution plans.
Solutions:
ODCI object method can adjust selectivity setting.
You can also hard code selectivity (if data is static):
ASSOCIATE STATISTICS WITH FUNCTIONS f_isSenior_yn
DEFAULT selectivity 50;
37. 37 of 44
Test Case
CREATE OR REPLACE FUNCTION f_isSenior_yn (i_job_id VARCHAR2)
RETURN VARCHAR2 IS
BEGIN
IF i_job_id IN ('AD_PRES','AD_VP') THEN
RETURN 'Y';
ELSE
RETURN 'N';
END IF;
END;
SQL> select f_isSenior_yn(job_id) isSenior_yn, count(*)
2 from hr.employees
3 group by f_isSenior_yn(job_id);
ISSENIOR_YN COUNT(*)
----------- ----------
Y 3
N 104
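The counts above can be turned into the percentages that a selectivity setting expects. This is a sketch that uses the analytic RATIO_TO_REPORT function over the same grouping; the column alias pct is chosen here for illustration.

```sql
-- Share of rows per function result, as a percentage of all employees.
SELECT f_isSenior_yn(job_id) AS isSenior_yn,
       COUNT(*) AS cnt,
       ROUND(100 * RATIO_TO_REPORT(COUNT(*)) OVER (), 1) AS pct
  FROM hr.employees
 GROUP BY f_isSenior_yn(job_id);
```

With the counts shown (3 and 104 out of 107), the percentages come to roughly 2.8 for 'Y' and 97.2 for 'N', both far from the 1% default.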
38. 38 of 44
Problem Illustration
SELECT /*+ gather_plan_statistics*/
e.*,
d.department_name
FROM hr.employees e,
hr.departments d
WHERE e.department_id = d.department_id
AND f_isSenior_yn(e.job_id)='N'
--------------------------------------------------------------------
|Id|Operation |Name |E-Rows|A-Rows|Buffers|Used-Mem |
--------------------------------------------------------------------
| 0|SELECT STATEMENT | | | 103| 14| |
|*1| HASH JOIN | | 1| 103| 14| 901K(0)|
|*2| TABLE ACCESS FULL|EMPLOYEES | 1| 104| 7| |
| 3| TABLE ACCESS FULL|DEPARTMENTS| 1| 27| 7| |
--------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("EMPLOYEES"."DEPARTMENT_ID"="DEPARTMENTS"."DEPARTMENT_ID")
2 - filter("F_ISSENIOR_YN"("EMPLOYEES"."JOB_ID")='N')
1% of 107 is way off!
Heavy memory usage
39. 39 of 44
Object Type (1)
CREATE TYPE MySelect_OT AS OBJECT (
dummy_attribute NUMBER,
STATIC FUNCTION ODCIGetInterfaces
(p_interfaces OUT SYS.ODCIObjectList) RETURN NUMBER,
STATIC FUNCTION ODCIStatsSelectivity (
p_pred_info IN SYS.ODCIPredInfo,
p_selectivity OUT NUMBER,
p_args IN SYS.ODCIArgDescList,
p_start IN VARCHAR2,
p_stop IN VARCHAR2,
i_job IN VARCHAR2,
p_env IN SYS.ODCIEnv)
RETURN NUMBER
);
40. 40 of 44
Object Type (2)
CREATE or replace TYPE BODY MySelect_OT AS
STATIC FUNCTION ODCIGetInterfaces …;
STATIC FUNCTION ODCIStatsSelectivity (
p_pred_info IN SYS.ODCIPredInfo,
p_selectivity OUT NUMBER,
p_args IN SYS.ODCIArgDescList,
p_start IN VARCHAR2,
p_stop IN VARCHAR2,
i_job IN VARCHAR2,
p_env IN SYS.ODCIEnv
) RETURN NUMBER IS
BEGIN
if p_start='Y' then
p_selectivity:=3;
else
p_selectivity:=97;
end if;
RETURN ODCIConst.success;
END ODCIStatsSelectivity;
END;
START is used
for '=' comparison
START and STOP are used
for 'BETWEEN' comparison
41. 41 of 44
Impact of Statistics
ASSOCIATE STATISTICS WITH FUNCTIONS f_isSenior_yn USING MySelect_OT;
SELECT /*+ gather_plan_statistics*/ e.*,
d.department_name
FROM hr.employees e,
hr.departments d
WHERE e.department_id = d.department_id
AND f_isSenior_yn(e.job_id)='N'
-----------------------------------------------------------------------------
|Id|Operation |Name |E-Rows|A-Rows|Buffers|Used-Mem|
-----------------------------------------------------------------------------
| 0|SELECT STATEMENT | | | 103| 9| |
| 1| MERGE JOIN | | 103| 103| 9| |
| 2| TABLE ACCESS BY INDEX ROWID|DEPARTMENTS| 27| 27| 2| |
| 3| INDEX FULL SCAN |DEPT_ID_PK | 27| 27| 1| |
|*4| SORT JOIN | | 104| 103| 7|14336(0)|
|*5| TABLE ACCESS FULL |EMPLOYEES | 104| 104| 7| |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("EMPLOYEES"."DEPARTMENT_ID"="DEPARTMENTS"."DEPARTMENT_ID")
filter("EMPLOYEES"."DEPARTMENT_ID"="DEPARTMENTS"."DEPARTMENT_ID")
5 - filter("F_ISSENIOR_YN"("EMPLOYEES"."JOB_ID")='N')
Different execution plan!
Much less memory
42. 42 of 44
Summary
Keeping Oracle Statistics up to date is important
… otherwise the CBO can get confused.
Manual management of PL/SQL function statistics is non-trivial.
… so it should be used only when needed.
Using PL/SQL functions within SQL should be tightly
controlled
… because usually statistics are the least of your problems!
43. 43 of 44
Contact Information
Michael Rosenblum – mrosenblum@dulcian.com
Dulcian, Inc. website - www.dulcian.com
Blog: wonderingmisha.blogspot.com
Available NOW:
Oracle PL/SQL Performance Tuning Tips & Techniques
44. 44 of 44
Save the Date
COLLABORATE 17 registration will open on Thursday, October 27.
Call for Speakers
Submit your session presentation! The Call for Speakers is open until Friday,
October 7
collaborate.ioug.org