1. How to “tune” a query
Thomas Kyte
http://asktom.oracle.com/
2. Agenda
• What single thing has the most impact on
performance and scalability?
• Doing It Wrong
• Doing It Still Wrong
• Tuning A Query
• Adding Information
• Connection Management
6. Some Basic Math
• You write a routine that executes in 1ms
(1/1000th of a second)
• You have 5,000,000 things to process
• That is 5,000 seconds!
• That is 83 minutes!
• That is almost 1.5 hours!!!
• Oh, but we’ll do parallel threads in our
application…
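The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the slide's numbers; the 8-thread figure and the perfect-scaling assumption behind it are hypothetical and not from the deck.

```python
# Back-of-the-envelope cost of processing rows one at a time.
PER_ROW_MS = 1          # 1 ms per call (from the slide)
ROWS = 5_000_000        # things to process (from the slide)

total_seconds = ROWS * PER_ROW_MS / 1000
total_minutes = total_seconds / 60
total_hours = total_seconds / 3600


def elapsed_minutes(threads: int) -> float:
    """Elapsed minutes with `threads` workers, ASSUMING perfect linear scaling."""
    return total_minutes / threads


print(f"{total_seconds:,.0f} s = {total_minutes:.1f} min = {total_hours:.2f} h")
print(f"8 threads (hypothetical, perfect scaling): {elapsed_minutes(8):.1f} min")
```

Even under the optimistic scaling assumption, the per-row cost is still paid 5,000,000 times; threads only spread it out.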
7. Would you like to load
and validate one row
5,000,000 times
or load and validate
5,000,000 rows once?
8. create or replace procedure slow_by_slow
as
begin
    for x in (select rowid rid, object_name
                from t t_slow_by_slow)
    loop
        x.object_name := substr(x.object_name,2)
                         ||substr(x.object_name,1,1);
        update t
           set object_name = x.object_name
         where rowid = x.rid;
    end loop;
end;
/
9. create or replace procedure bulk
as
    type ridArray is table of rowid;
    type onameArray is table
         of t.object_name%type;
    cursor c is select rowid rid, object_name
                  from t t_bulk;
    l_rids ridArray;
    l_onames onameArray;
    N number := 100;
begin
    open c;
    loop
        fetch c bulk collect
         into l_rids, l_onames limit N;
        for i in 1 .. l_rids.count
        loop
            l_onames(i) := substr(l_onames(i),2)
                           ||substr(l_onames(i),1,1);
        end loop;
        forall i in 1 .. l_rids.count
            update t
               set object_name = l_onames(i)
             where rowid = l_rids(i);
        exit when c%notfound;
    end loop;
    close c;
end;
12. Method                  CPU units of time (% of Slow by Slow)
    Slow by Slow            495
    Bulk                    193 (39%)
    Single SQL statement     91 (18%)  *near order of magnitude
    CTAS                     28 (5%)   *two orders of magnitude
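The "Single SQL statement" row refers to a statement that is not shown in this transcript. A minimal sketch of what such a statement could look like, assuming it applies the same substr rotation as the two procedures, is below. SQLite (through Python's sqlite3) stands in for Oracle, and the table contents are made up for illustration. It only shows that the row-by-row loop and the single UPDATE produce the same data; it says nothing about the timings in the table.

```python
import sqlite3


def make_table(names):
    """Create an in-memory table t(object_name) with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (object_name text)")
    conn.executemany("insert into t (object_name) values (?)",
                     [(n,) for n in names])
    return conn


def slow_by_slow(conn):
    """Row by row: fetch each row, rotate the name in the client, update it."""
    rows = conn.execute("select rowid, object_name from t").fetchall()
    for rid, name in rows:
        conn.execute("update t set object_name = ? where rowid = ?",
                     (name[1:] + name[:1], rid))


def single_sql(conn):
    """Set based: one UPDATE does the same rotation for every row."""
    conn.execute("""
        update t
           set object_name = substr(object_name, 2) || substr(object_name, 1, 1)
    """)


def contents(conn):
    return [r[0] for r in conn.execute("select object_name from t order by rowid")]


names = ["DUAL", "EMP", "DEPT", "X"]

a = make_table(names)
slow_by_slow(a)

b = make_table(names)
single_sql(b)

print(contents(a))
print(contents(b))
```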
16. Code…
• Write as much code:
– As you have to
– But as little as you can…
• Think in SETS
• Use (really use – not just ‘use’) SQL
Begin
    For x in ( select * from table@remote_db )
    Loop
        Insert into table ( c1, c2, … )
        values ( x.c1, x.c2,… );
    End loop;
End;

Insert into table (c1,c2,…)
select c1,c2,…. From table@remote_db

Insert into table (c1,c2,…)
select c1,c2,…. From table@remote_db
LOG ERRORS ( some_variable )
REJECT LIMIT UNLIMITED;
… code to handle errors
for tag some_variable …
17. More Code = More Bugs
Less Code = Less Bugs
• Always look at the procedural code and ask yourself
“is there a set based way to do this algorithm”
– For example …
18. insert into t ( .... )
select EMPNO, STATUS_DATE, ....
  from t1, t2, t3, t4, ....
 where ....;
loop
    delete from t
     where (EMPNO,STATUS_DATE)
        in ( select EMPNO,
                    min(STATUS_DATE)
               from t
              group by EMPNO
             having count(1) > 1 );
    exit when sql%rowcount = 0;
end loop;
19. What the loop does:
For any EMPNO with more than one row, remove the rows with the oldest STATUS_DATE, and repeat until a pass deletes nothing.
Additionally: if the last remaining rows for an EMPNO all have the same STATUS_DATE, remove them all.
Walking through the loop on sample data:
EMPNO   STATUS_DATE
------- ------------
1       01-jan-2009
1       15-jun-2009
1       01-sep-2009
…
1000    01-feb-2009
1000    22-aug-2009
1000    10-oct-2009
1000    10-oct-2009
Each pass deletes the rows matching the subquery result, shown as (EMPNO, min(STATUS_DATE), count):
Pass 1: (1,01-jan-2009,3) (1000,01-feb-2009,4)
Pass 2: (1,15-jun-2009,2) (1000,22-aug-2009,3)
Pass 3: (1000,10-oct-2009,2), which deletes both remaining rows for EMPNO 1000
Result:
EMPNO   STATUS_DATE
------- ------------
1       01-sep-2009
…
20. insert /*+ APPEND */ into t ( .... )
select EMPNO, STATUS_DATE, ......
  from ( select EMPNO, STATUS_DATE, .... ,
                max(STATUS_DATE)
                  OVER ( partition by EMPNO ) max_sd,
                count(EMPNO)
                  OVER ( partition by EMPNO,STATUS_DATE ) cnt
           from t1, t2, t3, t4, …
          where … )
 where STATUS_DATE = max_sd
   and cnt = 1;
• This was a data warehouse load (load 2-3-4 times the data you want,
then delete? Ouch)
• It was wrong. Procedural code is no easier to understand than set-based
code; documentation is key
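The difference between the delete loop and the analytic rewrites can be reproduced on the sample rows from the walkthrough. The sketch below uses SQLite through Python's sqlite3 as a stand-in for Oracle (an assumption; it needs a SQLite build with window functions and row values, 3.25 or later). Dates are stored as ISO strings so min and max sort correctly, and the "…" rows of the sample are left out.

```python
import sqlite3

# Sample rows from the walkthrough, dates as ISO strings.
ROWS = [
    (1, "2009-01-01"), (1, "2009-06-15"), (1, "2009-09-01"),
    (1000, "2009-02-01"), (1000, "2009-08-22"),
    (1000, "2009-10-10"), (1000, "2009-10-10"),
]


def load():
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (empno integer, status_date text)")
    conn.executemany("insert into t values (?, ?)", ROWS)
    return conn


def delete_loop(conn):
    """Procedural version: delete the oldest row per EMPNO until nothing is deleted."""
    while True:
        cur = conn.execute("""
            delete from t
             where (empno, status_date) in
                   (select empno, min(status_date)
                      from t
                     group by empno
                    having count(*) > 1)""")
        if cur.rowcount == 0:
            break
    return conn.execute(
        "select empno, status_date from t order by empno").fetchall()


def keep_max(conn, untied_only):
    """Set-based version: keep rows at max(status_date) per EMPNO.
    With untied_only=True, also require cnt = 1, which mimics the loop."""
    extra = "and cnt = 1" if untied_only else ""
    return conn.execute(f"""
        select empno, status_date
          from (select empno, status_date,
                       max(status_date) over (partition by empno) as max_sd,
                       count(*) over (partition by empno, status_date) as cnt
                  from t)
         where status_date = max_sd {extra}
         order by empno""").fetchall()


loop_result = delete_loop(load())
mimic_result = keep_max(load(), untied_only=True)
max_result = keep_max(load(), untied_only=False)

print("delete loop      :", loop_result)
print("max_sd and cnt=1 :", mimic_result)
print("max_sd only      :", max_result)
```

On this data the loop and the `cnt = 1` query both lose EMPNO 1000 entirely, while the `max_sd` query alone keeps its two tied rows.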
21. /* Load table t using history tables. History tables have
multiple records per employee. We need to keep the
history records for each employee that have the maximum
status date for that employee. We do that by computing
the max(status_date) for each employee (partition by EMPNO,
finding max(status_date)) and keeping only the records such
that the status_date for that record = max(status_date)
for all records with the same empno */
insert /*+ APPEND */ into t ( .... )
select EMPNO, STATUS_DATE, ......
  from ( select EMPNO, STATUS_DATE, .... ,
                max(STATUS_DATE)
                  OVER ( partition by EMPNO ) max_sd
           from t1, t2, t3, t4, …
          where … )
 where STATUS_DATE = max_sd;
22. /* Load table t using history tables. History tables have
multiple records per employee. We need to keep the
history records for each employee that have the maximum
status date for that employee. We do that by numbering
each history record by empno, keeping only the record
that is first for an EMPNO after sorting by
status_date desc. REALIZE: this is not deterministic if
there are two status_dates that are the same for a given
employee! */
insert /*+ APPEND */ into t ( .... )
select EMPNO, STATUS_DATE, ......
  from ( select EMPNO, STATUS_DATE, .... ,
                row_number() OVER ( partition by EMPNO
                                    order by STATUS_DATE desc ) rn
           from t1, t2, t3, t4, …
          where … )
 where rn=1;
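The warning about ties can be seen on a small example. This is a sketch using SQLite through Python's sqlite3 as a stand-in for Oracle (an assumption, requiring window-function support); the `hist` table and its `note` column are hypothetical. The max() form keeps every row tied at the latest date, while the row_number() form keeps exactly one row per EMPNO, and which of the tied rows survives is not specified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table hist (empno integer, status_date text, note text)")
conn.executemany("insert into hist values (?, ?, ?)", [
    (1, "2009-06-15", "a"), (1, "2009-09-01", "b"),
    (1000, "2009-10-10", "c"), (1000, "2009-10-10", "d"),  # tied on the latest date
])

# max() per EMPNO: all rows tied at the maximum date are kept.
by_max = conn.execute("""
    select empno, status_date, note
      from (select empno, status_date, note,
                   max(status_date) over (partition by empno) as max_sd
              from hist)
     where status_date = max_sd
     order by empno, note""").fetchall()

# row_number() per EMPNO: exactly one row is kept; for ties the choice is arbitrary.
by_rn = conn.execute("""
    select empno, status_date, note
      from (select empno, status_date, note,
                   row_number() over (partition by empno
                                      order by status_date desc) as rn
              from hist)
     where rn = 1
     order by empno""").fetchall()

print("max_sd :", by_max)
print("rn = 1 :", by_rn)
```

Adding a unique tie-breaker column to the `order by` inside `row_number()` is one way to make the choice repeatable, if the data has such a column.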
24. Use PL/SQL constructs only when SQL cannot do it
• Another coding ‘technique’ I see frequently:
• The developer did not want to “burden” the database
with a join
For a in ( select * from t1 )
Loop
    For b in ( select * from t2
                where t2.key = a.key )
    Loop
        For c in ( select * from t3
                    where t3.key = b.key )
        Loop
            …
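To make the cost of the do-it-yourself join concrete, the sketch below runs the nested loops and an equivalent single join side by side and counts how many statements each one sends to the database. SQLite through Python's sqlite3 stands in for Oracle; the tables, the column name `k` (in place of `key`), and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t1 (k integer, a text);
    create table t2 (k integer, b text);
    create table t3 (k integer, c text);
    insert into t1 values (1, 'a1'), (2, 'a2'), (3, 'a3');
    insert into t2 values (1, 'b1'), (1, 'b1x'), (2, 'b2');
    insert into t3 values (1, 'c1'), (2, 'c2'), (2, 'c2x');
""")

counter = {"queries": 0}


def run(sql, params=()):
    """Execute one query and count it."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


def procedural():
    """The join written as nested cursor loops, one query per outer row."""
    out = []
    for k1, a in run("select k, a from t1"):
        for k2, b in run("select k, b from t2 where k = ?", (k1,)):
            for (c,) in run("select c from t3 where k = ?", (k2,)):
                out.append((a, b, c))
    return sorted(out)


def set_based():
    """The same result as one join, executed by the database."""
    return sorted(run("""
        select t1.a, t2.b, t3.c
          from t1
          join t2 on t2.k = t1.k
          join t3 on t3.k = t2.k"""))


loop_rows = procedural()
loop_queries = counter["queries"]

counter["queries"] = 0
join_rows = set_based()
join_queries = counter["queries"]

print(loop_queries, "queries for the loops,", join_queries, "for the join")
```

With three rows in t1 the loops already issue seven statements; the count grows with the data, while the join stays at one.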
25. Create or replace function get_xxxx( … ) return …
• Be fearful of functions that start with GET_
• You cannot tune them, they are already tuned
– They use the simplest of SQL
26. Create or replace function get_emp_details
( p_id in number ) return emp_record
As
    l_rec emp_record;
    l_id number;
Begin
    select max(id2)
      into l_id
      from t1
     where id = p_id;
27. (function body continues)
    select first_name, last_name
      into l_rec.first_name, l_rec.last_name
      from t2
     where id = l_id;
28. (function body continues)
    select email_address
      into l_rec.email_address
      from t2
     where id = l_id;
    select title
      into l_title
      from t2
     where id = l_id;
30. • Why was that more than:
select *
  into l_rec
  from t2
 where id = (select max(id2) from t1 where id = p_id);
• And even then, why did it exist at all? They
probably call this function in a loop
– For x in (…)
• For y in (…query based on x…)
– For z in (…query based on x,y…)
32. The Schema Matters
• A Lot!
• Tune this query:
Select DOCUMENT_NAME, META_DATA
from documents
where userid=:x;
• That is about as easy as it gets (the SQL)
• Not too much we can do to rewrite it…
• But we’d like to make it better.
Iot01.sql
Cf.sql
34. ops$tkyte%ORA10GR2> SELECT COUNT(*)
2 FROM T1, T2, T3
3 WHERE T2.order_id = T1.order_id
4 AND T2.service_order_id = T3.service_order_id (+)
5 AND T3.related_service_order_id = TO_NUMBER(:v0);
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 2 |
| 1 | SORT AGGREGATE | | 1 | 65 | |
| 2 | NESTED LOOPS | | 1 | 65 | 2 |
| 3 | NESTED LOOPS | | 1 | 52 | 2 |
| 4 | TABLE ACCESS BY INDEX ROWID| T3 | 1 | 26 | 1 |
|* 5 | INDEX RANGE SCAN | T3_OLS_RS_1 | 1 | | 1 |
| 6 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 26 | 1 |
|* 7 | INDEX UNIQUE SCAN | T2_PK1 | 1 | | |
|* 8 | INDEX UNIQUE SCAN | T1_PK1 | 1 | 13 | |
------------------------------------------------------------------------------
The optimizer is getting smarter than we are…
35. The optimizer is getting smarter than we are…
ops$tkyte%ORA11GR2> CREATE TABLE T1
2 (
3 ORDER_ID NUMBER(18) NOT NULL,
4 ACCOUNT_NO NUMBER(10) NOT NULL,
5 ORDER_NUMBER VARCHAR2(20) NOT NULL,
6 data varchar2(1000)
7 );
Table created.
ops$tkyte%ORA11GR2> ALTER TABLE T1 ADD CONSTRAINT T1_PK1 PRIMARY KEY (ORDER_ID);
Table altered.
36. The optimizer is getting smarter than we are…
ops$tkyte%ORA11GR2> CREATE TABLE T2
2 (
3 SERVICE_ORDER_ID NUMBER(18) NOT NULL,
4 ORDER_ID NUMBER(18) NOT NULL,
5 ORDER_STATUS_ID NUMBER(6) NOT NULL,
6 data varchar2(1000)
7 );
Table created.
ops$tkyte%ORA11GR2> ALTER TABLE T2 ADD CONSTRAINT T2_PK1
2 PRIMARY KEY (SERVICE_ORDER_ID);
Table altered.
ops$tkyte%ORA11GR2> ALTER TABLE T2 ADD CONSTRAINT T2_OSO_FK1
2 FOREIGN KEY (ORDER_ID) REFERENCES T1 (ORDER_ID);
Table altered.
37. The optimizer is getting smarter than we are…
ops$tkyte%ORA11GR2> CREATE TABLE T3
2 (
3 SERVICE_ORDER_ID NUMBER(18) NOT NULL,
4 RELATED_SERVICE_ORDER_ID NUMBER(18),
5 data varchar2(1000)
6 );
Table created.
ops$tkyte%ORA11GR2> ALTER TABLE T3 ADD CONSTRAINT T3_ORDER_PK1
2 PRIMARY KEY (SERVICE_ORDER_ID);
Table altered.
ops$tkyte%ORA11GR2> ALTER TABLE T3 ADD CONSTRAINT T3_OLS_S_FK1
2 FOREIGN KEY (SERVICE_ORDER_ID) REFERENCES T2 (SERVICE_ORDER_ID);
Table altered.
ops$tkyte%ORA11GR2> CREATE INDEX T3_OLS_RS_1
2 ON T3 (RELATED_SERVICE_ORDER_ID);
Index created.
38. ops$tkyte%ORA10GR2> SELECT COUNT(*)
2 FROM T1, T2, T3
3 WHERE T2.order_id = T1.order_id
4 AND T2.service_order_id = T3.service_order_id (+)
5 AND T3.related_service_order_id = TO_NUMBER(:v0);
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 2 |
| 1 | SORT AGGREGATE | | 1 | 65 | |
| 2 | NESTED LOOPS | | 1 | 65 | 2 |
| 3 | NESTED LOOPS | | 1 | 52 | 2 |
| 4 | TABLE ACCESS BY INDEX ROWID| T3 | 1 | 26 | 1 |
|* 5 | INDEX RANGE SCAN | T3_OLS_RS_1 | 1 | | 1 |
| 6 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 26 | 1 |
|* 7 | INDEX UNIQUE SCAN | T2_PK1 | 1 | | |
|* 8 | INDEX UNIQUE SCAN | T1_PK1 | 1 | 13 | |
------------------------------------------------------------------------------
The optimizer is getting smarter than we are…
ops$tkyte%ORA11GR2> SELECT COUNT(*)
2 FROM T1, T2, T3
3 WHERE T2.order_id = T1.order_id
4 AND T2.service_order_id = T3.service_order_id (+)
5 AND T3.related_service_order_id = TO_NUMBER(:v0);
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 26 | | |
|* 2 | INDEX RANGE SCAN| T3_OLS_RS_1 | 1 | 26 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------
39. The optimizer is getting smarter than we are…
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> SELECT COUNT(*)
2 FROM T1, T2, T3
3 WHERE T2.order_id = T1.order_id
4 AND T2.service_order_id = T3.service_order_id (+)
5 AND T3.related_service_order_id = TO_NUMBER(:v0);
• First, it knows the outer join is not necessary
– Where t2.col = t3.col(+) and t3.anything =
‘something’
– Implies the (+) is not necessary
• If the outer join ‘happened’, then t3.anything
would be NULL! And t3.anything =
to_number(:v0) would never be satisfied
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> SELECT COUNT(*)
2 FROM T1, T2, T3
3 WHERE T2.order_id = T1.order_id
4 AND T2.service_order_id = T3.service_order_id
5 AND T3.related_service_order_id = TO_NUMBER(:v0);
40. The optimizer is getting smarter than we are…
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> SELECT COUNT(*)
2 FROM T1, T2, T3
3 WHERE T2.order_id = T1.order_id
4 AND T2.service_order_id = T3.service_order_id (+)
5 AND T3.related_service_order_id = TO_NUMBER(:v0);
• Second, it knows that T1 is not relevant to the query
– Nothing is selected from T1 in the output
– T1(order_id) is the primary key, joined to T2(order_id) – so T2
is “key preserved”
– T2(order_id) is NOT NULL and is a foreign key to T1
– Therefore, when you join T1 to T2 – every row in T2 appears
at least once and at most once in the output
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> SELECT COUNT(*)
2 FROM T2, T3
3 WHERE T2.service_order_id = T3.service_order_id
4 AND T3.related_service_order_id = TO_NUMBER(:v0);
41. The optimizer is getting smarter than we are…
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> SELECT COUNT(*)
2 FROM T1, T2, T3
3 WHERE T2.order_id = T1.order_id
4 AND T2.service_order_id = T3.service_order_id (+)
5 AND T3.related_service_order_id = TO_NUMBER(:v0);
• Lastly, it knows that T2 is not relevant to the query
– Nothing is selected from T2 in the output
– T2(service_order_id) is the primary key, joined to
T3(service_order_id) – so T3 is “key preserved”
– T3(service_order_id) is NOT NULL and is a foreign key to T2
– Therefore, when you join T2 to T3 – every row in T3 appears
at least once and at most once in the output
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> SELECT COUNT(*)
2 FROM T3
3 WHERE T3.related_service_order_id = TO_NUMBER(:v0);
42. The optimizer is getting smarter than we are…
ops$tkyte%ORA11GR2> SELECT COUNT(*)
2 FROM T1, T2, T3
3 WHERE T2.order_id = T1.order_id
4 AND T2.service_order_id = T3.service_order_id (+)
5 AND T3.related_service_order_id = TO_NUMBER(:v0);
Is the same as…
ops$tkyte%ORA11GR2> SELECT COUNT(*)
2 FROM T3
3 WHERE T3.related_service_order_id = TO_NUMBER(:v0);
But only because of the constraints in place…
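The claim that the two queries are equivalent only because of the constraints can be checked on a toy schema. The sketch below uses SQLite through Python's sqlite3 as a stand-in for Oracle (an assumption). It compares result counts only; it does not show the optimizer's transformation, and SQLite is not claimed to perform the same join elimination. The tables are cut down to the key columns and the data is made up.

```python
import sqlite3

# The original three-table query, with (+) written as a LEFT JOIN to T3.
FULL = """select count(*)
            from t1
            join t2 on t2.order_id = t1.order_id
            left join t3 on t2.service_order_id = t3.service_order_id
           where t3.related_service_order_id = ?"""
# Step 1: the filter on T3 makes the outer join behave as an inner join.
INNER = """select count(*)
             from t1
             join t2 on t2.order_id = t1.order_id
             join t3 on t2.service_order_id = t3.service_order_id
            where t3.related_service_order_id = ?"""
# Steps 2 and 3: T1 and T2 are eliminated.
T3_ONLY = "select count(*) from t3 where related_service_order_id = ?"


def build(enforce_fk):
    conn = sqlite3.connect(":memory:")
    conn.execute(f"pragma foreign_keys = {'on' if enforce_fk else 'off'}")
    conn.executescript("""
        create table t1 (order_id integer primary key);
        create table t2 (service_order_id integer primary key,
                         order_id integer not null references t1 (order_id));
        create table t3 (service_order_id integer primary key
                             references t2 (service_order_id),
                         related_service_order_id integer);
        insert into t1 values (10), (20);
        insert into t2 values (100, 10), (200, 20);
        insert into t3 values (100, 7), (200, 7);
    """)
    return conn


def counts(conn, v0):
    return tuple(conn.execute(sql, (v0,)).fetchone()[0]
                 for sql in (FULL, INNER, T3_ONLY))


# With constraints enforced, all three forms agree.
constrained = build(enforce_fk=True)
with_constraints = counts(constrained, 7)

# A T3 row without a T2 parent is rejected while the foreign key is enforced.
try:
    constrained.execute("insert into t3 values (300, 7)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Without enforcement the orphan row gets in, and the T3-only count diverges.
unconstrained = build(enforce_fk=False)
unconstrained.execute("insert into t3 values (300, 7)")
without_constraints = counts(unconstrained, 7)

print("with constraints   :", with_constraints)
print("orphan rejected    :", orphan_rejected)
print("without constraints:", without_constraints)
```

Once an orphan row can exist in T3, the join drops it and the T3-only query counts it, so the rewrite would no longer be safe.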