Beating Oracle's Optimizer at its own game
by
Rainer Schuettengruber
Raiffeisen Informatik Centre Steiermark
about me
• IT employee since 1998
• main focus on Oracle databases
• various positions as Oracle DBA(DMA), developer, devops
• currently employed as Exadata Administrator
• active member of pyGRAZ
agenda
• the double-edged sword
• the cure
• a bit machine learning
• gluing it all together
• recap
• questions and answers
the double-edged sword
MASTER_TAB: 1000000 rows
PART_TAB, rows per partition:
P1: 1000000
P2: 1000000
P3: 1000000
P4: 1000000
P5: 1000
the double-edged sword
select distinct a.text
from sample.master_tab a inner join
sample.part_tab b
on (a.id = b.master_id)
where b.part_id = 1;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Pstart| Pstop |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | |
| 1 | HASH UNIQUE | | 1 | | |
|* 2 | HASH JOIN | | 1000K| | |
| 3 | PARTITION RANGE SINGLE | | 1000K| 1 | 1 |
|* 4 | TABLE ACCESS STORAGE FULL| PART_TAB | 1000K| 1 | 1 |
| 5 | TABLE ACCESS STORAGE FULL | MASTER_TAB | 1000K| | |
scanning any one of the partitions P1 through P4:
hash join is the operation of choice, since both sets have equal cardinality
the double-edged sword
select distinct a.text
from sample.master_tab a inner join
sample.part_tab b
on (a.id = b.master_id)
where b.part_id = 5;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Pstart| Pstop |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | |
| 1 | HASH UNIQUE | | 1 | | |
| 2 | NESTED LOOPS | | 1000 | | |
| 3 | NESTED LOOPS | | 1000 | | |
| 4 | PARTITION RANGE SINGLE| | 1000 | 5 | 5 |
|* 5 | TABLE ACCESS STORAGE | PART_TAB | 1000 | 5 | 5 |
|* 6 | INDEX UNIQUE SCAN | PK_MASTER_TAB | 1 | | |
| 7 | TABLE ACCESS BY INDEX | MASTER_TAB | 1 | | |
scanning partition P5:
index access with nested loops, since partition P5 contains only 1000 rows
the double-edged sword
select distinct a.text
from sample.master_tab a inner join
sample.part_tab b
on (a.id = b.master_id)
where b.part_id = :1;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Pstart| Pstop |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | |
| 1 | HASH UNIQUE | | 1 | | |
|* 2 | HASH JOIN | | 800K| | |
| 3 | PARTITION RANGE SINGLE | | 800K| KEY | KEY |
|* 4 | TABLE ACCESS STORAGE FULL| PART_TAB | 800K| KEY | KEY |
| 5 | TABLE ACCESS STORAGE FULL | MASTER_TAB | 1000K| | |
bind variables:
hash join, since the optimizer cannot determine which partitions are scanned
the double-edged sword
• two different execution plans for the same query
• this does harm once bind variables appear on the scene
• addressed with Adaptive Cursor Sharing
• however, the optimizer might stick to the wrong execution plan, leading to performance deterioration
the double-edged sword
create or replace procedure sample.batch as
  cursor cur_part_scan (p_part_id number) is
    select distinct a.text
      from master_tab a inner join
           part_tab b
        on (a.id = b.master_id)
     where b.part_id = p_part_id;
  result     master_tab.text%type;
  start_time timestamp;
  end_time   timestamp;
begin
  start_time := current_timestamp;
  for i in reverse 1 .. 5 loop
    open cur_part_scan(i);
    fetch cur_part_scan into result;
    close cur_part_scan;
  end loop;
  end_time := current_timestamp;
  dbms_output.put_line('duration of the batch run : ' || to_char(end_time - start_time));
end batch;
scans partitions P5 down to P1, i.e. in reverse order
the double-edged sword
SQL>
SQL> exec sample.batch
duration of the batch run : +000000000 00:00:30.317911000
PL/SQL procedure successfully completed.
Elapsed: 00:00:30.37
execution plan based on data in partition P5
the double-edged sword
create or replace procedure sample.batch as
  cursor cur_part_scan (p_part_id number) is
    select distinct a.text
      from master_tab a inner join
           part_tab b
        on (a.id = b.master_id)
     where b.part_id = p_part_id;
  result     master_tab.text%type;
  start_time timestamp;
  end_time   timestamp;
begin
  start_time := current_timestamp;
  for i in 1 .. 5 loop
    open cur_part_scan(i);
    fetch cur_part_scan into result;
    close cur_part_scan;
  end loop;
  end_time := current_timestamp;
  dbms_output.put_line('duration of the batch run : ' || to_char(end_time - start_time));
end batch;
scans partition P1 to P5,
starting with P1
the double-edged sword
SQL>
SQL> exec sample.batch
duration of the batch run : +000000000 00:00:14.356114000
PL/SQL procedure successfully completed.
Elapsed: 00:00:14.38
execution plan based on data in partition P1
considering the total batch duration, an execution plan based on
partition P1 is acceptable
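Why the loop order matters can be sketched with a toy model. This is not how Oracle costs a plan: the work units, the 10000-row threshold and the per-probe cost below are assumptions made purely for illustration. Only the row counts are taken from the slides. The model picks the join method from the first bind value it sees and then reuses it for every later value.

```python
# Toy model of plan reuse with bind variables (all costs are assumed work units).
PART_ROWS = {1: 1_000_000, 2: 1_000_000, 3: 1_000_000, 4: 1_000_000, 5: 1_000}
MASTER_ROWS = 1_000_000
PROBE_COST = 4            # assumed work units per nested-loop index probe
SMALL_PARTITION = 10_000  # assumed threshold below which nested loops win


def choose_plan(part_id):
    """Pick a join method from the first bind value only."""
    if PART_ROWS[part_id] < SMALL_PARTITION:
        return "NESTED LOOPS"
    return "HASH JOIN"


def cost(plan, part_id):
    """Assumed work units for running `plan` against one partition."""
    rows = PART_ROWS[part_id]
    if plan == "HASH JOIN":
        return rows + MASTER_ROWS  # scan the partition and the master table once
    return rows * PROBE_COST       # one index probe per partition row


def batch_cost(order):
    """Total cost when the plan chosen for the first value is reused throughout."""
    order = list(order)
    plan = choose_plan(order[0])
    return plan, sum(cost(plan, part_id) for part_id in order)


forward = batch_cost(range(1, 6))      # P1 first, as in the second batch run
reverse = batch_cost(range(5, 0, -1))  # P5 first, as in the first batch run
print(forward, reverse)
```

In this model the reverse run is stuck with nested loops for the four large partitions, which is the same effect the two batch timings show.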
the cure
• a particular execution plan can be enforced by means of SQL Profiles
• usually created with the GUI-based Cloud Control
• however, dealing with the issue in an automatic fashion requires a
programmable API
the cure
SQL> select *
       from table(dbms_xplan.display_awr(
              sql_id          => '8q70znr29nubg',
              plan_hash_value => 1580296296,
              format          => 'ADVANCED')
            );
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.4')
DB_VERSION('11.2.0.4')
ALL_ROWS
OUTLINE_LEAF(@"SEL$58A6D7F6")
MERGE(@"SEL$1")
OUTLINE(@"SEL$2")
OUTLINE(@"SEL$1")
FULL(@"SEL$58A6D7F6" "A"@"SEL$1")
FULL(@"SEL$58A6D7F6" "B"@"SEL$1")
LEADING(@"SEL$58A6D7F6" "A"@"SEL$1" "B"@"SEL$1")
USE_HASH(@"SEL$58A6D7F6" "B"@"SEL$1")
SWAP_JOIN_INPUTS(@"SEL$58A6D7F6" "B"@"SEL$1")
USE_HASH_AGGREGATION(@"SEL$58A6D7F6")
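For automation, the outline section has to be cut out of the dbms_xplan text. A minimal sketch, assuming the plan output has already been fetched as a list of strings; the function name `extract_outline_hints` is hypothetical, and the sample lines are a shortened stand-in for real output. Long hints may be wrapped across several lines in real output, which this sketch does not handle.

```python
def extract_outline_hints(plan_lines):
    """Return the lines from BEGIN_OUTLINE_DATA to END_OUTLINE_DATA, inclusive."""
    hints = []
    inside = False
    for raw in plan_lines:
        line = raw.strip()
        if line == "BEGIN_OUTLINE_DATA":
            inside = True
        # keep every non-empty line of the outline, skip the comment delimiters
        if inside and line and line not in ("/*+", "*/"):
            hints.append(line)
        if line == "END_OUTLINE_DATA":
            break
    return hints


# shortened stand-in for dbms_xplan output (assumed layout)
sample_output = [
    "Outline Data",
    "-------------",
    "  /*+",
    "      BEGIN_OUTLINE_DATA",
    "      IGNORE_OPTIM_EMBEDDED_HINTS",
    "      ALL_ROWS",
    '      FULL(@"SEL$58A6D7F6" "A"@"SEL$1")',
    "      END_OUTLINE_DATA",
    "  */",
]
hints = extract_outline_hints(sample_output)
print(hints)
```

The markers are kept in the result because the profile on the next slide also starts with BEGIN_OUTLINE_DATA.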
the cure
SQL> declare
       l_sql_text clob;
     begin
       select sql_text into l_sql_text
         from dba_hist_sqltext
        where sql_id = '8q70znr29nubg';

       dbms_sqltune.import_sql_profile(
         sql_text => l_sql_text,
         name     => 'SAMPLE_PROFILE',
         profile  => sqlprof_attr(
           q'!BEGIN_OUTLINE_DATA!',
           q'!IGNORE_OPTIM_EMBEDDED_HINTS!',
           q'!OPTIMIZER_FEATURES_ENABLE('11.2.0.4')!',
           q'!DB_VERSION('11.2.0.4')!',
           q'!ALL_ROWS!',
           q'!OUTLINE_LEAF(@"SEL$58A6D7F6")!',
           q'!MERGE(@"SEL$1")!',
           q'!OUTLINE(@"SEL$2")!',
           q'!OUTLINE(@"SEL$1")!',
the sqlprof_attr entries are the outline data, extracted with dbms_xplan
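The same anonymous block can be generated from a list of outline hints. A minimal sketch: the function name `build_import_profile_block` and the bind names `:sql_id` and `:profile_name` are assumptions, not part of the talk. The hints are written as q'!...!' literals like on the slide, so a hint containing an exclamation mark is rejected.

```python
def build_import_profile_block(hints):
    """Return PL/SQL text that calls dbms_sqltune.import_sql_profile."""
    for hint in hints:
        if "!" in hint:
            raise ValueError("hint contains '!' and cannot be q'!...!' quoted")
    attrs = ",\n".join(f"      q'!{hint}!'" for hint in hints)
    return (
        "declare\n"
        "  l_sql_text clob;\n"
        "begin\n"
        "  select sql_text into l_sql_text\n"
        "    from dba_hist_sqltext\n"
        "   where sql_id = :sql_id;\n"
        "  dbms_sqltune.import_sql_profile(\n"
        "    sql_text => l_sql_text,\n"
        "    name     => :profile_name,\n"
        "    profile  => sqlprof_attr(\n"
        f"{attrs}\n"
        "    )\n"
        "  );\n"
        "end;"
    )


block = build_import_profile_block([
    "BEGIN_OUTLINE_DATA",
    "ALL_ROWS",
    'OUTLINE_LEAF(@"SEL$58A6D7F6")',
    "END_OUTLINE_DATA",
])
print(block)
```

With cx_Oracle the text could then be run as `cursor.execute(block, sql_id="8q70znr29nubg", profile_name="SAMPLE_PROFILE")` on a connection that has the required privileges.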
the cure
SQL> select name,
            substr(sql_text, 1, 40)
       from dba_sql_profiles

NAME            SUBSTR(SQL_TEXT,1,40)
--------------- ----------------------------------------
SAMPLE_PROFILE  SELECT DISTINCT A.TEXT FROM MASTER_TAB A
• a particular execution plan is enforced, thereby leading to stable
performance
the cure
• SQL Profiles are the tool of the trade for addressing plan instability issues
• however, manual intervention is required
• in 24x7 environments this might lead to breached SLAs
• dealing with the issue appropriately requires eliminating the human factor from the equation
a bit machine learning
• the outlined example shows that fixing plan instability requires merely two PL/SQL API calls
• however, first and foremost, the performance decline needs to be detected
• followed by picking the appropriate execution plan as the basis for the SQL Profile
• which is considered a bit of a dark art
• or, on closer inspection, a case for outlier detection
a bit machine learning
performance decline due to execution plan change in figures
day          1       2       3       4       5       6
buffers/row  109143  109144  109150  109160  109170  1117755
[chart: buffers/row per day, days 1 to 6; the spike on day 6 is annotated "execution plan suitable for partition P5"]
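How such a daily series might be collected is sketched below. The talk does not say where the figures come from; this sketch assumes that buffers/row means buffer gets divided by rows processed, taken per day from the AWR views dba_hist_sqlstat and dba_hist_snapshot. The function name `buffers_per_row` and the sample tuples are made up for illustration.

```python
# Assumed AWR query: daily buffer gets and rows processed for one statement.
AWR_QUERY = """
select trunc(s.begin_interval_time) as day,
       sum(t.buffer_gets_delta)     as buffer_gets,
       sum(t.rows_processed_delta)  as rows_processed
  from dba_hist_sqlstat t
  join dba_hist_snapshot s
    on s.snap_id = t.snap_id
   and s.dbid = t.dbid
   and s.instance_number = t.instance_number
 where t.sql_id = :sql_id
 group by trunc(s.begin_interval_time)
 order by 1
"""


def buffers_per_row(rows):
    """Turn (day, buffer_gets, rows_processed) tuples into (day, ratio) pairs."""
    series = []
    for day, buffer_gets, rows_processed in rows:
        if rows_processed:  # skip days without processed rows (no division by zero)
            series.append((day, buffer_gets / rows_processed))
    return series


# made-up fetch result, shaped like the rows the query above would return
sample_rows = [("day 1", 218286, 2), ("day 2", 0, 0), ("day 3", 327450, 3)]
series = buffers_per_row(sample_rows)
print(series)
```

With cx_Oracle the rows would come from `cursor.execute(AWR_QUERY, sql_id=...)` followed by `cursor.fetchall()`.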
a bit machine learning
Outlier detection by means of the Modified Z-Score
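$$\mathrm{MAD} = \operatorname{median}\left(\lvert x_i - \tilde{x} \rvert\right)$$
$$M_i = \frac{0.6745\,(x_i - \tilde{x})}{\mathrm{MAD}}$$
where $\tilde{x}$ is the median of the observations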
a bit machine learning
Modified Z-Score as code
import numpy as np

def _calculate_modified_z_score(
        self,
        buffers_per_row_list,
        current_buffers_per_row):
    # np.float was removed in NumPy 1.24; the builtin float is equivalent here
    buffers_per_row_array = np.array(buffers_per_row_list, dtype=float)
    median_buffers_per_row = np.median(buffers_per_row_array)
    # MAD: median of the absolute deviations from the median
    mad = np.median(np.abs(buffers_per_row_array - median_buffers_per_row))
    # note: mad is 0 when more than half of the history values are identical
    modified_z_score = (
        0.6745
        * (current_buffers_per_row - median_buffers_per_row) / mad
    )
    return modified_z_score
a bit machine learning
• a Modified Z-Score > 3.5 marks an observation as an outlier
• applying the formula to the observation on day 6 gives a value of 34015, so it is clearly an outlier
• based on this conclusion, appropriate action is required
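The check can be reproduced with the figures from the table. A minimal sketch that uses the standard library's `statistics.median` in place of NumPy and assumes that days 1 to 5 form the history against which day 6 is scored. The exact score depends on which observations enter the history, so it need not reproduce the value quoted above; what matters is that it lies far beyond 3.5.

```python
from statistics import median

THRESHOLD = 3.5


def modified_z_score(history, current):
    """Modified Z-Score of `current` relative to the `history` observations."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        raise ValueError("MAD is zero; the score is undefined for this history")
    return 0.6745 * (current - med) / mad


history = [109143, 109144, 109150, 109160, 109170]  # days 1 to 5 from the table
day_6 = 1117755

score = modified_z_score(history, day_6)
is_outlier = score > THRESHOLD
print(score, is_outlier)
```

None of the five history values exceeds the threshold when scored against the same history, so only day 6 is flagged.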
gluing it all together
• there is a clearly formulated problem
• it is known how to detect it; this is the machine learning bit
• it is known how to fix it by means of programmable APIs
• consequently, a piece of software can tackle the issue
gluing it all together
[diagram: monitor SQL statements → [ outlier detected ] → create SQL profile]
building blocks named in the diagram: import cx_Oracle, import numpy as np, from daemon import runner, dbms_sqltune.import_sql_profile
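The decision step of such a daemon can be sketched without a database. Everything here is an assumption for illustration: the function names, the idea of keeping a history per plan hash value, the rule of picking the plan with the lowest median buffers/row, and the pairing of the figures from the table with plan hash value 1580296296. The `create_profile` callable stands in for the code that would call dbms_sqltune.import_sql_profile through cx_Oracle.

```python
from statistics import median

THRESHOLD = 3.5


def modified_z_score(history, current):
    """Modified Z-Score of `current` relative to the `history` observations."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        raise ValueError("MAD is zero; the score is undefined for this history")
    return 0.6745 * (current - med) / mad


def best_plan(plan_history):
    """Pick the plan hash value with the lowest median buffers/row (assumed rule)."""
    return min(plan_history, key=lambda plan: median(plan_history[plan]))


def monitor_step(sql_id, plan_history, current_value, create_profile):
    """Create a profile for the best known plan if the current value is an outlier."""
    history = [v for values in plan_history.values() for v in values]
    if modified_z_score(history, current_value) <= THRESHOLD:
        return None  # nothing unusual, leave the statement alone
    target = best_plan(plan_history)
    create_profile(sql_id, target)
    return target


# history keyed by plan hash value (pairing with the figures is assumed)
plan_history = {1580296296: [109143, 109144, 109150, 109160, 109170]}
created = []  # records what the stand-in create_profile was asked to do
result = monitor_step(
    "8q70znr29nubg", plan_history, 1117755,
    lambda sql_id, plan: created.append((sql_id, plan)),
)
print(result, created)
```

Passing the profile creation in as a callable keeps the detection logic testable on its own; a daemon would call `monitor_step` periodically for each monitored statement.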
recap
• machine learning can help administrators if and when
  • the problem it is meant to solve is clearly understood
  • it can be expressed in figures
  • it can be addressed by means of programmable APIs
• provided these criteria are met, a machine can make a sound decision and thereby address a specific problem
• Python is not a necessity; however, its libraries provide everything that is needed
• in essence, this helps alleviate the operational burden of system administration
questions and answers

2018 db-rainer schuettengruber-beating-oracles_optimizer_at_its_own_game-presentation
