blog.sqlora.com @Andrej_SQL
Properly Use Parallel DML for your ETL
Andrej Pashchenko
About me
• Working at Trivadis, Düsseldorf
• Focusing on Oracle:
• Data Warehousing
• Application Development
• Application Performance
• Course instructor „Oracle New Features for Developers“
@Andrej_SQL blog.sqlora.com
Parallel Processing in Oracle DB
• Parallel Query: SELECT
• Parallel DDL: CTAS, CREATE INDEX, ALTER TABLE MOVE, …
• Parallel DML: Parallel IAS (INSERT AS SELECT), Parallel MERGE, Parallel UPDATE, Parallel DELETE
Controlling, Restrictions and Implications
How to enable PDML
• Parallel Query and Parallel DDL are enabled by default
• Parallel DML has to be enabled explicitly at session level:
ALTER SESSION ENABLE PARALLEL DML;
• In 12c it is also possible with a hint at statement level:
INSERT /*+ enable_parallel_dml parallel append */
INTO sales
SELECT /*+ parallel */ * FROM sales_v;
• Issue with the hint: hard parse on every execution, so be careful with plan stability
• But enabling PDML does not yet mean that a parallel execution plan will be used
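Not shown on the slide, but a quick way to check the session state: V$SESSION exposes the PDML, PDDL and PQ status of each session. A minimal sketch:

-- Minimal sketch (not from the slides): is parallel DML
-- DISABLED, ENABLED or FORCED for the current session?
SELECT pdml_status, pddl_status, pq_status
FROM   v$session
WHERE  sid = SYS_CONTEXT('USERENV', 'SID');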
How do I know PDML was used?
• Check the position of the DML operation, e.g. LOAD AS SELECT, relative to the query coordinator: in the first plan below it sits above the PX COORDINATOR, so the load itself is serial; in the second plan it is executed by the PX servers
• Check the Note section of the execution plan
• Check v$pq_sesstat
---------------------------------------------
Operation | Name
---------------------------------------------
INSERT STATEMENT |
LOAD AS SELECT | T1
PX COORDINATOR |
PX SEND QC (RANDOM) | :TQ1000
OPTIMIZER STATISTICS GATHERING |
PX BLOCK ITERATOR |
TABLE ACCESS FULL | T2
---------------------------------------------
Note
- PDML disabled because object is not decorated with
parallel clause
---------------------------------------------
Operation | Name
---------------------------------------------
INSERT STATEMENT |
PX COORDINATOR |
PX SEND QC (RANDOM) | :TQ1000
LOAD AS SELECT (HYBRID TSM/HWMB)| T1
OPTIMIZER STATISTICS GATHERING |
PX BLOCK ITERATOR |
TABLE ACCESS FULL | T2
---------------------------------------------
SELECT * FROM v$pq_sesstat WHERE statistic like 'DML%';
STATISTIC LAST_QUERY SESSION_TOTAL CON_ID
------------------------------ ---------- ------------- ----------
DML Parallelized 1 3 0
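Another way to see the actual plan of the statement just executed in the session, including the Note section, is DBMS_XPLAN.DISPLAY_CURSOR. A minimal sketch (run it right after the statement, with SERVEROUTPUT off):

-- Minimal sketch (not from the slides): plan of the last statement
-- executed in this session, including the Note section
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'BASIC +NOTE'));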
How to ensure that PDML is used
• Statement level or object level PARALLEL hint in the INSERT:
INSERT /*+ parallel */ INTO t_copy t SELECT * FROM t_src;
INSERT /*+ parallel(t) */ INTO t_copy t SELECT * FROM t_src;
• Forcing PDML in a session:
ALTER SESSION FORCE PARALLEL DML;
• Auto DOP:
ALTER SESSION SET parallel_degree_policy = AUTO;
• Parallel clause object decoration:
CREATE TABLE t_copy (…) PARALLEL;
ALTER TABLE t_copy PARALLEL;
How to ensure that PDML is used (2)
• Refer to the Table „Parallelization Priority Order“
• But test your ETL scenario!
• In case of doubt, statement level hints have the highest priority
Restrictions preventing PDML
• No PDML on tables with triggers
• No PDML with enabled foreign keys. Use RELY constraints instead: valuable for the CBO, but not disruptive for ETL (RELY DISABLE NOVALIDATE, see the sketch after this list). Exception: reference partitioning!
• Not enough parallel servers
• Parallel DML is not supported on a table with bitmap indexes if the table is not partitioned.
IMPORTANT: For Partition Exchange Loading (PEL) don’t create any indexes on the staging table before loading it!
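A minimal sketch of such a RELY constraint (table and column names are made up for illustration): declared this way, the foreign key does not block PDML but can still be trusted by the optimizer.

-- Hypothetical example: FK that is not enforced (so it does not prevent PDML),
-- but is declared RELY so the optimizer can still make use of it
ALTER TABLE sales
  ADD CONSTRAINT fk_sales_product
  FOREIGN KEY (product_id) REFERENCES products (product_id)
  RELY DISABLE NOVALIDATE;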
Restrictions preventing PDML (2)
• Distributed transactions and DML on a remote DB: the 12.2 documentation lists this as a restriction
• Indeed, the example below seems to work, but it does not really help, because access over a database link is always serial
SQL> insert /*+ enable_parallel_dml parallel */
into t_sdoc
select v.* from V_SDOC@remote_db V
2929218 rows created.
SQL> select * from v$pq_sesstat where statistic like 'DML%'
STATISTIC LAST_QUERY SESSION_TOTAL CON_ID
------------------- ---------- ------------- ----------
DML Parallelized 1 5 0
1 row selected.
-------------------------------------------------------
| Id | Operation | Name |
-------------------------------------------------------
| 0 | INSERT STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 |
| 3 | LOAD AS SELECT (HYBRID TSM/HWMB)| |
| 4 | OPTIMIZER STATISTICS GATHERING | |
| 5 | PX RECEIVE | |
| 6 | PX SEND ROUND-ROBIN | :TQ10000 |
| 7 | REMOTE | V_SDOC |
-------------------------------------------------------
Implications of PDML
• The PX coordinator and each PX server work in their own transactions
• The coordinator then uses a two-phase commit
• Hence, the user transaction is in a special mode
• The results of parallel modifications cannot be seen in the same transaction
• Complex ETL processes relying on transaction integrity can be a problem: no PDML can be used for
intermediate steps.
• The same error is raised for serial direct-path INSERT, though, so it is not a reliable check that PDML was
actually used
SQL> select count(*) from t_sdoc
Error at line 0
ORA-12838: cannot read/modify an object after modifying it in parallel
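A minimal sketch of this implication, reusing t_sdoc from the example above (t_sdoc_stage is a hypothetical source table): after a direct-path modification the object can neither be read nor modified again until the transaction is committed.

INSERT /*+ enable_parallel_dml parallel append */ INTO t_sdoc
SELECT * FROM t_sdoc_stage;     -- t_sdoc_stage: hypothetical staging table

SELECT COUNT(*) FROM t_sdoc;    -- fails with ORA-12838 in the same transaction

COMMIT;

SELECT COUNT(*) FROM t_sdoc;    -- works after the commit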
Space Management with PDML
Space Management with PDML
• Multiple concurrent transactions are modifying the same object
• What should be considered when doing parallel direct-path inserts?
• Can this lead to excessive extent allocation or tablespace fragmentation?
• It is helpful to have an idea of what happens behind the scenes.
• Fortunately, Oracle 12c makes more information visible: the LOAD AS SELECT line in the plan now shows the space management strategy, e.g. HYBRID TSM/HWMB
--------------------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------------------
| 0 | INSERT STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 |
| 3 | LOAD AS SELECT (HYBRID TSM/HWMB)| T_COPY_PARALLEL |
| 4 | OPTIMIZER STATISTICS GATHERING | |
| 5 | PX BLOCK ITERATOR | |
| 6 | TABLE ACCESS FULL | T_SRC |
--------------------------------------------------------------
Uniform vs. System-Allocated Extents
(Diagram: Table1 in tablespace Uniform_TBS — all extents are equally sized, unused space is „inside“)
• Tablespace with uniform extent size
• The unused space is inside the extent
• Internal fragmentation
• Full table scans will scan this free space too
• This free space can be used by conventional inserts
• But a PDML insert (direct path) starts to fill a new extent every time
Uniform vs. System-Allocated Extents
(Diagram: Table1 in tablespace Autoallocate_TBS — different extent sizes (1M, 8M, 64M); extents can be trimmed, leaving e.g. a 7M extent)
• Autoallocate: extent sizes of 64K, 1M, 8M, 64M (with 8K block size)
• If free space is left after loading (more than the minimum extent size), extent trimming happens and this free space is returned to the tablespace
• External fragmentation: free space is not contiguous and can potentially be reused if smaller extents are requested
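For reference, a minimal sketch of the two tablespace types being contrasted above (datafile names and sizes are placeholders, not from the slides):

-- Uniform: every extent has the same size
CREATE TABLESPACE uniform_tbs
  DATAFILE '/u01/oradata/orcl/uniform_tbs01.dbf' SIZE 1G
  EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;

-- Autoallocate (system-managed): Oracle chooses 64K, 1M, 8M or 64M extents
CREATE TABLESPACE autoallocate_tbs
  DATAFILE '/u01/oradata/orcl/autoallocate_tbs01.dbf' SIZE 1G
  EXTENT MANAGEMENT LOCAL AUTOALLOCATE;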
High Water Mark Loading (HWM)
(Diagram: a single server process loading Table1 in tablespace TBS)
• The server process has exclusive access to the segment (table or partition) and can insert into extents above the HWM
• After commit the HWM is moved and the new data becomes visible
• Serial load, or parallel load with PKEY distribution
Temp Segment Merge (TSM) Loading
(Diagram: two PX slaves, each loading its own temp segment in tablespace TBS, merged into Table1)
• Each PX server is assigned its own temporary segment and populates it
• The last extents can be trimmed
• Temp segments reside in the same tablespace and are merged into the target table by manipulating the extent map on commit
• Very scalable, but at least one extent per PX server
• Fragmentation is possible because of trimming
• In 12c rarely used when creating partitioned tables
High Water Mark Brokering (HWMB)
(Diagram: two PX slaves loading Table1 in tablespace TBS, coordinated via an HV enqueue)
• Multiple PX servers may insert into the same extent above the HWM, so access to the HWM has to be “brokered”
• The brokering is implemented via the HV enqueue
• Results in fewer extents
• But less scalable
• Good for loading non-partitioned tables or single partitions
High Water Mark Brokering (HWMB)
(Diagram: PX slaves on RAC Instance 1 and RAC Instance 2 all competing for the same HV enqueue while loading Table1 in tablespace TBS)
• Scalability can become an issue with a high DOP, especially in a RAC environment
Hybrid TSM/HWMB
(Diagram: on each RAC instance the local PX slaves share a temp segment with its own HV enqueue; Table1 in tablespace TBS)
• New in 12.1
• Each temporary segment has its own HV enqueue, which is only used by the local PX servers in case of RAC
• Fewer extents
• Improved scalability
Data Loading Distribution
Data Loading Distribution
• Example:
• Join two equipartitioned tables T_SRC2 and T_SRC3
• Hash-Partitioned, 64 partitions
• 32 million rows
INSERT /*+ append parallel */
INTO t_tgt_join t0 (OWNER, OBJECT_TYPE, OBJECT_NAME, LVL, FILLER)
SELECT t1.OWNER, t2.OBJECT_TYPE, t2.OBJECT_NAME, t1.LVL, t1.filler
FROM t_src3 t1 JOIN t_src2 t2
ON ( t1.OWNER = t2.OWNER AND t1.OBJECT_NAME = t2.OBJECT_NAME
AND t1.OBJECT_TYPE = t2.OBJECT_TYPE AND t1.lvl = t2.lvl);
Data Loading Distribution
• An example of joining two tables in parallel
• Which PX servers are actually loading the result table?
• The same ones that are doing the join?
• Or another PX set? Should the data then be redistributed again?
• This is where data loading distribution matters
(Diagram: T1 and T2, PX servers P001–P004; one PX set reading T1 and T2 and redistributing, another PX set joining T1 and T2; question mark over which set performs the load)
Data Loading Distribution
• Since 11.2 the hint PQ_DISTRIBUTE can be used to control load distribution
• NONE – no distribution, load is performed by the same PX-Servers
• PARTITION – distribution based on partitioning of target table
• RANDOM – round-robin distribution, useful for highly skewed data
• RANDOM_LOCAL – round-robin for PX servers on the same RAC instance
Data Loading Distribution - PARTITION
INSERT /*+ append parallel pq_distribute (t0 partition) */
INTO t_tgt_join t0
SELECT /*+ pq_distribute (t2 none none) */ t1…, t2…
FROM t_src3 t1 JOIN t_src2 t2
ON ( ...);
---------------------------------------------------------------
| Id | Operation | Name | TQ |
---------------------------------------------------------------
| 0 | INSERT STATEMENT | | |
| 1 | PX COORDINATOR | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | Q1,01 |
| 3 | LOAD AS SELECT (HIGH WATER MARK)| | Q1,01 |
| 4 | OPTIMIZER STATISTICS GATHERING | | Q1,01 |
| 5 | PX RECEIVE | | Q1,01 |
| 6 | PX SEND PARTITION (KEY) | :TQ10000 | Q1,00 |
| 7 | PX PARTITION HASH ALL | | Q1,00 |
|* 8 | HASH JOIN | | Q1,00 |
| 9 | TABLE ACCESS FULL | T_SRC2 | Q1,00 |
| 10 | TABLE ACCESS FULL | T_SRC3 | Q1,00 |
---------------------------------------------------------------
Data Loading Distribution - NONE
INSERT /*+ append parallel pq_distribute (t0 none) */
INTO t_tgt_join t0
SELECT /*+ pq_distribute (t2 none none) */ t1…, t2…
FROM t_src3 t1 JOIN t_src2 t2
ON ( ...);
--------------------------------------------------------------------
| Id | Operation | Name | TQ |
--------------------------------------------------------------------
| 0 | INSERT STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (RANDOM) | :TQ10000| Q1,00
| 3 | LOAD AS SELECT (HIGH WATER MARK BROKERED)| | Q1,00
| 4 | OPTIMIZER STATISTICS GATHERING | | Q1,00
| 5 | PX PARTITION HASH ALL | | Q1,00
|* 6 | HASH JOIN | | Q1,00
| 7 | TABLE ACCESS FULL | T_SRC2 | Q1,00
| 8 | TABLE ACCESS FULL | T_SRC3 | Q1,00
--------------------------------------------------------------------
Data Loading Distribution - RANDOM
INSERT /*+ append parallel pq_distribute (t0 random) */
INTO t_tgt_join t0
SELECT /*+ pq_distribute (t2 none none) */ t1…, t2…
FROM t_src3 t1 JOIN t_src2 t2
ON ( ...);
---------------------------------------------------------------------
| Id | Operation | Name | TQ
---------------------------------------------------------------------
| 0 | INSERT STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | Q1,01
| 3 | LOAD AS SELECT (HIGH WATER MARK BROKERED)| | Q1,01
| 4 | OPTIMIZER STATISTICS GATHERING | | Q1,01
| 5 | PX RECEIVE | | Q1,01
| 6 | PX SEND ROUND-ROBIN | :TQ10000 | Q1,00
| 7 | PX PARTITION HASH ALL | | Q1,00
|* 8 | HASH JOIN | | Q1,00
| 9 | TABLE ACCESS FULL | T_SRC2 | Q1,00
| 10 | TABLE ACCESS FULL | T_SRC3 | Q1,00
---------------------------------------------------------------------
Data Loading Distribution - RANDOM
INSERT /*+ append parallel pq_distribute (t0 random) */
INTO t_tgt_join t0
SELECT t1…, t2…
FROM t_src3 t1 JOIN t_src2 t2 ON ( ...);
----------------------------------------------------------------------
| Id | Operation | Name | TQ
----------------------------------------------------------------------
| 0 | INSERT STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | Q1,03
| 3 | LOAD AS SELECT (HIGH WATER MARK BROKERED)| | Q1,03
| 4 | OPTIMIZER STATISTICS GATHERING | | Q1,03
| 5 | PX RECEIVE | | Q1,03
| 6 | PX SEND ROUND-ROBIN | :TQ10002 | Q1,02
|* 7 | HASH JOIN BUFFERED | | Q1,02
| 8 | PART JOIN FILTER CREATE | :BF0000 | Q1,02
| 9 | PX RECEIVE | | Q1,02
| 10 | PX SEND HYBRID HASH | :TQ10000 | Q1,00
| 11 | STATISTICS COLLECTOR | | Q1,00
| 12 | PX BLOCK ITERATOR | | Q1,00
|*13 | TABLE ACCESS FULL | T_SRC2 | Q1,00
| 14 | PX RECEIVE | | Q1,02
| 15 | PX SEND HYBRID HASH | :TQ10001 | Q1,01
| 16 | PX BLOCK ITERATOR | | Q1,01
|*17 | TABLE ACCESS FULL | T_SRC3 | Q1,01
----------------------------------------------------------------------
Data Loading Distribution – No PWJ, No Redistribution
INSERT /*+ append parallel pq_distribute (t0 none) */
INTO t_tgt_join t0
SELECT t1…, t2…
FROM t_src3 t1 JOIN t_src2 t2 ON ( ...);
---------------------------------------------------------------------
| Id | Operation | Name | TQ
---------------------------------------------------------------------
| 0 | INSERT STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (RANDOM) | :TQ10002 |Q1,02
| 3 | LOAD AS SELECT (HIGH WATER MARK BROKERED)| |Q1,02
| 4 | OPTIMIZER STATISTICS GATHERING | |Q1,02
|* 5 | HASH JOIN | |Q1,02
| 6 | PART JOIN FILTER CREATE | :BF0000 |Q1,02
| 7 | PX RECEIVE | |Q1,02
| 8 | PX SEND HYBRID HASH | :TQ10000 |Q1,00
| 9 | STATISTICS COLLECTOR | |Q1,00
| 10 | PX BLOCK ITERATOR | |Q1,00
|*11 | TABLE ACCESS FULL | T_SRC2 |Q1,00
| 12 | PX RECEIVE | |Q1,02
| 13 | PX SEND HYBRID HASH | :TQ10001 |Q1,01
| 14 | PX BLOCK ITERATOR | |Q1,01
|*15 | TABLE ACCESS FULL | T_SRC3 |Q1,01
---------------------------------------------------------------------
Data Loading Distribution
• But in the presence of an index the PQ_DISTRIBUTE hint is ignored!
• Even if the index is unusable
• The redistribution is needed again and causes a buffered hash join
• High Water Mark (HWM) loading because of the exclusive access to the segment
CREATE BITMAP INDEX t_idx_tgt on t_tgt_join (OWNER) LOCAL PARALLEL;
INSERT /*+ append parallel pq_distribute (t0 none) */
...
| 0 | INSERT STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (RANDOM) | :TQ10004 | Q1,04
| 3 | INDEX MAINTENANCE | T_TGT_JOIN | Q1,04
| 4 | PX RECEIVE | | Q1,04
| 5 | PX SEND RANGE | :TQ10003 | Q1,03
| 6 | LOAD AS SELECT (HIGH WATER MARK)| | Q1,03
| 7 | OPTIMIZER STATISTICS GATHERING | | Q1,03
| 8 | PX RECEIVE | | Q1,03
| 9 | PX SEND PARTITION (KEY) | :TQ10002 | Q1,02
|*10 | HASH JOIN BUFFERED | | Q1,02
| 11 | PART JOIN FILTER CREATE | :BF0000 | Q1,02
| 12 | PX RECEIVE | | Q1,02
| 13 | PX SEND HYBRID HASH | :TQ10000 | Q1,00
| 14 | STATISTICS COLLECTOR | | Q1,00
| 15 | PX BLOCK ITERATOR | | Q1,00
|*16 | TABLE ACCESS FULL | T_SRC2 | Q1,00
| 17 | PX RECEIVE | | Q1,02
| 18 | PX SEND HYBRID HASH | :TQ10001 | Q1,01
| 19 | PX BLOCK ITERATOR | | Q1,01
|*20 | TABLE ACCESS FULL | T_SRC3 | Q1,01
-----------------------------------------------------------------
Differences with MERGE
Space Management with PDML and MERGE?
• Extents after the first delta load (~3%) with MERGE vs. INSERT
SQL> MERGE /*+ append parallel*/
2 INTO t_tgt_join t0
3 USING ( SELECT ...
----------------------------------------
| Id | Operation |
----------------------------------------
| 0 | MERGE STATEMENT |
| 1 | PX COORDINATOR |
| 2 | PX SEND QC (RANDOM) |
| 3 | MERGE |
| 4 | PX RECEIVE |
SEGMENT_NAME BLOCKS CNT
------------ ------ -------
T_TGT_JOIN 8 2113
... 13 rows ...
T_TGT_JOIN 128 4713
... 20 rows ...
T_TGT_JOIN 1024 34
36 rows selected.
1154 new extents!
SQL> INSERT /*+ append parallel */
2 INTO t_tgt_join t0
3 SELECT ...
--------------------------------------------------
|Id | Operation
--------------------------------------------------
| 0 | INSERT STATEMENT
| 1 | PX COORDINATOR
| 2 | PX SEND QC (RANDOM)
| 3 | LOAD AS SELECT (HIGH WATER MARK BROKERED)
| 4 | OPTIMIZER STATISTICS GATHERING
SEGMENT_NAME BLOCKS CNT
------------ ---------- ---------
T_TGT_JOIN 8 1024
T_TGT_JOIN 128 4248
... 6 rows ...
T_TGT_JOIN 1024 139
9 rows selected.
60 new extents!
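The exact query behind these figures is not on the slide; a hedged reconstruction of the kind of query that produces such output, grouping the target table's extents by size in blocks:

-- Hypothetical reconstruction: count extents of the target table per extent size
SELECT segment_name, blocks, COUNT(*) AS cnt
FROM   user_extents
WHERE  segment_name = 'T_TGT_JOIN'
GROUP  BY segment_name, blocks
ORDER  BY blocks;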
MERGE
• Basically, if PDML is enabled for the session or for the particular statement, MERGE will parallelize both the INSERT and the UPDATE operation
• But there are some differences:
• No space management decoration is reported in the execution plan
• Even worse, it always seems to run as Temp Segment Merge:
• Significantly more extents are created
• Many of them are trimmed
• Every load operation starts again with many 64K extents
• It may be worth providing INITIAL and NEXT even in an autoallocate tablespace
• Avoid MERGE if you don’t really need it: for example, if you materialize intermediate results anyway (as the ODI SCD Type 2 Knowledge Module does), you can update and insert in two separate parallel operations instead (see the sketch below)
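A hedged sketch of that alternative, using the column names from the earlier example and a hypothetical materialized delta table t_delta; under these assumptions both statements can run as parallel DML (note the commit in between, required because of ORA-12838):

ALTER SESSION ENABLE PARALLEL DML;

-- 1) Update existing rows from the materialized delta
UPDATE /*+ parallel */ t_tgt_join t
SET    filler = (SELECT d.filler
                 FROM   t_delta d
                 WHERE  d.owner = t.owner
                 AND    d.object_name = t.object_name
                 AND    d.object_type = t.object_type)
WHERE  EXISTS (SELECT NULL
               FROM   t_delta d
               WHERE  d.owner = t.owner
               AND    d.object_name = t.object_name
               AND    d.object_type = t.object_type);

COMMIT;  -- required before reading or modifying t_tgt_join again (ORA-12838)

-- 2) Insert new rows with a parallel direct-path load
INSERT /*+ append parallel */ INTO t_tgt_join t
SELECT d.owner, d.object_type, d.object_name, d.lvl, d.filler
FROM   t_delta d
WHERE  NOT EXISTS (SELECT NULL
                   FROM   t_tgt_join t2
                   WHERE  t2.owner = d.owner
                   AND    t2.object_name = d.object_name
                   AND    t2.object_type = d.object_type);

COMMIT;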
Summary
• Don’t overuse PDML. Turn it on only selectively, where it makes sense
• Be careful and double-check that your statements are really doing PDML
• Oracle reports the space management strategy for LOAD AS SELECT operations in execution plans from 12.1.0.2 on, but not for MERGE operations
• A bloated extent map will have a negative effect on parallel queries
• In 12c Oracle introduced Hybrid TSM/HWMB, which increases scalability while keeping the number of extents small
• Don’t create indexes on tables used for partition exchange; they can significantly influence the execution plan. Bitmap indexes will even disable PDML!
• For the most critical loading processes check the data loading distribution, which you can influence with the PQ_DISTRIBUTE hint
• If you use MERGE for critical ETL, check the space management behavior
Links
• Oracle Documentation, VLDB Guide, About Parallel DML Operations
• Nigel Bayliss, Space Management with PDML
• Randolf Geist, Understanding Parallel Execution - Part 1 and Part 2
• Randolf Geist, Hash Join Buffered
• Timur Akhmadeev, PQ_DISTRIBUTE Enhancement
• Jonathan Lewis, Autoallocate and PX