
Properly Use Parallel DML for ETL

It is no secret that high-performance ETL processes should parallelize not only queries but also write operations. But is parallel DML (PDML) simply "switch on and forget"? What do you have to consider, and can it have negative effects? After a short reminder of how it works (including space management methods), this talk presents patterns observed in several ETL review and tuning projects that help answer the following questions: How do PDML and partitioning of the target table interact? Can PDML increase fragmentation of the tablespace, and can you control it? How does the PQ_DISTRIBUTE hint help? Do indexes on the target table have any influence?

Properly Use Parallel DML for your ETL
Andrej Pashchenko
blog.sqlora.com  @Andrej_SQL
About me
• Working at Trivadis, Düsseldorf
• Focusing on Oracle:
• Data Warehousing
• Application Development
• Application Performance
• Course instructor "Oracle New Features for Developers"
Parallel Processing in Oracle DB
• Parallel Query
  • SELECT
• Parallel DDL
  • CTAS
  • CREATE INDEX
  • ALTER TABLE MOVE
  • …
• Parallel DML
  • Parallel IAS
  • Parallel MERGE
  • Parallel UPDATE
  • Parallel DELETE
Controlling, Restrictions and Implications
How to enable PDML
• Parallel Query and Parallel DDL are enabled by default
• Parallel DML has to be enabled first at system or session level:
    ALTER SESSION ENABLE PARALLEL DML;
• In 12c it is also possible with a hint at statement level:
    INSERT /*+ enable_parallel_dml parallel append */
    INTO sales
    SELECT /*+ parallel */ * FROM sales_v;
• Issue with the hint: hard parse on every execution, caution with plan stability
• But enabling PDML doesn't yet mean a parallel execution plan will be used
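
One way to confirm that the session-level switch actually took effect is to look at the PDML_STATUS column of v$session. The following is a minimal sketch, not part of the original slides; it assumes the session is allowed to query v$session and that no transaction is open when the ALTER SESSION statements are issued.

```sql
-- Enable parallel DML for the current session (same statement as on the slide).
-- Oracle rejects this inside an open transaction (ORA-12841), so issue it
-- right after connecting or directly after a COMMIT/ROLLBACK.
ALTER SESSION ENABLE PARALLEL DML;

-- PQ_STATUS, PDDL_STATUS and PDML_STATUS report ENABLED, DISABLED or FORCED
-- for parallel query, parallel DDL and parallel DML respectively.
SELECT pq_status, pddl_status, pdml_status
FROM   v$session
WHERE  sid = SYS_CONTEXT('USERENV', 'SID');

-- Optionally switch PDML off again at the end of the ETL step.
COMMIT;
ALTER SESSION DISABLE PARALLEL DML;
```

An ENABLED status only says that PDML is permitted in this session; whether a given statement really runs with a parallel DML plan still has to be checked per statement.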

Properly Use Parallel DML for ETL

  • 1. Properly Use Parallel DML for your ETL. Andrej Pashchenko (blog.sqlora.com, @Andrej_SQL)
  • 2. About me • Working at Trivadis, Düsseldorf • Focusing on Oracle: • Data Warehousing • Application Development • Application Performance • Course instructor „Oracle New Features for Developers“ @Andrej_SQL blog.sqlora.com
  • 4. Parallel Processing in Oracle DB
      • Parallel Query: SELECT
      • Parallel DDL: CTAS, CREATE INDEX, ALTER TABLE MOVE, …
      • Parallel DML: Parallel IAS, Parallel MERGE, Parallel UPDATE, Parallel DELETE
  • 6. How to enable PDML
      • Parallel Query and Parallel DDL are enabled by default
      • Parallel DML has to be enabled first at system or session level:
          ALTER SESSION ENABLE PARALLEL DML;
      • In 12c it is also possible with a hint at statement level:
          INSERT /*+ enable_parallel_dml parallel append */
          INTO sales
          SELECT /*+ parallel */ * FROM sales_v;
      • Issue with the hint: hard parse on every execution, caution with plan stability
      • But enabling PDML doesn’t yet mean a parallel execution plan will be used
  • 7. How do I know PDML was used?
      • Check the position of the DML operation, e.g. LOAD AS SELECT, with respect to the query coordinator. If it sits above PX COORDINATOR, only the query part runs in parallel and the load itself is serial; if it sits below, the load is done by the PX servers
      • Check the Note section of the plan
      • Check v$pq_sesstat
      • Plan with serial DML (LOAD AS SELECT above PX COORDINATOR):
          ---------------------------------------------
          Operation                         | Name
          ---------------------------------------------
          INSERT STATEMENT                  |
           LOAD AS SELECT                   | T1
            PX COORDINATOR                  |
             PX SEND QC (RANDOM)            | :TQ1000
              OPTIMIZER STATISTICS GATHERING|
               PX BLOCK ITERATOR            |
                TABLE ACCESS FULL           | T2
          ---------------------------------------------
          Note
           - PDML disabled because object is not decorated with parallel clause
      • Plan with parallel DML (LOAD AS SELECT below PX COORDINATOR):
          ---------------------------------------------
          Operation                          | Name
          ---------------------------------------------
          INSERT STATEMENT                   |
           PX COORDINATOR                    |
            PX SEND QC (RANDOM)              | :TQ1000
             LOAD AS SELECT (HYBRID TSM/HWMB)| T1
              OPTIMIZER STATISTICS GATHERING |
               PX BLOCK ITERATOR             |
                TABLE ACCESS FULL            | T2
          ---------------------------------------------
      • Session statistics:
          SELECT * FROM v$pq_sesstat WHERE statistic like 'DML%';

          STATISTIC                      LAST_QUERY SESSION_TOTAL     CON_ID
          ------------------------------ ---------- ------------- ----------
          DML Parallelized                        1             3          0
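The plan check described on this slide can be done directly after the statement has run. The following is a minimal sketch, assuming the tables T1 and T2 from the plans above and a session in which SERVEROUTPUT is off (otherwise the "last statement" seen by DBMS_XPLAN may not be the INSERT). The format string 'BASIC +NOTE' is one possible choice that keeps the Note section visible.

```sql
-- Sketch only: T1 and T2 are the example tables from the slide.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ append parallel */ INTO t1
SELECT * FROM t2;

-- Show the plan of the last statement executed in this session,
-- including the Note section that reports why PDML was disabled, if it was.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'BASIC +NOTE'));

COMMIT;
```

If LOAD AS SELECT appears below PX COORDINATOR in the output, the load itself ran in parallel.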
  • 8. How to ensure that PDML is used
      • Statement level or object level PARALLEL hint in INSERT:
          INSERT /*+ parallel */ INTO t_copy t SELECT * FROM t_src;
          INSERT /*+ parallel(t) */ INTO t_copy t SELECT * FROM t_src;
      • Forcing PDML in a session:
          ALTER SESSION FORCE PARALLEL DML;
      • Auto DOP:
          ALTER SESSION SET parallel_degree_policy = AUTO;
      • Parallel clause object decoration:
          CREATE TABLE t_copy (…) PARALLEL;
          ALTER TABLE t_copy PARALLEL;
  • 9. How to ensure that PDML is used (2) • Refer to the Table „Parallelization Priority Order“ • But test your ETL scenario! • In case of doubt, statement level hints have the highest priority
  • 10. Restrictions preventing PDML
      • No PDML on tables with triggers
      • No PDML with enabled foreign keys. Use RELY foreign key constraints instead (RELY DISABLE NOVALIDATE): they remain valuable for the CBO but are not disruptive for ETL. Exception: reference partitioning!
      • Not enough parallel servers available
      • Parallel DML is not supported on a table with bitmap indexes if the table is not partitioned. IMPORTANT: for Partition Exchange Loading (PEL), don’t create any indexes on the temporary table before loading it!
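The foreign key advice above can be sketched as follows. The table, column and constraint names (sales, customers, cust_id, customers_pk, sales_cust_fk) are hypothetical and only serve as an illustration; the assumption is that the parent key has to be in RELY state before a child constraint can be declared RELY.

```sql
-- Hypothetical names throughout; adjust to your own schema.

-- Mark the parent key as RELY so that a RELY foreign key can reference it.
ALTER TABLE customers MODIFY CONSTRAINT customers_pk RELY;

-- Declare the relationship for the optimizer without enforcing it:
-- DISABLE    -> not checked for new rows, so it does not block PDML
-- NOVALIDATE -> existing rows are not checked either
-- RELY       -> the optimizer may trust the declared relationship
ALTER TABLE sales ADD CONSTRAINT sales_cust_fk
  FOREIGN KEY (cust_id) REFERENCES customers (cust_id)
  RELY DISABLE NOVALIDATE;
```

Since the database no longer enforces the relationship, the ETL process itself has to guarantee referential integrity.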
  • 11. Restrictions preventing PDML (2)
      • Distributed transactions, DML on remote DB.
      • Documentation 12.2 states:
      • Indeed, this seems to work but doesn’t really make sense because the DB link is always serial
          SQL> insert /*+ enable_parallel_dml parallel */ into t_sdoc select v.* from V_SDOC@remote_db V
          2929218 rows created.

          SQL> select * from v$pq_sesstat where statistic like 'DML%'
          STATISTIC           LAST_QUERY SESSION_TOTAL     CON_ID
          ------------------- ---------- ------------- ----------
          DML Parallelized             1             5          0
          1 row selected.

          --------------------------------------------------------
          | Id | Operation                           | Name     |
          --------------------------------------------------------
          |  0 | INSERT STATEMENT                    |          |
          |  1 |  PX COORDINATOR                     |          |
          |  2 |   PX SEND QC (RANDOM)               | :TQ10001 |
          |  3 |    LOAD AS SELECT (HYBRID TSM/HWMB) |          |
          |  4 |     OPTIMIZER STATISTICS GATHERING  |          |
          |  5 |      PX RECEIVE                     |          |
          |  6 |       PX SEND ROUND-ROBIN           | :TQ10000 |
          |  7 |        REMOTE                       | V_SDOC   |
          --------------------------------------------------------
  • 12. Implications of PDML
      • The PX coordinator and each PX server work in their own transactions
      • The coordinator then uses a two-phase commit
      • Hence, the user transaction is in a special mode
      • The results of parallel modifications cannot be seen in the same transaction
      • Complex ETL processes relying on transaction integrity could be a problem: no PDML can be used for intermediate steps.
      • A serial direct path INSERT raises the same error, though, so you cannot use the error as a reliable check that PDML was used
          SQL> select count(*) from t_sdoc
          Error at line 0
          ORA-12838: cannot read/modify an object after modifying it in parallel
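The transactional behavior described on this slide can be illustrated with a short sequence. This is a sketch, assuming the target table t_sdoc from the slide and a hypothetical source table t_sdoc_stage.

```sql
-- t_sdoc_stage is a hypothetical source table used for illustration.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ append parallel */ INTO t_sdoc
SELECT * FROM t_sdoc_stage;

-- Reading t_sdoc at this point, inside the same transaction,
-- would raise ORA-12838:
-- SELECT COUNT(*) FROM t_sdoc;

-- The transaction has to end before the loaded rows can be read.
COMMIT;

SELECT COUNT(*) FROM t_sdoc;
```

An ETL step that must read its own uncommitted changes therefore cannot use PDML (or serial direct path insert) for that intermediate step.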
  • 14. Space Management with PDML
      • Multiple concurrent transactions are modifying the same object
      • What should be considered when doing a Parallel Direct Path Insert?
      • Can this lead to excessive extent allocation or tablespace fragmentation?
      • It is helpful to have an idea of what happens behind the scenes.
      • Fortunately, Oracle 12c makes more information visible
          ---------------------------------------------------------------
          | Id | Operation                           | Name            |
          ---------------------------------------------------------------
          |  0 | INSERT STATEMENT                    |                 |
          |  1 |  PX COORDINATOR                     |                 |
          |  2 |   PX SEND QC (RANDOM)               | :TQ10000        |
          |  3 |    LOAD AS SELECT (HYBRID TSM/HWMB) | T_COPY_PARALLEL |
          |  4 |     OPTIMIZER STATISTICS GATHERING  |                 |
          |  5 |      PX BLOCK ITERATOR              |                 |
          |  6 |       TABLE ACCESS FULL             | T_SRC           |
          ---------------------------------------------------------------
  • 15. Uniform vs. System-Allocated Extents
      • Diagram: tablespace Uniform_TBS with Table1; all extents are equally sized, unused space is "inside"
      • Tablespace with uniform extent size
      • The unused space is inside the extent
      • Internal fragmentation
      • Full Table Scans will scan this free space too
      • This free space can be used by conventional inserts
      • But a PDML insert (direct path) starts to fill a new extent every time
  • 16. Uniform vs. System-Allocated Extents
      • Diagram: tablespace Autoallocate_TBS with Table1; different extent sizes (1M, 8M, 64M, 8M, 8M, 8M, 7M), extents can be trimmed
      • Autoallocate
      • 64K, 1M, 8M, 64M (8k block size)
      • If free space is left after loading (> min extent), extent trimming happens and this free space is returned to the tablespace
      • External fragmentation: free space is not contiguous and can potentially be reused if smaller extents are requested
  • 17. High Water Mark Loading (HWM)
      • Diagram: a single server process loading Table1 in the tablespace
      • The server process has exclusive access to the segment (table or partition) and can insert into extents above the HWM
      • After commit the HWM is moved and new data becomes visible
      • Serial or parallel load with PKEY distribution
  • 18. Temp Segment Merge (TSM) Loading
      • Diagram: two PX servers, each loading its own temp segment next to Table1 in the tablespace
      • Each PX server is assigned its own temporary segment and populates it
      • Last extents can be trimmed
      • Temp segments reside in the same tablespace and are merged into the target table by manipulating the extent map on commit
      • Very scalable but at least one extent per PX server
      • Fragmentation possible because of trimming
      • In 12c rarely used when creating partitioned tables
  • 20. High Water Mark Brokering (HWMB)
      • Diagram: two PX servers loading Table1 in the tablespace, coordinated by one HV enqueue
      • Multiple PX servers may insert into the same extent above the HWM, which should then be “brokered”
      • The brokering is implemented via HV enqueue
      • Results in fewer extents
      • But less scalable
      • Good for loading non-partitioned tables or single partitions
  • 21. High Water Mark Brokering (HWMB)
      • Diagram: PX servers on RAC Instance 1 and RAC Instance 2 loading Table1, all sharing one HV enqueue
      • Scalability can become an issue with high DOP, especially in a RAC environment
  • 22. Hybrid TSM/HWMB
      • Diagram: PX servers on RAC Instance 1 and RAC Instance 2, one temp segment with its own HV enqueue per instance, next to Table1 in the tablespace
      • New in 12.1
      • Each temporary segment has its own HV enqueue which is only used by local PX servers in case of RAC
      • Fewer extents
      • Improved scalability
  • 24. Data Loading Distribution
      • Example:
      • Join two equipartitioned tables T_SRC2 and T_SRC3
      • Hash-partitioned, 64 partitions
      • 32 million rows
          INSERT /*+ append parallel */
          INTO t_tgt_join t0 (OWNER, OBJECT_TYPE, OBJECT_NAME, LVL, FILLER)
          SELECT t1.OWNER, t2.OBJECT_TYPE, t2.OBJECT_NAME, t1.LVL, t1.filler
          FROM t_src3 t1 JOIN t_src2 t2
            ON ( t1.OWNER = t2.OWNER
             AND t1.OBJECT_NAME = t2.OBJECT_NAME
             AND t1.OBJECT_TYPE = t2.OBJECT_TYPE
             AND t1.lvl = t2.lvl);
  • 25. Data Loading Distribution
      • Diagram: one PX set (P001 to P004) reading T1, T2 and redistributing, another PX set joining T1, T2; which set loads is the open question
      • An example of joining two tables in parallel
      • Which PX servers are actually loading the result table?
      • The same ones that are doing the join?
      • Another PX set? Should the data then be redistributed again?
      • This is where data loading distribution matters
  • 26. Data Loading Distribution
      • Since 11.2 the hint PQ_DISTRIBUTE can be used to control load distribution
      • NONE: no distribution, load is performed by the same PX servers
      • PARTITION: distribution based on partitioning of the target table
      • RANDOM: round-robin distribution, useful for highly skewed data
      • RANDOM_LOCAL: round-robin for PX servers on the same RAC instance
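Of the four distribution methods listed above, RANDOM_LOCAL is the only one not shown with a plan on the following slides. A sketch of how the hint would be written, reusing the join example from slide 24 (tables t_tgt_join, t_src2, t_src3), might look like this. No plan or result is claimed here; whether the optimizer honors the hint should be verified in the execution plan.

```sql
-- Sketch: same statement as the slide 24 example, with the load
-- distribution method requested via PQ_DISTRIBUTE on the target alias t0.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ append parallel pq_distribute(t0 random_local) */
INTO t_tgt_join t0 (OWNER, OBJECT_TYPE, OBJECT_NAME, LVL, FILLER)
SELECT t1.OWNER, t2.OBJECT_TYPE, t2.OBJECT_NAME, t1.LVL, t1.filler
FROM   t_src3 t1
JOIN   t_src2 t2
  ON ( t1.OWNER       = t2.OWNER
   AND t1.OBJECT_NAME = t2.OBJECT_NAME
   AND t1.OBJECT_TYPE = t2.OBJECT_TYPE
   AND t1.lvl         = t2.lvl );

COMMIT;
```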
  • 27. Data Loading Distribution - PARTITION
          INSERT /*+ append parallel pq_distribute (t0 partition) */
          INTO t_tgt_join t0
          SELECT /*+ pq_distribute (t2 none none) */ t1…, t2…
          FROM t_src3 t1 JOIN t_src2 t2 ON ( ...);

          ----------------------------------------------------------------
          | Id | Operation                           | Name     | TQ    |
          ----------------------------------------------------------------
          |  0 | INSERT STATEMENT                    |          |       |
          |  1 |  PX COORDINATOR                     |          |       |
          |  2 |   PX SEND QC (RANDOM)               | :TQ10001 | Q1,01 |
          |  3 |    LOAD AS SELECT (HIGH WATER MARK) |          | Q1,01 |
          |  4 |     OPTIMIZER STATISTICS GATHERING  |          | Q1,01 |
          |  5 |      PX RECEIVE                     |          | Q1,01 |
          |  6 |       PX SEND PARTITION (KEY)       | :TQ10000 | Q1,00 |
          |  7 |        PX PARTITION HASH ALL        |          | Q1,00 |
          |* 8 |         HASH JOIN                   |          | Q1,00 |
          |  9 |          TABLE ACCESS FULL          | T_SRC2   | Q1,00 |
          | 10 |          TABLE ACCESS FULL          | T_SRC3   | Q1,00 |
          ----------------------------------------------------------------
  • 28. Data Loading Distribution - NONE
          INSERT /*+ append parallel pq_distribute (t0 none) */
          INTO t_tgt_join t0
          SELECT /*+ pq_distribute (t2 none none) */ t1…, t2…
          FROM t_src3 t1 JOIN t_src2 t2 ON ( ...);

          -------------------------------------------------------------------------
          | Id | Operation                                    | Name     | TQ    |
          -------------------------------------------------------------------------
          |  0 | INSERT STATEMENT                             |          |       |
          |  1 |  PX COORDINATOR                              |          |       |
          |  2 |   PX SEND QC (RANDOM)                        | :TQ10000 | Q1,00 |
          |  3 |    LOAD AS SELECT (HIGH WATER MARK BROKERED) |          | Q1,00 |
          |  4 |     OPTIMIZER STATISTICS GATHERING           |          | Q1,00 |
          |  5 |      PX PARTITION HASH ALL                   |          | Q1,00 |
          |* 6 |       HASH JOIN                              |          | Q1,00 |
          |  7 |        TABLE ACCESS FULL                     | T_SRC2   | Q1,00 |
          |  8 |        TABLE ACCESS FULL                     | T_SRC3   | Q1,00 |
          -------------------------------------------------------------------------
  • 29. Data Loading Distribution - RANDOM
          INSERT /*+ append parallel pq_distribute (t0 random) */
          INTO t_tgt_join t0
          SELECT /*+ pq_distribute (t2 none none) */ t1…, t2…
          FROM t_src3 t1 JOIN t_src2 t2 ON ( ...);

          -------------------------------------------------------------------------
          | Id | Operation                                    | Name     | TQ    |
          -------------------------------------------------------------------------
          |  0 | INSERT STATEMENT                             |          |       |
          |  1 |  PX COORDINATOR                              |          |       |
          |  2 |   PX SEND QC (RANDOM)                        | :TQ10001 | Q1,01 |
          |  3 |    LOAD AS SELECT (HIGH WATER MARK BROKERED) |          | Q1,01 |
          |  4 |     OPTIMIZER STATISTICS GATHERING           |          | Q1,01 |
          |  5 |      PX RECEIVE                              |          | Q1,01 |
          |  6 |       PX SEND ROUND-ROBIN                    | :TQ10000 | Q1,00 |
          |  7 |        PX PARTITION HASH ALL                 |          | Q1,00 |
          |* 8 |         HASH JOIN                            |          | Q1,00 |
          |  9 |          TABLE ACCESS FULL                   | T_SRC2   | Q1,00 |
          | 10 |          TABLE ACCESS FULL                   | T_SRC3   | Q1,00 |
          -------------------------------------------------------------------------
  • 30. Data Loading Distribution - RANDOM
          INSERT /*+ append parallel pq_distribute (t0 random) */
          INTO t_tgt_join t0
          SELECT t1…, t2…
          FROM t_src3 t1 JOIN t_src2 t2 ON ( ...);

          -------------------------------------------------------------------------
          | Id | Operation                                    | Name     | TQ    |
          -------------------------------------------------------------------------
          |  0 | INSERT STATEMENT                             |          |       |
          |  1 |  PX COORDINATOR                              |          |       |
          |  2 |   PX SEND QC (RANDOM)                        | :TQ10003 | Q1,03 |
          |  3 |    LOAD AS SELECT (HIGH WATER MARK BROKERED) |          | Q1,03 |
          |  4 |     OPTIMIZER STATISTICS GATHERING           |          | Q1,03 |
          |  5 |      PX RECEIVE                              |          | Q1,03 |
          |  6 |       PX SEND ROUND-ROBIN                    | :TQ10002 | Q1,02 |
          |* 7 |        HASH JOIN BUFFERED                    |          | Q1,02 |
          |  8 |         PART JOIN FILTER CREATE              | :BF0000  | Q1,02 |
          |  9 |          PX RECEIVE                          |          | Q1,02 |
          | 10 |           PX SEND HYBRID HASH                | :TQ10000 | Q1,00 |
          | 11 |            STATISTICS COLLECTOR              |          | Q1,00 |
          | 12 |             PX BLOCK ITERATOR                |          | Q1,00 |
          |*13 |              TABLE ACCESS FULL               | T_SRC2   | Q1,00 |
          | 14 |         PX RECEIVE                           |          | Q1,02 |
          | 15 |          PX SEND HYBRID HASH                 | :TQ10001 | Q1,01 |
          | 16 |           PX BLOCK ITERATOR                  |          | Q1,01 |
          |*17 |            TABLE ACCESS FULL                 | T_SRC3   | Q1,01 |
          -------------------------------------------------------------------------
  • 31. Data Loading Distribution - No PWJ, No Redistribution
          INSERT /*+ append parallel pq_distribute (t0 none) */
          INTO t_tgt_join t0
          SELECT t1…, t2…
          FROM t_src3 t1 JOIN t_src2 t2 ON ( ...);

          -------------------------------------------------------------------------
          | Id | Operation                                    | Name     | TQ    |
          -------------------------------------------------------------------------
          |  0 | INSERT STATEMENT                             |          |       |
          |  1 |  PX COORDINATOR                              |          |       |
          |  2 |   PX SEND QC (RANDOM)                        | :TQ10002 | Q1,02 |
          |  3 |    LOAD AS SELECT (HIGH WATER MARK BROKERED) |          | Q1,02 |
          |  4 |     OPTIMIZER STATISTICS GATHERING           |          | Q1,02 |
          |* 5 |      HASH JOIN                               |          | Q1,02 |
          |  6 |       PART JOIN FILTER CREATE                | :BF0000  | Q1,02 |
          |  7 |        PX RECEIVE                            |          | Q1,02 |
          |  8 |         PX SEND HYBRID HASH                  | :TQ10000 | Q1,00 |
          |  9 |          STATISTICS COLLECTOR                |          | Q1,00 |
          | 10 |           PX BLOCK ITERATOR                  |          | Q1,00 |
          |*11 |            TABLE ACCESS FULL                 | T_SRC2   | Q1,00 |
          | 12 |       PX RECEIVE                             |          | Q1,02 |
          | 13 |        PX SEND HYBRID HASH                   | :TQ10001 | Q1,01 |
          | 14 |         PX BLOCK ITERATOR                    |          | Q1,01 |
          |*15 |          TABLE ACCESS FULL                   | T_SRC3   | Q1,01 |
          -------------------------------------------------------------------------
  • 32. Data Loading Distribution
      • But in the presence of an index the hint is ignored!
      • Even if the index is unusable
      • The distribution is needed again and is causing a buffered hash join
      • High Water Mark (HWM) because of the exclusive access to the segment
          CREATE BITMAP INDEX t_idx_tgt on t_tgt_join (OWNER) LOCAL PARALLEL;

          INSERT /*+ append parallel pq_distribute (t0 none) */ ...

          --------------------------------------------------------------------
          | Id | Operation                             | Name       | TQ    |
          --------------------------------------------------------------------
          |  0 | INSERT STATEMENT                      |            |       |
          |  1 |  PX COORDINATOR                       |            |       |
          |  2 |   PX SEND QC (RANDOM)                 | :TQ10004   | Q1,04 |
          |  3 |    INDEX MAINTENANCE                  | T_TGT_JOIN | Q1,04 |
          |  4 |     PX RECEIVE                        |            | Q1,04 |
          |  5 |      PX SEND RANGE                    | :TQ10003   | Q1,03 |
          |  6 |       LOAD AS SELECT (HIGH WATER MARK)|            | Q1,03 |
          |  7 |        OPTIMIZER STATISTICS GATHERING |            | Q1,03 |
          |  8 |         PX RECEIVE                    |            | Q1,03 |
          |  9 |          PX SEND PARTITION (KEY)      | :TQ10002   | Q1,02 |
          |*10 |           HASH JOIN BUFFERED          |            | Q1,02 |
          | 11 |            PART JOIN FILTER CREATE    | :BF0000    | Q1,02 |
          | 12 |             PX RECEIVE                |            | Q1,02 |
          | 13 |              PX SEND HYBRID HASH      | :TQ10000   | Q1,00 |
          | 14 |               STATISTICS COLLECTOR    |            | Q1,00 |
          | 15 |                PX BLOCK ITERATOR      |            | Q1,00 |
          |*16 |                 TABLE ACCESS FULL     | T_SRC2     | Q1,00 |
          | 17 |            PX RECEIVE                 |            | Q1,02 |
          | 18 |             PX SEND HYBRID HASH       | :TQ10001   | Q1,01 |
          | 19 |              PX BLOCK ITERATOR        |            | Q1,01 |
          |*20 |               TABLE ACCESS FULL       | T_SRC3     | Q1,01 |
          --------------------------------------------------------------------
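Because an index on the target changes the plan this much, one way to apply the advice for Partition Exchange Loading is to load the staging table without indexes and build them afterwards. The following is only a sketch under stated assumptions: all object names (sales, sales_stage, sales_delta_v, sales_stage_cust_bix, cust_id, partition p_2024_06) are hypothetical, and the partitioned target is assumed to have a matching local bitmap index.

```sql
-- Hypothetical names throughout; sales_stage has no indexes at load time.
ALTER SESSION ENABLE PARALLEL DML;

-- 1. Parallel direct path load into the index-free staging table.
INSERT /*+ append parallel */ INTO sales_stage
SELECT * FROM sales_delta_v;

COMMIT;

-- 2. Only now build the index that mirrors the target's local index.
CREATE BITMAP INDEX sales_stage_cust_bix
  ON sales_stage (cust_id) PARALLEL;

-- 3. Swap the loaded segment into the partitioned target table.
ALTER TABLE sales
  EXCHANGE PARTITION p_2024_06 WITH TABLE sales_stage
  INCLUDING INDEXES WITHOUT VALIDATION;
```

The load in step 1 then sees neither index maintenance nor the bitmap index restriction for non-partitioned tables.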
  • 34. Space Management with PDML and MERGE?
      • Extents after first delta loading (~ 3%) with MERGE and INSERT
      • MERGE: 1154 new extents!
          SQL> MERGE /*+ append parallel*/
            2  INTO t_tgt_join t0
            3  USING ( SELECT ...

          ----------------------------------------
          | Id | Operation                       |
          ----------------------------------------
          |  0 | MERGE STATEMENT                 |
          |  1 |  PX COORDINATOR                 |
          |  2 |   PX SEND QC (RANDOM)           |
          |  3 |    MERGE                        |
          |  4 |     PX RECEIVE                  |

          SEGMENT_NAME BLOCKS     CNT
          ------------ ------ -------
          T_TGT_JOIN        8    2113
          ... 13 rows ...
          T_TGT_JOIN      128    4713
          ... 20 rows ...
          T_TGT_JOIN     1024      34
          36 rows selected.
      • INSERT: 60 new extents!
          SQL> INSERT /*+ append parallel */
            2  INTO t_tgt_join t0
            3  SELECT ...

          --------------------------------------------------
          |Id | Operation
          --------------------------------------------------
          | 0 | INSERT STATEMENT
          | 1 |  PX COORDINATOR
          | 2 |   PX SEND QC (RANDOM)
          | 3 |    LOAD AS SELECT (HIGH WATER MARK BROKERED)
          | 4 |     OPTIMIZER STATISTICS GATHERING

          SEGMENT_NAME     BLOCKS       CNT
          ------------ ---------- ---------
          T_TGT_JOIN            8      1024
          T_TGT_JOIN          128      4248
          ... 6 rows ...
          T_TGT_JOIN         1024       139
          9 rows selected.
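The slide does not show the query behind the SEGMENT_NAME / BLOCKS / CNT listings. A plausible way to produce a comparable overview, assuming the table is owned by the current user, is to count extents per extent size in USER_EXTENTS. Running it before and after a load shows how many extents of which size were added.

```sql
-- Assumption: T_TGT_JOIN belongs to the current schema.
-- For a partitioned table this aggregates over all partitions.
SELECT segment_name,
       blocks,              -- extent size in blocks
       COUNT(*) AS cnt      -- number of extents of that size
FROM   user_extents
WHERE  segment_name = 'T_TGT_JOIN'
GROUP  BY segment_name, blocks
ORDER  BY blocks;
```

Many distinct values in the BLOCKS column are a sign of extent trimming, since trimmed extents no longer have one of the standard autoallocate sizes.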
  • 35. MERGE
      • Basically, if PDML is turned on in the session and for the particular statement, MERGE will parallelize both the INSERT and UPDATE operations
      • But there are some differences:
      • No space management decoration is reported in the execution plan
      • Even worse, it always seems to run as Temp Segment Merge.
      • Significantly more extents are created
      • Many of them are trimmed
      • Every load operation starts again with many 64K extents
      • Maybe it’s worth thinking about providing INITIAL and NEXT even for an Autoallocate tablespace
      • Avoid MERGE if you don’t really need it (for example, if you materialize temporary results anyway, as the ODI SCD Type 2 Knowledge Module does, you could then update and insert in two parallel operations).
  • 36. Summary
      • Don’t overuse PDML. Turn it on selectively, only where it makes sense
      • Be careful and double-check that your statements are actually doing PDML
      • Oracle reports the space management strategy for LOAD AS SELECT operations in execution plans from 12.1.0.2, but not for MERGE operations
      • A bloated extent map will have a negative effect on parallel queries
      • In 12c Oracle introduced Hybrid TSM/HWMB, which increases scalability but keeps the number of extents small
      • Don’t create indexes on tables for partition exchange; they can significantly influence the execution plan. Bitmap indexes will even disable PDML!
      • For the most critical loading processes, check the data distribution, which you can influence with the PQ_DISTRIBUTE hint
      • If using MERGE for critical ETL, check the space management behavior
  • 37. Links
      • Oracle Documentation, VLDB Guide, About Parallel DML Operations
      • Nigel Bayliss, Space Management with PDML
      • Randolf Geist, Understanding Parallel Execution - Part 1 and Part 2
      • Randolf Geist, Hash Join Buffered
      • Timur Akhmadeev, PQ_DISTRIBUTE Enhancement
      • Jonathan Lewis, Autoallocate and PX