SlideShare a Scribd company logo
1 of 44
 Independent consultant
 Performance Troubleshooting
 In-house workshops
 Cost-Based Optimizer
 Performance By Design
 Oracle ACE Director
 Member of OakTable Network
 Parallel Execution introduction
 Major challenges
 Parallel unfriendly examples
 Distribution skew examples
 How to measure distribution of work
 Fixing work distribution skew
 Oracle Database Enterprise Edition includes
the powerful Parallel Execution feature that
allows spreading the processing of a single
SQL statement execution across multiple
worker processes
 The feature is fully integrated into the Cost
Based Optimizer as well as the execution
runtime engine and automatically distributes
the work across the so called Parallel Workers
 Simple generic parallelization example
Task: Compute sum of 8 numbers
1+8=9, 9+7=16, 16+9=25,...
1+8+7+9+6+2+6+3= ???
n=8 numbers, 7 computation steps required
Serial execution: 7 time units
Simple generic parallelization example
4 workers
3+ time units
1 + 8
= 9
9 + 7
= 16
6 + 2
= 8
6 + 3
= 9
9 + 16
= 25
8 + 9
= 17
25 + 17
= 42
Coordinator
Parallel Execution doesn’t mean “work smarter”
You’re actually willing to accept to “work harder”
Could also be called: “Brute force” approach
So with Parallel Execution there
might be the problem that it
doesn’t work “hard enough”
 Two major challenges
 Can the given task be divided into sub-tasks that can
efficiently and independently be processed by the
workers? (“Parallel Unfriendly”)
 Can all assigned workers be kept busy all the time?
 More reasons why Oracle Parallel Execution
might not reduce runtime as expected:
 Parallel DML/DDL gotchas
 “Downgrade” at execution time (less workers
assigned than expected)
 Overhead of Parallel Execution implementation
 Limitations of Parallel Execution implementation
Parallel DML / DDL gotchas
 DML / DDL part can run parallel or serial
 Query part can run parallel or serial
Parallel CTAS but serial query
-------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT | | | | |
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | Q1,01 | P->S | QC (RAND) |
| 3 | LOAD AS SELECT | T4 | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | Q1,01 | PCWP | |
| 5 | PX SEND ROUND-ROBIN| :TQ10000 | | S->P | RND-ROBIN |
|* 6 | HASH JOIN | | | | |
| 7 | TABLE ACCESS FULL| T2 | | | |
| 8 | TABLE ACCESS FULL| T2 | | | |
-------------------------------------------------------------------------
Serial CTAS but parallel query
--------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT | | | | |
| 1 | LOAD AS SELECT | T4 | | | |
| 2 | PX COORDINATOR | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10002 | Q1,02 | P->S | QC (RAND) |
|* 4 | HASH JOIN BUFFERED | | Q1,02 | PCWP | |
| 5 | PX RECEIVE | | Q1,02 | PCWP | |
| 6 | PX SEND HASH | :TQ10000 | Q1,00 | P->P | HASH |
| 7 | PX BLOCK ITERATOR | | Q1,00 | PCWC | |
| 8 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | |
| 9 | PX RECEIVE | | Q1,02 | PCWP | |
| 10 | PX SEND HASH | :TQ10001 | Q1,01 | P->P | HASH |
| 11 | PX BLOCK ITERATOR | | Q1,01 | PCWC | |
| 12 | TABLE ACCESS FULL| T2 | Q1,01 | PCWP | |
--------------------------------------------------------------------------
 Other reasons why Oracle Parallel Execution
might not scale as expected:
 Parallel DML/DDL gotchas
 “Downgrade” at execution time (less workers
assigned than expected)
 Overhead of Parallel Execution implementation
 Limitations of Parallel Execution implementation
“Parallel Forced Serial” Example
------------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | PX COORDINATOR FORCED SERIAL| | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | Q1,03 | P->S | QC (RAND) |
| 3 | HASH UNIQUE | | Q1,03 | PCWP | |
| 4 | PX RECEIVE | | Q1,03 | PCWP | |
| 5 | PX SEND HASH | :TQ10002 | Q1,02 | P->P | HASH |
|* 6 | HASH JOIN BUFFERED | | Q1,02 | PCWP | |
| 7 | PX RECEIVE | | Q1,02 | PCWP | |
| 8 | PX SEND HASH | :TQ10000 | Q1,00 | P->P | HASH |
| 9 | PX BLOCK ITERATOR | | Q1,00 | PCWC | |
| 10 | TABLE ACCESS FULL | T2 | Q1,00 | PCWP | |
| 11 | PX RECEIVE | | Q1,02 | PCWP | |
| 12 | PX SEND HASH | :TQ10001 | Q1,01 | P->P | HASH |
| 13 | PX BLOCK ITERATOR | | Q1,01 | PCWC | |
| 14 | TABLE ACCESS FULL | T2 | Q1,01 | PCWP | |
------------------------------------------------------------------------------
 Two major challenges
 Can the given task be divided into sub-tasks that can
efficiently and independently be processed by the
workers? (“Parallel Unfriendly”)
 Can all assigned workers be kept busy all the time?
select median(id) from t2;
-----------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | SORT GROUP BY | | | | |
| 2 | PX COORDINATOR | | | | |
| 3 | PX SEND QC (RANDOM)| :TQ10000 | Q1,00 | P->S | QC (RAND) |
| 4 | PX BLOCK ITERATOR | | Q1,00 | PCWC | |
| 5 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | |
-----------------------------------------------------------------------
create table t3 parallel
as
select * from t2
where rownum <= 10000000;
-----------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT | | | | |
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ20001 | Q2,01 | P->S | QC (RAND) |
| 3 | LOAD AS SELECT | T3 | Q2,01 | PCWP | |
| 4 | PX RECEIVE | | Q2,01 | PCWP | |
| 5 | PX SEND ROUND-ROBIN | :TQ20000 | | S->P | RND-ROBIN |
|* 6 | COUNT STOPKEY | | | | |
| 7 | PX COORDINATOR | | | | |
| 8 | PX SEND QC (RANDOM) | :TQ10000 | Q1,00 | P->S | QC (RAND) |
|* 9 | COUNT STOPKEY | | Q1,00 | PCWC | |
| 10 | PX BLOCK ITERATOR | | Q1,00 | PCWC | |
| 11 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | |
-----------------------------------------------------------------------------
create table t3 parallel
as select * from (select a.*,
lag(filler, 1) over (order by id) as prev_filler
from t2 a);
--------------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT | | | | |
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ20001 | Q2,01 | P->S | QC (RAND) |
| 3 | LOAD AS SELECT | T3 | Q2,01 | PCWP | |
| 4 | PX RECEIVE | | Q2,01 | PCWP | |
| 5 | PX SEND ROUND-ROBIN | :TQ20000 | | S->P | RND-ROBIN |
| 6 | VIEW | | | | |
| 7 | WINDOW BUFFER | | | | |
| 8 | PX COORDINATOR | | | | |
| 9 | PX SEND QC (ORDER) | :TQ10001 | Q1,01 | P->S | QC (ORDER) |
| 10 | SORT ORDER BY | | Q1,01 | PCWP | |
| 11 | PX RECEIVE | | Q1,01 | PCWP | |
| 12 | PX SEND RANGE | :TQ10000 | Q1,00 | P->P | RANGE |
| 13 | PX BLOCK ITERATOR | | Q1,00 | PCWC | |
| 14 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | |
--------------------------------------------------------------------------------
All these examples have one thing in common:
If the Query Coordinator (non-parallel part)
needs to perform a significant part of the overall
work, Parallel Execution won’t reduce the
runtime as expected
 Two major challenges
 Can the given task be divided into sub-tasks that can
efficiently and independently be processed by the
workers? (“Parallel Unfriendly”)
 Can all assigned workers be kept busy all the time?
 Parallel Execution introduction
 Major challenges
 Parallel unfriendly examples
 Distribution skew examples
 How to measure distribution of work
 Fixing work distribution skew
 Extended SQL tracing
 Generates one trace file per Parallel Worker process
in database server trace directory
 For cross-instance Parallel Execution this means files
spread across more than one server
 Each Parallel Worker trace file lists, among others,
the number of rows produced per plan line and
elapsed time
 Extended SQL tracing
Allows detecting data distribution skew
Usually requires to reproduce the issue
Tedious to collect and skim through dozens of trace
files (no tool known that automates that job)
 DBMS_XPLAN.DISPLAY_CURSOR +
Rowsource statistics
 Based on same internal code instrumentation as
extended SQL trace
 Feeds back rowsource statistics (rows produced on
execution plan line level, timing, I/O stats) into
Library Cache
 Very useful for analyzing serial executions
 DBMS_XPLAN.DISPLAY_CURSOR +
Rowsource statistics
Not enabled by default, need to reproduce
Doesn’t cope with well multiple Parallel Execution
Servers / more complex parallel plans or Cross
Instance RAC execution
No information per Parallel Execution Server
=> Therefore not able to detect distribution skew
 Analyzing Data Distribution Skew
 Oracle since a long time offers a special view called
V$PQ_TQSTAT
 It is populated after a successful Parallel Execution
in the session of the Query Coordinator
 It lists the amount of data send via the “Table
Queues” / “Virtual Tables”
 In theory this allows exact analysis which operations
caused an uneven data distribution
 V$PQ_TQSTAT skewed execution example
DFO_NUMBER TQ_ID SERVER_TYP NUM_ROWS BYTES PROCESS
---------- ---------- ---------- ---------- ---------- ----------
1 0 Producer 500807 54396692 P008
1 0 Producer 499193 54259853 P009
1 0 Consumer 500038 54332349 P010
1 0 Consumer 499962 54324196 P011
1 1 Producer 499826 57267724 P008
1 1 Producer 500174 57357253 P009
1 1 Consumer 499933 57304789 P010
1 1 Consumer 500067 57320188 P011
1 2 Producer 462069 52963939 P010
1 2 Producer 537931 61681312 P011
1 2 Consumer 500212 57346574 P008
1 2 Consumer 499788 57298677 P009
1 3 Producer 500038 57337041 P010
1 3 Producer 499962 57328456 P011
1 3 Consumer 0 48 P008
1 3 Consumer 1000000 114665449 P009
1 4 Producer 0 24 P008
1 4 Producer 1000000 116668401 P009
1 4 Consumer 1000000 116668425 QC
 V$PQ_TQSTAT
Allows detecting data distribution skew
Can only be queried from inside session
Doesn’t cope with well more complex parallel plans
No information about relevance of skew
Easy to identify whether all workers are kept
busy all the time or not
Easy to identify if there was a problem with
work distribution
Shows actual parallel degree used (“Parallel
Downgrade”)
Supports RAC
Reports are not persisted and will be flushed
from memory quite quickly on busy systems
No easy identification and therefore no
systematic troubleshooting which plan
operations cause a work distribution problem
Lacks some precision regarding Parallel
Execution details
 Real-Time SQL Monitoring allows detecting
Parallel Execution skew in the following way:
 In the “Activity” tab
 The average active sessions will be less than the DOP of
the operation, allows to detect both temporal and data
distribution skew
 In the “Parallel” tab
 The DB Time recorded per Parallel Slave will show an
uneven distribution of active work time
 Real-Time SQL Monitoring “Activity” tab –
skewed execution example
 Real-Time SQL Monitoring “Parallel” tab –
skewed execution example
 One very useful approach is using Active
Session History (ASH)
 ASH samples active sessions once a second
 Activity of Parallel Workers over time can
easily be analyzed
 From 11g on the ASH data even contains a
reference to the execution plan line, so a
relation between Parallel Worker activity and
execution plan line based on ASH is possible
 Custom queries on ASH data required for
detailed analysis
 XPLAN_ASH tool runs these queries for a
given SQL_ID
 Advantage of ASH is the availability of
retained historic ASH data via AWR on disk
 Information can be extracted even for SQL
executions as long ago as the retention
configured for AWR
 Parallel Execution introduction
 Major challenges
 Parallel unfriendly examples
 Distribution skew examples
 How to measure distribution of work
 Fixing work distribution skew
Fixing Work Distribution Skew
 Influence Parallel Distribution: Data volume
estimates, PQ_DISTRIBUTE / PQ_[NO]MAP /
PQ_SKEW (12c+) hint
 Limit impact by influencing join order
 Rewrite queries trying to limit or avoid skew
 Remap skewed data changing the value distribution
Fixing Work Distribution Skew
 Automatic join skew detection in 12c (PQ_SKEW)
 Supports only parallel HASH JOINs
 Supports at present only simple probe row sources
(no result of other joins supported)
 Histogram showing popular values required (or
forced via PQ_SKEW hint)
----------------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | SORT AGGREGATE | | | | |
| 2 | PX COORDINATOR | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10002 | Q1,02 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | Q1,02 | PCWP | |
|* 5 | HASH JOIN | | Q1,02 | PCWP | |
| 6 | PX RECEIVE | | Q1,02 | PCWP | |
| 7 | PX SEND HYBRID HASH | :TQ10000 | Q1,00 | P->P | HYBRID HASH|
| 8 | STATISTICS COLLECTOR | | Q1,00 | PCWC | |
| 9 | PX BLOCK ITERATOR | | Q1,00 | PCWC | |
|* 10 | TABLE ACCESS FULL | T_1 | Q1,00 | PCWP | |
| 11 | PX RECEIVE | | Q1,02 | PCWP | |
| 12 | PX SEND HYBRID HASH (SKEW)| :TQ10001 | Q1,01 | P->P | HYBRID HASH|
| 13 | PX BLOCK ITERATOR | | Q1,01 | PCWC | |
|* 14 | TABLE ACCESS FULL | T_2 | Q1,01 | PCWP | |
----------------------------------------------------------------------------------
Fixing Work Distribution Skew
 More information:
 http://oracle-
randolf.blogspot.com/2014/08/parallel-execution-
skew-summary.html
Q & A

More Related Content

Viewers also liked

Viewers also liked (12)

Branding & Corporate Identity Design for PLYTEC by Buzzworks
Branding & Corporate Identity Design for PLYTEC by BuzzworksBranding & Corporate Identity Design for PLYTEC by Buzzworks
Branding & Corporate Identity Design for PLYTEC by Buzzworks
 
DH Pedagogy at LINHD, UNED
DH Pedagogy at LINHD, UNEDDH Pedagogy at LINHD, UNED
DH Pedagogy at LINHD, UNED
 
Trabajo de etica de la diversida
Trabajo de etica de la diversidaTrabajo de etica de la diversida
Trabajo de etica de la diversida
 
ARLD Wessex datapack
ARLD Wessex datapackARLD Wessex datapack
ARLD Wessex datapack
 
Anderson scott ppp_1511_final
Anderson scott ppp_1511_finalAnderson scott ppp_1511_final
Anderson scott ppp_1511_final
 
PSEA Article - Above Beyond - WTHS
PSEA Article - Above  Beyond - WTHSPSEA Article - Above  Beyond - WTHS
PSEA Article - Above Beyond - WTHS
 
How to Setup a Market Cooperation
How to Setup a Market CooperationHow to Setup a Market Cooperation
How to Setup a Market Cooperation
 
第六课 - 男朋友女朋友
第六课 - 男朋友女朋友第六课 - 男朋友女朋友
第六课 - 男朋友女朋友
 
DAS BAUSYMPOSIUM 13 DBS
DAS BAUSYMPOSIUM 13 DBSDAS BAUSYMPOSIUM 13 DBS
DAS BAUSYMPOSIUM 13 DBS
 
Student Work - A. Richardson
Student Work - A. RichardsonStudent Work - A. Richardson
Student Work - A. Richardson
 
Mf0012–taxation management
Mf0012–taxation managementMf0012–taxation management
Mf0012–taxation management
 
Mu0013 hr audit
Mu0013 hr auditMu0013 hr audit
Mu0013 hr audit
 

Similar to Analysing and troubleshooting Parallel Execution IT Tage 2015

Parallel Execution With Oracle Database 12c - Masterclass
Parallel Execution With Oracle Database 12c - MasterclassParallel Execution With Oracle Database 12c - Masterclass
Parallel Execution With Oracle Database 12c - MasterclassIvica Arsov
 
OracleDatabase12cPXNewFeatures_ITOUG_2018.pdf
OracleDatabase12cPXNewFeatures_ITOUG_2018.pdfOracleDatabase12cPXNewFeatures_ITOUG_2018.pdf
OracleDatabase12cPXNewFeatures_ITOUG_2018.pdf7vkx8892hv
 
Oracle Parallel Distribution and 12c Adaptive Plans
Oracle Parallel Distribution and 12c Adaptive PlansOracle Parallel Distribution and 12c Adaptive Plans
Oracle Parallel Distribution and 12c Adaptive PlansFranck Pachot
 
Oracle statistics by example
Oracle statistics by exampleOracle statistics by example
Oracle statistics by exampleMauro Pagano
 
Adaptive Query Optimization in 12c
Adaptive Query Optimization in 12cAdaptive Query Optimization in 12c
Adaptive Query Optimization in 12cAnju Garg
 
Top 10 tips for Oracle performance
Top 10 tips for Oracle performanceTop 10 tips for Oracle performance
Top 10 tips for Oracle performanceGuy Harrison
 
Dbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineersDbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineersRiyaj Shamsudeen
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022Flink Forward
 
Oracle Query Tuning Tips - Get it Right the First Time
Oracle Query Tuning Tips - Get it Right the First TimeOracle Query Tuning Tips - Get it Right the First Time
Oracle Query Tuning Tips - Get it Right the First TimeDean Richards
 
Streaming Data from Scylla to Kafka
Streaming Data from Scylla to KafkaStreaming Data from Scylla to Kafka
Streaming Data from Scylla to KafkaScyllaDB
 
My Experience Using Oracle SQL Plan Baselines 11g/12c
My Experience Using Oracle SQL Plan Baselines 11g/12cMy Experience Using Oracle SQL Plan Baselines 11g/12c
My Experience Using Oracle SQL Plan Baselines 11g/12cNelson Calero
 
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleUnderstanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleGuatemala User Group
 
The Magic of Window Functions in Postgres
The Magic of Window Functions in PostgresThe Magic of Window Functions in Postgres
The Magic of Window Functions in PostgresEDB
 
Change Data Capture in Scylla
Change Data Capture in ScyllaChange Data Capture in Scylla
Change Data Capture in ScyllaScyllaDB
 
Introduction to Parallel Execution
Introduction to Parallel ExecutionIntroduction to Parallel Execution
Introduction to Parallel ExecutionDoug Burns
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!Guido Schmutz
 

Similar to Analysing and troubleshooting Parallel Execution IT Tage 2015 (20)

Hash joins and bloom filters at AMIS25
Hash joins and bloom filters at AMIS25Hash joins and bloom filters at AMIS25
Hash joins and bloom filters at AMIS25
 
Parallel Execution With Oracle Database 12c - Masterclass
Parallel Execution With Oracle Database 12c - MasterclassParallel Execution With Oracle Database 12c - Masterclass
Parallel Execution With Oracle Database 12c - Masterclass
 
OracleDatabase12cPXNewFeatures_ITOUG_2018.pdf
OracleDatabase12cPXNewFeatures_ITOUG_2018.pdfOracleDatabase12cPXNewFeatures_ITOUG_2018.pdf
OracleDatabase12cPXNewFeatures_ITOUG_2018.pdf
 
Oracle Parallel Distribution and 12c Adaptive Plans
Oracle Parallel Distribution and 12c Adaptive PlansOracle Parallel Distribution and 12c Adaptive Plans
Oracle Parallel Distribution and 12c Adaptive Plans
 
Subquery factoring for FTS
Subquery factoring for FTSSubquery factoring for FTS
Subquery factoring for FTS
 
Oracle statistics by example
Oracle statistics by exampleOracle statistics by example
Oracle statistics by example
 
Adaptive Query Optimization in 12c
Adaptive Query Optimization in 12cAdaptive Query Optimization in 12c
Adaptive Query Optimization in 12c
 
Adaptive Query Optimization
Adaptive Query OptimizationAdaptive Query Optimization
Adaptive Query Optimization
 
Top 10 tips for Oracle performance
Top 10 tips for Oracle performanceTop 10 tips for Oracle performance
Top 10 tips for Oracle performance
 
Dbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineersDbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineers
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022
 
Oracle Query Tuning Tips - Get it Right the First Time
Oracle Query Tuning Tips - Get it Right the First TimeOracle Query Tuning Tips - Get it Right the First Time
Oracle Query Tuning Tips - Get it Right the First Time
 
Streaming Data from Scylla to Kafka
Streaming Data from Scylla to KafkaStreaming Data from Scylla to Kafka
Streaming Data from Scylla to Kafka
 
Do You Know The 11g Plan?
Do You Know The 11g Plan?Do You Know The 11g Plan?
Do You Know The 11g Plan?
 
My Experience Using Oracle SQL Plan Baselines 11g/12c
My Experience Using Oracle SQL Plan Baselines 11g/12cMy Experience Using Oracle SQL Plan Baselines 11g/12c
My Experience Using Oracle SQL Plan Baselines 11g/12c
 
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleUnderstanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
 
The Magic of Window Functions in Postgres
The Magic of Window Functions in PostgresThe Magic of Window Functions in Postgres
The Magic of Window Functions in Postgres
 
Change Data Capture in Scylla
Change Data Capture in ScyllaChange Data Capture in Scylla
Change Data Capture in Scylla
 
Introduction to Parallel Execution
Introduction to Parallel ExecutionIntroduction to Parallel Execution
Introduction to Parallel Execution
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 

Recently uploaded

Genshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxGenshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxJohnree4
 
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC  - NANOTECHNOLOGYPHYSICS PROJECT BY MSC  - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC - NANOTECHNOLOGYpruthirajnayak525
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...漢銘 謝
 
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...marjmae69
 
miladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptxmiladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptxCarrieButtitta
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSebastiano Panichella
 
Anne Frank A Beacon of Hope amidst darkness ppt.pptx
Anne Frank A Beacon of Hope amidst darkness ppt.pptxAnne Frank A Beacon of Hope amidst darkness ppt.pptx
Anne Frank A Beacon of Hope amidst darkness ppt.pptxnoorehahmad
 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comsaastr
 
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.KathleenAnnCordero2
 
James Joyce, Dubliners and Ulysses.ppt !
James Joyce, Dubliners and Ulysses.ppt !James Joyce, Dubliners and Ulysses.ppt !
James Joyce, Dubliners and Ulysses.ppt !risocarla2016
 
The 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringThe 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringSebastiano Panichella
 
Event 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxEvent 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxaryanv1753
 
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Krijn Poppe
 
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸mathanramanathan2005
 
SBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSebastiano Panichella
 
Work Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxWork Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxmavinoikein
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Escort Service
 
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power
 
call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@vikas rana
 

Recently uploaded (20)

Genshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxGenshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptx
 
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC  - NANOTECHNOLOGYPHYSICS PROJECT BY MSC  - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
 
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
 
miladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptxmiladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptx
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
 
Anne Frank A Beacon of Hope amidst darkness ppt.pptx
Anne Frank A Beacon of Hope amidst darkness ppt.pptxAnne Frank A Beacon of Hope amidst darkness ppt.pptx
Anne Frank A Beacon of Hope amidst darkness ppt.pptx
 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
 
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
 
James Joyce, Dubliners and Ulysses.ppt !
James Joyce, Dubliners and Ulysses.ppt !James Joyce, Dubliners and Ulysses.ppt !
James Joyce, Dubliners and Ulysses.ppt !
 
The 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringThe 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software Engineering
 
Event 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxEvent 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptx
 
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
 
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
 
Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸
 
SBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation Track
 
Work Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxWork Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptx
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170
 
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
 
call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@
 

Analysing and troubleshooting Parallel Execution IT Tage 2015

  • 1.
  • 2.  Independent consultant  Performance Troubleshooting  In-house workshops  Cost-Based Optimizer  Performance By Design  Oracle ACE Director  Member of OakTable Network
  • 3.  Parallel Execution introduction  Major challenges  Parallel unfriendly examples  Distribution skew examples  How to measure distribution of work  Fixing work distribution skew
  • 4.  Oracle Database Enterprise Edition includes the powerful Parallel Execution feature that allows spreading the processing of a single SQL statement execution across multiple worker processes  The feature is fully integrated into the Cost Based Optimizer as well as the execution runtime engine and automatically distributes the work across the so called Parallel Workers
  • 5.  Simple generic parallelization example Task: Compute sum of 8 numbers 1+8=9, 9+7=16, 16+9=25,... 1+8+7+9+6+2+6+3= ??? n=8 numbers, 7 computation steps required Serial execution: 7 time units
  • 6. Simple generic parallelization example 4 workers 3+ time units 1 + 8 = 9 9 + 7 = 16 6 + 2 = 8 6 + 3 = 9 9 + 16 = 25 8 + 9 = 17 25 + 17 = 42 Coordinator
  • 7. Parallel Execution doesn’t mean “work smarter” You’re actually willing to accept to “work harder” Could also be called: “Brute force” approach
  • 8. So with Parallel Execution there might be the problem that it doesn’t work “hard enough”
  • 9.  Two major challenges  Can the given task be divided into sub-tasks that can efficiently and independently be processed by the workers? (“Parallel Unfriendly”)  Can all assigned workers be kept busy all the time?
  • 10.  More reasons why Oracle Parallel Execution might not reduce runtime as expected:  Parallel DML/DDL gotchas  “Downgrade” at execution time (less workers assigned than expected)  Overhead of Parallel Execution implementation  Limitations of Parallel Execution implementation
  • 11. Parallel DML / DDL gotchas  DML / DDL part can run parallel or serial  Query part can run parallel or serial
  • 12. Parallel CTAS but serial query ------------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | ------------------------------------------------------------------------- | 0 | CREATE TABLE STATEMENT | | | | | | 1 | PX COORDINATOR | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | Q1,01 | P->S | QC (RAND) | | 3 | LOAD AS SELECT | T4 | Q1,01 | PCWP | | | 4 | PX RECEIVE | | Q1,01 | PCWP | | | 5 | PX SEND ROUND-ROBIN| :TQ10000 | | S->P | RND-ROBIN | |* 6 | HASH JOIN | | | | | | 7 | TABLE ACCESS FULL| T2 | | | | | 8 | TABLE ACCESS FULL| T2 | | | | -------------------------------------------------------------------------
  • 13. Serial CTAS but parallel query -------------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | -------------------------------------------------------------------------- | 0 | CREATE TABLE STATEMENT | | | | | | 1 | LOAD AS SELECT | T4 | | | | | 2 | PX COORDINATOR | | | | | | 3 | PX SEND QC (RANDOM) | :TQ10002 | Q1,02 | P->S | QC (RAND) | |* 4 | HASH JOIN BUFFERED | | Q1,02 | PCWP | | | 5 | PX RECEIVE | | Q1,02 | PCWP | | | 6 | PX SEND HASH | :TQ10000 | Q1,00 | P->P | HASH | | 7 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | | 8 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | | | 9 | PX RECEIVE | | Q1,02 | PCWP | | | 10 | PX SEND HASH | :TQ10001 | Q1,01 | P->P | HASH | | 11 | PX BLOCK ITERATOR | | Q1,01 | PCWC | | | 12 | TABLE ACCESS FULL| T2 | Q1,01 | PCWP | | --------------------------------------------------------------------------
  • 14.  Other reasons why Oracle Parallel Execution might not scale as expected:  Parallel DML/DDL gotchas  “Downgrade” at execution time (less workers assigned than expected)  Overhead of Parallel Execution implementation  Limitations of Parallel Execution implementation
  • 15. “Parallel Forced Serial” Example ------------------------------------------------------------------------------ | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | | | 1 | PX COORDINATOR FORCED SERIAL| | | | | | 2 | PX SEND QC (RANDOM) | :TQ10003 | Q1,03 | P->S | QC (RAND) | | 3 | HASH UNIQUE | | Q1,03 | PCWP | | | 4 | PX RECEIVE | | Q1,03 | PCWP | | | 5 | PX SEND HASH | :TQ10002 | Q1,02 | P->P | HASH | |* 6 | HASH JOIN BUFFERED | | Q1,02 | PCWP | | | 7 | PX RECEIVE | | Q1,02 | PCWP | | | 8 | PX SEND HASH | :TQ10000 | Q1,00 | P->P | HASH | | 9 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | | 10 | TABLE ACCESS FULL | T2 | Q1,00 | PCWP | | | 11 | PX RECEIVE | | Q1,02 | PCWP | | | 12 | PX SEND HASH | :TQ10001 | Q1,01 | P->P | HASH | | 13 | PX BLOCK ITERATOR | | Q1,01 | PCWC | | | 14 | TABLE ACCESS FULL | T2 | Q1,01 | PCWP | | ------------------------------------------------------------------------------
  • 16.  Two major challenges  Can the given task be divided into sub-tasks that can efficiently and independently be processed by the workers? (“Parallel Unfriendly”)  Can all assigned workers be kept busy all the time?
  • 17. select median(id) from t2; ----------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | ----------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | | | 1 | SORT GROUP BY | | | | | | 2 | PX COORDINATOR | | | | | | 3 | PX SEND QC (RANDOM)| :TQ10000 | Q1,00 | P->S | QC (RAND) | | 4 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | | 5 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | | -----------------------------------------------------------------------
  • 18. create table t3 parallel as select * from t2 where rownum <= 10000000; ----------------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | ----------------------------------------------------------------------------- | 0 | CREATE TABLE STATEMENT | | | | | | 1 | PX COORDINATOR | | | | | | 2 | PX SEND QC (RANDOM) | :TQ20001 | Q2,01 | P->S | QC (RAND) | | 3 | LOAD AS SELECT | T3 | Q2,01 | PCWP | | | 4 | PX RECEIVE | | Q2,01 | PCWP | | | 5 | PX SEND ROUND-ROBIN | :TQ20000 | | S->P | RND-ROBIN | |* 6 | COUNT STOPKEY | | | | | | 7 | PX COORDINATOR | | | | | | 8 | PX SEND QC (RANDOM) | :TQ10000 | Q1,00 | P->S | QC (RAND) | |* 9 | COUNT STOPKEY | | Q1,00 | PCWC | | | 10 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | | 11 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | | -----------------------------------------------------------------------------
  • 19. create table t3 parallel as select * from (select a.*, lag(filler, 1) over (order by id) as prev_filler from t2 a); -------------------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | -------------------------------------------------------------------------------- | 0 | CREATE TABLE STATEMENT | | | | | | 1 | PX COORDINATOR | | | | | | 2 | PX SEND QC (RANDOM) | :TQ20001 | Q2,01 | P->S | QC (RAND) | | 3 | LOAD AS SELECT | T3 | Q2,01 | PCWP | | | 4 | PX RECEIVE | | Q2,01 | PCWP | | | 5 | PX SEND ROUND-ROBIN | :TQ20000 | | S->P | RND-ROBIN | | 6 | VIEW | | | | | | 7 | WINDOW BUFFER | | | | | | 8 | PX COORDINATOR | | | | | | 9 | PX SEND QC (ORDER) | :TQ10001 | Q1,01 | P->S | QC (ORDER) | | 10 | SORT ORDER BY | | Q1,01 | PCWP | | | 11 | PX RECEIVE | | Q1,01 | PCWP | | | 12 | PX SEND RANGE | :TQ10000 | Q1,00 | P->P | RANGE | | 13 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | | 14 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | | --------------------------------------------------------------------------------
  • 20. All these examples have one thing in common: If the Query Coordinator (non-parallel part) needs to perform a significant part of the overall work, Parallel Execution won’t reduce the runtime as expected
  • 21.  Two major challenges  Can the given task be divided into sub-tasks that can efficiently and independently be processed by the workers? (“Parallel Unfriendly”)  Can all assigned workers be kept busy all the time?
  • 22.
  • 23.
  • 24.  Parallel Execution introduction  Major challenges  Parallel unfriendly examples  Distribution skew examples  How to measure distribution of work  Fixing work distribution skew
  • 25.  Extended SQL tracing  Generates one trace file per Parallel Worker process in database server trace directory  For cross-instance Parallel Execution this means files spread across more than one server  Each Parallel Worker trace file lists, among others, the number of rows produced per plan line and elapsed time
  • 26.  Extended SQL tracing Allows detecting data distribution skew Usually requires to reproduce the issue Tedious to collect and skim through dozens of trace files (no tool known that automates that job)
  • 27.  DBMS_XPLAN.DISPLAY_CURSOR + Rowsource statistics  Based on same internal code instrumentation as extended SQL trace  Feeds back rowsource statistics (rows produced on execution plan line level, timing, I/O stats) into Library Cache  Very useful for analyzing serial executions
  • 28.  DBMS_XPLAN.DISPLAY_CURSOR + Rowsource statistics Not enabled by default, need to reproduce Doesn’t cope with well multiple Parallel Execution Servers / more complex parallel plans or Cross Instance RAC execution No information per Parallel Execution Server => Therefore not able to detect distribution skew
  • 29.  Analyzing Data Distribution Skew  Oracle since a long time offers a special view called V$PQ_TQSTAT  It is populated after a successful Parallel Execution in the session of the Query Coordinator  It lists the amount of data send via the “Table Queues” / “Virtual Tables”  In theory this allows exact analysis which operations caused an uneven data distribution
  • 30.  V$PQ_TQSTAT skewed execution example DFO_NUMBER TQ_ID SERVER_TYP NUM_ROWS BYTES PROCESS ---------- ---------- ---------- ---------- ---------- ---------- 1 0 Producer 500807 54396692 P008 1 0 Producer 499193 54259853 P009 1 0 Consumer 500038 54332349 P010 1 0 Consumer 499962 54324196 P011 1 1 Producer 499826 57267724 P008 1 1 Producer 500174 57357253 P009 1 1 Consumer 499933 57304789 P010 1 1 Consumer 500067 57320188 P011 1 2 Producer 462069 52963939 P010 1 2 Producer 537931 61681312 P011 1 2 Consumer 500212 57346574 P008 1 2 Consumer 499788 57298677 P009 1 3 Producer 500038 57337041 P010 1 3 Producer 499962 57328456 P011 1 3 Consumer 0 48 P008 1 3 Consumer 1000000 114665449 P009 1 4 Producer 0 24 P008 1 4 Producer 1000000 116668401 P009 1 4 Consumer 1000000 116668425 QC
  • 31.  V$PQ_TQSTAT Allows detecting data distribution skew Can only be queried from inside session Doesn’t cope with well more complex parallel plans No information about relevance of skew
  • 32. Easy to identify whether all workers are kept busy all the time or not Easy to identify if there was a problem with work distribution Shows actual parallel degree used (“Parallel Downgrade”) Supports RAC
  • 33. Reports are not persisted and will be flushed from memory quite quickly on busy systems No easy identification and therefore no systematic troubleshooting which plan operations cause a work distribution problem Lacks some precision regarding Parallel Execution details
  • 34.  Real-Time SQL Monitoring allows detecting Parallel Execution skew in the following way:  In the “Activity” tab  The average active sessions will be less than the DOP of the operation, allows to detect both temporal and data distribution skew  In the “Parallel” tab  The DB Time recorded per Parallel Slave will show an uneven distribution of active work time
  • 35.  Real-Time SQL Monitoring “Activity” tab – skewed execution example
  • 36.  Real-Time SQL Monitoring “Parallel” tab – skewed execution example
  • 37.  One very useful approach is using Active Session History (ASH)  ASH samples active sessions once a second  Activity of Parallel Workers over time can easily be analyzed  From 11g on the ASH data even contains a reference to the execution plan line, so a relation between Parallel Worker activity and execution plan line based on ASH is possible
  • 38.  Custom queries on ASH data required for detailed analysis  XPLAN_ASH tool runs these queries for a given SQL_ID  Advantage of ASH is the availability of retained historic ASH data via AWR on disk  Information can be extracted even for SQL executions as long ago as the retention configured for AWR
  • 39.  Parallel Execution introduction  Major challenges  Parallel unfriendly examples  Distribution skew examples  How to measure distribution of work  Fixing work distribution skew
  • 40. Fixing Work Distribution Skew  Influence Parallel Distribution: Data volume estimates, PQ_DISTRIBUTE / PQ_[NO]MAP / PQ_SKEW (12c+) hint  Limit impact by influencing join order  Rewrite queries trying to limit or avoid skew  Remap skewed data changing the value distribution
  • 41. Fixing Work Distribution Skew  Automatic join skew detection in 12c (PQ_SKEW)  Supports only parallel HASH JOINs  Supports at present only simple probe row sources (no result of other joins supported)  Histogram showing popular values required (or forced via PQ_SKEW hint)
  • 42. ---------------------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | | | 1 | SORT AGGREGATE | | | | | | 2 | PX COORDINATOR | | | | | | 3 | PX SEND QC (RANDOM) | :TQ10002 | Q1,02 | P->S | QC (RAND) | | 4 | SORT AGGREGATE | | Q1,02 | PCWP | | |* 5 | HASH JOIN | | Q1,02 | PCWP | | | 6 | PX RECEIVE | | Q1,02 | PCWP | | | 7 | PX SEND HYBRID HASH | :TQ10000 | Q1,00 | P->P | HYBRID HASH| | 8 | STATISTICS COLLECTOR | | Q1,00 | PCWC | | | 9 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | |* 10 | TABLE ACCESS FULL | T_1 | Q1,00 | PCWP | | | 11 | PX RECEIVE | | Q1,02 | PCWP | | | 12 | PX SEND HYBRID HASH (SKEW)| :TQ10001 | Q1,01 | P->P | HYBRID HASH| | 13 | PX BLOCK ITERATOR | | Q1,01 | PCWC | | |* 14 | TABLE ACCESS FULL | T_2 | Q1,01 | PCWP | | ----------------------------------------------------------------------------------
  • 43. Fixing Work Distribution Skew  More information:  http://oracle- randolf.blogspot.com/2014/08/parallel-execution- skew-summary.html
  • 44. Q & A

Editor's Notes

  1. Co-author of “Expert Oracle Practices” Oracle ACE Director: Acknowledged by Oracle for community contributions OakTable Network: Informal and independent group of people believing in a scientific approach towards Oracle
  2. Simple generic parallelization example Possibly additional startup cost: Find available /instruct / coordinate workers Major challenge: Divide task into chunks that can be efficiently and independently processed by workers Wall clock elapsed time for parallel execution can/should be lower than serial execution But potentially more worker units required than serial execution => work harder! Number of worker units assigned matters Too few can be bad Too many can be bad, too Communication between worker units required – data needs to be (re-) distributed (overhead!) Major challenge: Keep all workers busy all the time Parallelization might require different approach
  3. Parallel Execution can only reduce runtime as expected if all workers are kept busy Possibly only a few or a single worker will be active and have to do all the work In this case Parallel Execution can actually be slower than serial execution There is a need to measure how busy the workers are kept Note that this measure doesn’t tell you anything about the efficiency of the actual operation / execution plan But an otherwise efficient Parallel Execution plan can only scale if the expected number of workers is kept busy ideally all the time Note that it says “can scale” – if your system cannot scale the required resources (like I/O) you just end up with more workers waiting
  4. Ideally the data to be processed by the Parallel Execution Servers should be equally distributed, so that all slaves have the same amount of work, start around the same time and complete around the same time, which is illustrated by above image. The four Parallel Execution Servers P000 to P003 all spend similar amount of time processing the data and end around the same time.
  5. But if the data is not evenly distributed, which can happen with almost all kinds of distribution methods (most commonly HASH or RANGE distributions) then a picture like the one above could result. P002 gets assigned a lot more data than the other slaves and hence takes much longer to complete. In the meanwhile the other slaves are done for quite a while with their assigned data and stay idle as the overall operations in many cases can only progress after all slaves are done. This means that the duration of such an operation is determined by the slave taking the longest, and during that period the effective degree of parallelism is potentially far lower than in the ideal case. In this example here for a significant amount of time only P002 will be active, so only a single process instead of the four assigned. More information: https://www.youtube.com/watch?v=vJUE1gz8Cgk
  6. If V$PQ_TQSTAT isn’t that useful how can be measure it then?
  7. If V$PQ_TQSTAT isn’t that useful how can be measure it then?
  8. If V$PQ_TQSTAT isn’t that useful how can be measure it then?
  9. If V$PQ_TQSTAT isn’t that useful how can be measure it then?
  10. One of the challenges of Parallel Execution is the data distribution – if we don’t distribute the work well among the Parallel Slaves, we won’t scale. How can we measure the data distribution (but not necessarily work distribution)? Straightforward using the special view V$PQ_TQSTAT which shows the amount of data distributed for each re-distribution. V$PQ_TQSTAT drawbacks Only populated in the Query Coordinator session, cannot be accessed from outside Only populated after successful completion of queries, cancelling a long-running query or queries failing due to errors (for example, out of TEMP space) prevents the population of the view More complex execution plans using Parallel DML / DDL (for example Parallel Insert or CTAS) and / or involving multiple DFOs often result in an incomplete / useless population of the view
  11. For Table Queue ID = 3, the data was produced uniformly but all re-directed to the same Parallel Slave. You can find the corresponding operation in the execution plan.
  12. If V$PQ_TQSTAT isn’t that useful how can be measure it then?
  13. However, Real-Time SQL Monitoring has the following shortcomings: Monitoring information is not persisted and on busy systems might be flushed from memory pretty quickly (memory size configurable via undocumented parameter) It doesn’t monitor complex execution plans (> 300 lines, configurable via undocumented parameter) It doesn’t indicate which operations of the execution plan caused the skew Besides that it requires Tuning Pack license Funnily, at least the row distribution per execution plan line operation is available But not from the report generated, only by directly querying GV$SQL_PLAN_MONITOR Raw data is not part of the generated report You need to run your own custom query