Tracing Parallel Execution (UKOUG 2006)
Upcoming SlideShare
Loading in...5
×
 

Tracing Parallel Execution (UKOUG 2006)

on

  • 5,342 views

Some of the quirks of tracing the activities of a task which uses PX

Some of the quirks of tracing the activities of a task which uses PX

Statistics

Views

Total Views
5,342
Views on SlideShare
5,338
Embed Views
4

Actions

Likes
2
Downloads
170
Comments
0

1 Embed 4

http://www.slideshare.net 4

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Tracing Parallel Execution (UKOUG 2006) Tracing Parallel Execution (UKOUG 2006) Presentation Transcript

  • Tracing PX Slaves Doug Burns [email_address] http://oracledoug.com
  • Agenda
    • Introduction
    • Test Environment
    • Tracing Parallel Execution
    • Trace Output
    • Consolidation
    • Timing Issues
    • Conclusion
  • Introduction
    • Who (or what) am I ?
      • Scottish
      • Predominantly a DBA
      • Training and Consultancy
    • Current Assignment
      • BSkyB
      • Very Cool Projects and Hardware
      • Less Cool Release Management
    • http://oracledoug.com
      • Blog
  • Introduction
    • Previous Related Papers
      • Suck It Dry - Tuning Parallel Execution
      • How Many Slaves? – PX and the Magic of 2
      • Available at http://oracledoug.com
    • Objectives
      • Answer tracing questions that cropped up
      • Show the detail
    • Why Parallel Execution?
      • Increasing Volumes of Data
      • Increasing User Expectations
      • More Powerful Hardware
    • Parallel Execution (PX) splits a single large task into multiple smaller tasks which are handled by separate processes running concurrently.
      • Full Table Scans
      • Sorts
      • Index Creation, Direct Path inserts etc …
    Introduction
  • Test Environment
    • Example
      • Small Table
        • Easier to wade through trace files
        • Quick execution
        • Not representative of real applications
        • Force PX using Hints
      • TEST_TAB
        • 2,048,000 rows
        • Average row length – 109 bytes
        • 8Kb Blocks/ 32M Extents / 256 Mb Segment
        • No Indexes
  • Test Environment
    • Oracle 9i Release 2
      • The most common database at work
      • Tracing similar to earlier versions
    • Oracle 10g Release 2
      • Principles are the same
      • Improved tracing functionality
    • Single CPU / Single HDD
      • Trying to introduce wait events
  • Test Environment
    • The Query
          • SELECT /*+ parallel(tt1, 2) */
          • MOD(tt1.pk_id, 2), COUNT(*)
          • FROM test_tab tt1
          • GROUP BY MOD(tt1.pk_id,2)
          • ORDER BY MOD(tt1.pk_id,2);
      • Degree of Parallelism (DOP) = 2
      • Two Sets of PX Slaves Requested
        • One set for Full Table Scan
        • One set for Grouping Operation
  • Tracing Parallel Execution
    • Enable Tracing
      • 10g
      • alter session set tracefile_identifier=‘doug_test';
      • exec dbms_session.set_identifier(‘doug_test');
      • exec dbms_monitor.client_id_trace_enable(
      • client_id => ‘doug_test');
      • 9i
      • alter session set tracefile_identifier=‘doug_test';
      • alter session set events '10046 trace name context forever, level 8';
      • tracefile_identifier not as useful for PX
        • Only applies to the Query Co-ordinator (QC) process
        • Makes sense as the PX slave doesn’t ‘belong’ to the session
  • Tracing Parallel Execution
    • Run Query
    • Disable Tracing
      • 10g
      • exec dbms_monitor.client_id_trace_disable(
      • client_id => ‘doug_test');
      • 9i
      • alter session set events '10046 trace name context off';
    • Several Trace files produced
      • Remember – this is for the lowest DOP of 2
  • Tracing Parallel Execution
  • Tracing Parallel Execution
    • user_dump_dest
        • test1020_ora_17239_ doug_test .trc
    • background_dump_dest
        • test1020_ p000 _17219.trc
        • test1020_ p001 _17221.trc
        • test1020_ p002 _17223.trc
        • test1020_ p003 _17225.trc
      • No mention of tracefile_identifier
      • Slave numbers could be different
  • Tracing Parallel Execution
    • Issues
      • Difficult to get a consolidated view of the user action
      • PX Slaves are shared background processes
        • How to ensure that you’re looking at the correct slaves?
        • How to ensure that you don’t include unrelated activity?
      • 9i - With a little difficulty and rm!
      • 10g – dbms_monitor and trcsess
  • Trace Output – 9i
    • Query Co-ordinator (QC)
      • Unix process pid: 5790, image: oracle@ISP4400.localdomain (TNS V1-V3)
      • *** SESSION ID:(8.283) 2006-09-12 10:54:33.245
      • APPNAME mod='SQL*Plus' mh=3669949024 act='' ah=4029777240
  • Trace Output – 9i
    • Followed by tons of recursive calls
      • Enabling Tracing, CBO work, etc.
      • PARSING IN CURSOR #1 len=130 dep=0 uid=23 oct=3 lid=23 tim=1130912963021298 hv=3017045374 ad='959ffae8'
      • SELECT /*+ parallel(tt1, 2) */
      • MOD(tt1.pk_id, 2), COUNT(*)
      • FROM test_tab1 tt1
      • GROUP BY MOD(tt1.pk_id,2)
      • ORDER BY MOD(tt1.pk_id,2)
      • END OF STMT
      • PARSE #1:c=10998,e=12643,p=0,cr=7,cu=3,mis=1,r=0,dep=0,og=4,tim=1130912963021287
      • WAIT #1: nam='enqueue' ela= 2998428 p1=1413677062 p2=65544 p3=0
  • Trace Output – 9i (QC Contd)
    • Now to set up the slave sets
      • WAIT #1: nam='rdbms ipc reply' ela= 122 p1=5 p2=21474836 p3=0
      • WAIT #1: nam= 'process startup ' ela= 13 p1=80 p2=1 p3=0
      • WAIT #1: nam='process startup' ela= 34572 p1=80 p2=1 p3=1
      • WAIT #1: nam=' PX Deq: Join ACK ' ela= 1983 p1=268500992 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Join ACK' ela= 1479 p1=268500993 p2=1 p3=0
      • WAIT #1: nam='process startup' ela= 19432 p1=80 p2=2 p3=0
      • WAIT #1: nam='process startup' ela= 26803 p1=80 p2=3 p3=0
      • WAIT #1: nam='PX Deq: Join ACK' ela= 2930 p1=268500994 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Join ACK' ela= 1557 p1=268500995 p2=1 p3=0
  • Trace Output – 9i (QC Contd)
    • Set up queues and wait for slaves to parse statements
      • WAIT #1: nam=' PX qref latch ' ela= 4 p1=1 p2=1 p3=-1821541616
      • WAIT #1: nam='PX qref latch' ela= 10385 p1=1 p2=2 p3=-1821541616
      • WAIT #1: nam='PX qref latch' ela= 1182 p1=1 p2=3 p3=-1821541616
      • WAIT #1: nam='PX qref latch' ela= 29903 p1=1 p2=4 p3=-1821541616
      • WAIT #1: nam=' PX Deq: Parse Reply ' ela= 3277 p1=200 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Parse Reply' ela= 1992474 p1=200 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Parse Reply' ela= 960495 p1=200 p2=2 p3=0
  • Trace Output – 9i (QC Contd)
    • Start waiting for responses and take out PX qref latch when communicating
      • WAIT #1: nam=' PX Deq: Execute Reply ' ela= 210184 p1=200 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Execute Reply' ela= 42955 p1=200 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Execute Reply' ela= 1636370 p1=200 p2=1 p3=0
      • WAIT #1: nam=' PX qref latch ' ela= 11128 p1=1 p2=1 p3=-1821543064
      • WAIT #1: nam='PX Deq: Execute Reply' ela= 501343 p1=200 p2=1 p3=0
      • WAIT #1: nam='PX qref latch ' ela= 11011 p1=1 p2=1 p3=-1821541616
      • WAIT #1: nam='PX Deq: Execute Reply' ela= 1424051 p1=200 p2=1 p3=0
  • Trace Output – 9i (QC Contd)
    • Some wrap-up towards the end
      • WAIT #1: nam='PX Deq: Signal ACK' ela= 2 p1=0 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Signal ACK' ela= 20 p1=10 p2=2 p3=0
      • WAIT #1: nam='PX Deq: Signal ACK' ela= 98523 p1=10 p2=3 p3=0
      • FETCH #1:c=1000,e=98953,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=4,tim=1130913033637953
      • WAIT #1: nam='PX Deq: Signal ACK' ela= 2351 p1=200 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Signal ACK' ela= 20 p1=200 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Signal ACK' ela= 200 p1=200 p2=2 p3=0
      • WAIT #1: nam='PX Deq: Signal ACK' ela= 1031 p1=200 p2=1 p3=0
  • Trace Output – 9i
    • Slaves 0 and 1 – Sort/Group
    • Instance name: TEST92
      • Redo thread mounted by this instance: 1
      • Oracle process number: 9
      • Unix process pid: 5794, image: oracle@ISP4400.localdomain (P000)
      • *** SESSION ID:(13.34) 2006-09-12 10:55:19.741
      • <<snipped>
      • PARSING IN CURSOR #1 len=98 dep=0 uid=23 oct=3 lid=23 tim=1130913007567950 hv=4220356981 ad='959d6c34'
      • SELECT /*+ CIV_GB */ A1.C0,COUNT(SYS_OP_CSR(A1.C1,0)) FROM :Q3000 A1 GROUP BY A1.C0 ORDER BY A1.C0
      • END OF STMT
      • PARSE #1:c=11998,e=14410,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=4,tim=1130913007567939
  • Trace Output – 9i (Sort Contd)
    • Cycle round waiting for data to come back from other slave set
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 3 p1=268566527 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 1992813 p1=268566527 p2=2 p3=0
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 1992862 p1=268566527 p2=3 p3=0
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 1993857 p1=268566527 p2=4 p3=0
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 1993834 p1=268566527 p2=5 p3=0
  • Trace Output – 9i (Sort Contd)
    • EXEC #1:c=0,e=248,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=1130913033435536
      • WAIT #1: nam='PX Deq: Table Q Sample' ela= 98589 p1=10 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Table Q Sample' ela= 2042 p1=200 p2=2 p3=0
      • FETCH #1:c=1000,e=101033,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=1130913033536644
      • WAIT #1: nam= 'PX Deq Credit: send blkd ' ela= 511 p1=268566527 p2=1 p3=0
      • FETCH #1:c=0,e=33,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=4,tim=1130913033537700
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 1543 p1=268566527 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 98633 p1=268566527 p2=1 p3=0
  • Trace Output – 9i
    • Slaves 2 and 3 – FTS
      • PARSING IN CURSOR #1 len=208 dep=0 uid=23 oct=3 lid=23 tim=1130913007579832 hv=721990200 ad='959d0330'
      • SELECT /*+ PIV_GB */ MOD(A1.C0,2) C0,SYS_OP_MSR(COUNT(*)) C1 FROM (SELECT /*+ NO_EXPAND ROWID(A2) */ A2.&quot;PK_ID&quot; C0 FROM &quot;TESTUSER&quot;.&quot;TEST_TAB1&quot; PX_GRANULE(0, BLOCK_RANGE, DYNAMIC) A2) A1 GROUP BY MOD(A1.C0,2)
      • END OF STMT
      • PARSE #1:c=9999,e=9226,p=0,cr=1,cu=0,mis=1,r=0,dep=0,og=4,tim=1130913007579821
  • Trace Output – 9i FTS (Contd)
    • Go out and read data (with Direct Path Reads)
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 1992624 p1=268566527 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 994085 p1=268566527 p2=2 p3=0
      • EXEC #1:c=0,e=180,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=1130913010567272
      • WAIT #1: nam= 'direct path read ' ela= 50 p1=5 p2=10 p3=128
      • WAIT #1: nam='direct path read' ela= 14 p1=5 p2=138 p3=128
      • WAIT #1: nam='direct path read' ela= 17 p1=5 p2=266 p3=128
      • <snipped>
      • WAIT #1: nam='direct path read' ela= 32 p1=5 p2=522 p3=128
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 11253 p1=268566527 p2=1 p3=0
  • Trace Output – 9i FTS (Contd)
    • Wrap-up
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 10387 p1=268566527 p2=1 p3=0
      • FETCH #1:c=11410266,e=22810016,p=16263,cr=16486,cu=0,mis=0,r=0,dep=0,og=4,tim=1130913033377363
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 59137 p1=268566527 p2=1 p3=0
      • FETCH #1:c=0,e=86,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=4,tim=1130913033436946
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 20 p1=268566527 p2=1 p3=0
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 389 p1=268566527 p2=2 p3=0
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 101813 p1=268566527 p2=3 p3=0
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 100783 p1=268566527 p2=1 p3=0
  • Trace Output – 10g
    • Query Co-ordinator (QC)
    • Unix process pid: 17239 , image: oracle@ISP4400.localdomain (TNS V1-V3)
    • *** ACTION NAME:() 2006-08-20 10:59:33.605
    • *** MODULE NAME:(SQL*Plus) 2006-08-20 10:59:33.605
    • *** SERVICE NAME:(SYS$USERS) 2006-08-20 10:59:33.605
    • *** CLIENT ID:(doug_test) 2006-08-20 10:59:33.605
    • *** SESSION ID:( 1095.5 ) 2006-08-20 10:59:33.605
    • Followed by tons of recursive calls
      • Enabling Tracing, CBO work, etc.
  • Trace Output – 10g (QC Contd)
    • Essentially the same
    • PARSING IN CURSOR #4 len=129 dep=0 uid=27 oct=3 lid=27 tim=1128972630780520 hv=3411843401 ad='77e67d54'
    • SELECT /*+ parallel(tt1, 2) */ MOD(tt1.pk_id, 2), COUNT(*)
    • FROM test_tab1 tt1
    • GROUP BY MOD(tt1.pk_id,2)
    • ORDER BY MOD(tt1.pk_id,2)
    • END OF STMT
    • PARSE #4:c=267960,e=270534,p=20,cr=447,cu=3,mis=1,r=0,dep=0,og=1,tim=1128972630780508
  • Trace Output – 10g (QC Contd)
    • Note the different timed event format
    • WAIT #4: nam='PX Deq: Execute Reply' ela= 31 sleeptime/senderid=200 passes=1 p3=0 obj#=10907 tim=1129017680317433
    • WAIT #4: nam='PX Deq: Execute Reply' ela= 282420 sleeptime/senderid=200 passes=2 p3=0 obj#=10907 tim=1129017680599950
    • WAIT #4: nam='PX qref latch' ela= 57346 function=1 sleeptime=2042549608 qref=0 obj#=10907 tim=1129017680657526
    • WAIT #4: nam='PX Deq: Execute Reply' ela= 916249 sleeptime/senderid=200 passes=2 p3=0 obj#=10907 tim=1129017681574125
    • WAIT #4: nam='PX qref latch' ela= 1594 function=1 sleeptime=2042549608 qref=0 obj#=10907 tim=1129017681575903
  • Trace Output – 10g
    • Slaves 0 and 1 – Sort / Group
      • Instance name: TEST1020
      • Redo thread mounted by this instance: 1
      • Oracle process number: 12
      • Unix process pid: 17219 , image: oracle@ISP4400.localdomain ( P000 )
      • *** 2006-08-20 10:59:33.938
      • *** SERVICE NAME:(SYS$USERS) 2006-08-20 10:59:33.938
      • *** CLIENT ID:(doug_test) 2006-08-20 10:59:33.938
      • *** SESSION ID:(1090.11) 2006-08-20 10:59:33.938
      • WAIT #0: nam='PX Deq: Msg Fragment' ela= 1611 sleeptime/senderid=268566527 passes=1 p3=0 obj#=-1 tim=1128972630798951
  • Trace Output – 10g (Sort contd)
    • Change to a shared cursor model
      • PARSING IN CURSOR #1 len=129 dep=1 uid=27 oct=3 lid=27 tim=1129017679619468 hv=3411843401 ad='77e66bd4'
      • SELECT /*+ parallel(tt1, 2) */ MOD(tt1.pk_id, 2), COUNT(*)
      • FROM test_tab1 tt1
      • GROUP BY MOD(tt1.pk_id,2)
      • ORDER BY MOD(tt1.pk_id,2)
      • END OF STMT
      • PARSE #1:c=1000,e=731,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1129017679619454
      • WAIT #1: nam='PX qref latch' ela= 52088 function=1 sleeptime=2042551468 qref=0 obj#=-1 tim=1129017679672279
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 1954088 sleeptime/senderid=268566527 passes=1 p3=0 obj#=-1 tim=1129017681626630
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 1954772 sleeptime/senderid=268566527 passes=2 p3=0 obj#=-1 tim=1129017683581534
  • Trace Output – 10g
    • Slaves 2 and 3 - FTS
      • Instance name: TEST1020
      • Redo thread mounted by this instance: 1
      • Oracle process number: 12
      • Unix process pid: 17219 , image: oracle@ISP4400.localdomain ( P000 )
      • *** 2006-08-20 10:59:33.938
      • *** SERVICE NAME:(SYS$USERS) 2006-08-20 10:59:33.938
      • *** CLIENT ID:( doug_test ) 2006-08-20 10:59:33.938
      • *** SESSION ID:(1090.11) 2006-08-20 10:59:33.938
      • WAIT #0: nam='PX Deq: Msg Fragment' ela= 1611 sleeptime/senderid=268566527 passes=1 p3=0 obj#=-1 tim=1128972630798951
  • Trace Output – 10g (FTS Contd)
      • PARSING IN CURSOR #1 len=129 dep=1 uid=27 oct=3 lid=27 tim=1129017679653431 hv=3411843401 ad='77e66bd4'
      • SELECT /*+ parallel(tt1, 2) */ MOD(tt1.pk_id, 2), COUNT(*)
      • FROM test_tab1 tt1
      • GROUP BY MOD(tt1.pk_id,2)
      • ORDER BY MOD(tt1.pk_id,2)
      • END OF STMT
      • PARSE #1:c=1000,e=649,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1129017679653417
      • WAIT #1: nam='PX qref latch' ela= 18705 function=1 sleeptime=2042549980 qref=0 obj#=-1 tim=1129017679672803
      • WAIT #1: nam='PX Deq: Execution Msg' ela= 500 sleeptime/senderid=268566527 passes=1 p3=0 obj#=-1 tim=1129017679673500
      • WAIT #1: nam='direct path read' ela= 46 file number=6 first dba=10 block cnt=118 obj#=10907 tim=1129017679688488
      • WAIT #1: nam='PX qref latch' ela= 11838 function=1 sleeptime=2042549980 qref=0 obj#=10907 tim=1129017679700708
  • Consolidation
    • trcsess utility (10g)
      • Consolidates all trace files associated with a specific session identifier
      • Consolidated trace file looks like any standard trace file
      • Can be processed using tkprof to produce statement-level summary information
      • trcsess output=output_file clientid=session_identifier /oracle/admin/TEST1020/udump/*.trc /oracle/admin/TEST1020/bdump/*.trc
      • tkprof trcsess_output_file output_file sort=prsela,fchela,exeela
      • Output is in cursor order, not strict time order
  • Timing Issues
    • Example Timings
      • Wall Clock elapsed time 13.56 seconds
        • call count cpu elapsed
        • ------- ------ -------- ----------
        • Parse 5 0.01 0.02
        • Execute 5 12.98 51.57
        • Fetch 2 0.01 12.81
        • ------- ------ -------- ----------
        • total 12 13.01 64.41
      • Picking up the wrong trace files?
      • trcsess bug?
  • Timing Issues
    • Wait Time also ‘too high’
    • Event waited on Times Max. Wait Total Waited
    • ---------------------------------------- Waited ---------- ------------
    • PX qref latch 31 0.09 0.76
    • PX Deq: Execution Msg 49 1.95 27.76
    • direct path read 281 0.00 0.00
    • db file sequential read 2 0.00 0.00
    • PX Deq: Parse Reply 2 0.00 0.00
    • SQL*Net message to client 2 0.00 0.00
    • PX Deq: Execute Reply 39 0.92 12.15
    • PX Deq: Table Q qref 1 0.00 0.00
    • SQL*Net message from client 2 0.00 0.00
    • PX Deq: Signal ACK 6 0.00 0.01
    • enq: PS - contention 1 0.00 0.00
    • PX Deq: Table Q Sample 2 0.00 0.00
    • PX Deq Credit: send blkd 3 0.00 0.00
  • Timing Issues
    • This is a consolidated trace file
      • Multiple concurrent processes
      • Some running on multiple CPUs
        • There is more CPU time available than wall clock time
      • Some just waiting
        • Systems have an infinite capacity for waiting
      • Will vary, depending on the actual DOP acquired
        • The DOP acquired could vary
        • How many slaves are available in the pool?
        • parallel_adaptive_multi_user
  • Timing Issues
    • Exactly the same job, but DOP 16
      • Wall Clock elapsed time = 16.60 seconds
        • Slightly higher than 13.56 seconds
        • call count cpu elapsed
        • ------- ------ -------- ----------
        • Parse 33 0.03 0.16
        • Execute 33 13.42 430.24
        • Fetch 2 0.02 12.28
        • ------- ------ -------- ----------
        • total 68 13.48 442.69
      • Much higher elapsed time
      • Couple of seconds additional wait time
  • Conclusion
    • Tracing PX is a little more tricky that tracing serial processes
      • A bit like trying to trace multiple users, or an entire server?
    • 10g dbms_monitor package and trcsess utility help
      • Consolidated timings need to be interpreted differently
    • More papers available at http://oracledoug.com
    • Next presentation – How Many Slaves – Hall 7b at 16:00
  • Tracing PX Slaves Doug Burns [email_address] http://oracledoug.com