
How Many Slaves (Ukoug)


UKOUG version of a presentation trying to establish the sensible limits of parallelism on a couple of hardware configurations. Detailed white paper is at http://oracledoug.com/px_slaves.pdf

Published in: Technology
  • Transcript of "How Many Slaves (Ukoug)"

    1. 1. How Many Slaves? Parallel Execution and the Magic of 2 Doug Burns [email_address] http://oracledoug.com
    2. 2. Introduction <ul><li>Introduction </li></ul><ul><li>What is the Magic of ‘2’? </li></ul><ul><li>What Tests? </li></ul><ul><li>Test Scripts and Tools </li></ul><ul><li>Test Results </li></ul><ul><li>When is a Conclusion … </li></ul>
    3. 3. Introduction <ul><li>Who (or what) am I? </li></ul><ul><ul><li>Scottish </li></ul></ul><ul><ul><li>Predominantly a DBA </li></ul></ul><ul><ul><li>Training and Consultancy </li></ul></ul><ul><li>Current Assignment </li></ul><ul><ul><li>BSkyB </li></ul></ul><ul><ul><li>Very Cool Projects and Hardware </li></ul></ul><ul><ul><li>Less Cool Release Management </li></ul></ul><ul><li>http://oracledoug.com </li></ul><ul><ul><li>Blog </li></ul></ul>
    4. 4. <ul><li>Why Parallel Execution? </li></ul><ul><ul><li>Increasing Volumes of Data </li></ul></ul><ul><ul><li>Increasing User Expectations </li></ul></ul><ul><ul><li>More Powerful Hardware </li></ul></ul><ul><li>Parallel Execution (PX) splits a single large task into multiple smaller tasks which are handled by separate processes running concurrently. </li></ul><ul><ul><li>Full Table Scans </li></ul></ul><ul><ul><li>Sorts </li></ul></ul><ul><ul><li>Index Creation, Direct Path inserts etc … </li></ul></ul>Introduction
    5. 5. <ul><li>Previous Paper </li></ul><ul><ul><li>Suck It Dry – Tuning Parallel Execution </li></ul></ul><ul><ul><li>http://oracledoug.com/px.html (.doc & .pdf) </li></ul></ul><ul><ul><li>Reviewer comments on parallel_max_servers </li></ul></ul><ul><ul><li>Debate about parallel_adaptive_multi_user </li></ul></ul><ul><ul><li>Something about the Magic of ‘2’ </li></ul></ul><ul><ul><li>Talked about Hardware, but nothing specific </li></ul></ul><ul><li>‘Sometimes when faced with a slow I/O subsystem you might find that higher degrees of parallelism are useful because the CPUs are spending more time waiting for I/O to complete’ </li></ul>Introduction
    6. 6. <ul><li>Always set customer expectation levels </li></ul><ul><ul><li>I hope you didn’t come here looking for answers! </li></ul></ul><ul><ul><li>Or lots of detail </li></ul></ul><ul><li>http://oracledoug.com/px_slaves.pdf (or .doc) </li></ul><ul><ul><li>An interesting story, nonetheless </li></ul></ul><ul><ul><li>A framework for your own tests </li></ul></ul><ul><ul><li>A glance at some results </li></ul></ul>Introduction
    7. 7. Introduction <ul><li>Introduction </li></ul><ul><li>What is the Magic of ‘2’? </li></ul><ul><li>What Tests? </li></ul><ul><li>Test Scripts and Tools </li></ul><ul><li>Test Results </li></ul><ul><li>When is a Conclusion … </li></ul>
    8. 8. <ul><li>Batch Queue Management and the Magic of ‘2’ </li></ul><ul><ul><li>Cary Millsap (2000) - available at hotsos.com </li></ul></ul><ul><li>How many batch processes to execute per CPU? </li></ul><ul><ul><li>2. Well, a range of values really, between 1 and 1.8? </li></ul></ul><ul><ul><li>Most recent work expands on this </li></ul></ul><ul><ul><ul><li>CPU-intensive batch jobs per CPU <2 (nearer to 1) </li></ul></ul></ul><ul><ul><ul><li>I/O-intensive batch jobs per CPU >2 </li></ul></ul></ul><ul><ul><ul><li>CPU and I/O request durations are exactly equal (rare) - CPU * 2 </li></ul></ul></ul><ul><ul><li>Misconfiguration could change everything </li></ul></ul><ul><ul><li>What is a batch job anyway? </li></ul></ul>What is The Magic of ‘2’?
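The rule of thumb on this slide can be written down as a small calculation. This is only a sketch of the three cases the slide lists, not Millsap's own method: the function name and the workload labels (`"cpu"`, `"io"`, `"balanced"`) are illustrative assumptions, and the result is a pair of bounds rather than a single recommended number.

```python
from typing import Optional, Tuple


def batch_job_bounds(cpus: int, workload: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (lower, upper) bounds on concurrent batch jobs for a server.

    None means the slide gives no bound on that side.
    The workload labels are hypothetical names for the slide's three cases.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    if workload == "cpu":
        # CPU-intensive: fewer than 2 jobs per CPU, nearer to 1.
        # The upper bound is exclusive.
        return (cpus, 2 * cpus)
    if workload == "io":
        # I/O-intensive: more than 2 jobs per CPU, no upper bound stated.
        return (2 * cpus, None)
    if workload == "balanced":
        # CPU and I/O request durations exactly equal (rare): CPU * 2.
        return (2 * cpus, 2 * cpus)
    raise ValueError(f"unknown workload: {workload!r}")


if __name__ == "__main__":
    for kind in ("cpu", "io", "balanced"):
        print(kind, batch_job_bounds(4, kind))
```

As the slide warns, a misconfigured system can invalidate any of these bounds, so they are a starting point for testing rather than an answer.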
    9. 9. What is The Magic of ‘2’? <ul><li>Oracle 10.2 Docs mention the Magic of ‘2’ </li></ul><ul><li>PARALLEL_THREADS_PER_CPU enables you to adjust for hardware configurations with I/O subsystems that are slow relative to the CPU speed and for application workloads that perform few computations relative to the amount of data involved. </li></ul><ul><li>If the system is neither CPU-bound nor I/O-bound, then the PARALLEL_THREADS_PER_CPU value should be increased. This increases the default DOP and allows better utilization of hardware resources. </li></ul><ul><li>The default for PARALLEL_THREADS_PER_CPU on most platforms is two. However, the default for machines with relatively slow I/O subsystems can be as high as eight. </li></ul>
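To make the effect of PARALLEL_THREADS_PER_CPU concrete, here is a minimal sketch of the default DOP arithmetic. It assumes a single-instance database, where the default DOP is the CPU count multiplied by PARALLEL_THREADS_PER_CPU. The function name is illustrative, and the CPU counts are those of the two SMP test servers described later in the deck.

```python
def default_dop(cpu_count: int, parallel_threads_per_cpu: int = 2) -> int:
    """Default degree of parallelism, assuming a single-instance database."""
    if cpu_count < 1 or parallel_threads_per_cpu < 1:
        raise ValueError("both arguments must be positive")
    return cpu_count * parallel_threads_per_cpu


if __name__ == "__main__":
    # 4-CPU ISP4400 and 12-CPU E10K with the usual default of two threads per CPU
    print(default_dop(4), default_dop(12))
    # The same 4-CPU server if the platform default were eight
    print(default_dop(4, 8))
```

Raising the parameter from two to eight quadruples the default DOP without any change to the hardware, which is why the tests in this deck set the DOP explicitly.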
    10. 10. What Tests? <ul><li>Introduction </li></ul><ul><li>What is the Magic of ‘2’? </li></ul><ul><li>What Tests? </li></ul><ul><li>Test Scripts and Tools </li></ul><ul><li>Test Results </li></ul><ul><li>When is a Conclusion … </li></ul>
    11. 11. <ul><li>What should I test? </li></ul><ul><ul><li>Parallel operations (obviously) </li></ul></ul><ul><ul><li>Multiple CPUs </li></ul></ul><ul><ul><li>I/O infrastructure </li></ul></ul><ul><li>Operating System – Unix / Linux </li></ul><ul><ul><li>Free (as in beer) </li></ul></ul><ul><ul><li>Cross-platform </li></ul></ul><ul><ul><li>Tools and Utilities </li></ul></ul><ul><li>Oracle Version – 10.2 </li></ul><ul><ul><li>The latest and greatest, or common and well-known? </li></ul></ul><ul><ul><li>Boy, that was a good choice. </li></ul></ul><ul><li>Workloads – Keep it simple </li></ul><ul><ul><li>Data! </li></ul></ul><ul><ul><li>CPU vs I/O balance </li></ul></ul>What Tests?
    12. 12. <ul><li>First attempt </li></ul><ul><ul><li>Full Table scan of a 2 million row table </li></ul></ul><ul><ul><ul><li>PCTFREE 90 expanded it to 2.8Gb </li></ul></ul></ul><ul><ul><ul><li>Small enough for all platforms </li></ul></ul></ul><ul><ul><ul><li>Big enough to exercise the I/O subsystem properly </li></ul></ul></ul><ul><ul><ul><li>NOT! EMC took 7 seconds. </li></ul></ul></ul><ul><li>Second attempt </li></ul><ul><ul><li>Full Table scan of 8 million row table </li></ul></ul><ul><ul><ul><li>PCTFREE 90 expanded it to 10Gb </li></ul></ul></ul><ul><ul><ul><li>Too big for the little PC now! (Used 1/8 of the data) </li></ul></ul></ul><ul><ul><ul><li>Solved most problems </li></ul></ul></ul><ul><ul><ul><li>But too I/O intensive (More on this later) </li></ul></ul></ul>What Tests?
    13. 13. <ul><li>Third attempt </li></ul><ul><ul><li>FTS plus a Hash Join and Sort of two 8 million row tables </li></ul></ul><ul><ul><ul><li>PCTFREE 90 expanded them to over 10Gb </li></ul></ul></ul><ul><ul><ul><li>Unsuitable for the PC, used 1/8 data again </li></ul></ul></ul><ul><ul><ul><li>Started to produce more interesting results </li></ul></ul></ul><ul><li>Multi-user tests </li></ul><ul><ul><li>More on these later </li></ul></ul><ul><ul><ul><li>8 new 1 million row tables </li></ul></ul></ul><ul><ul><ul><li>PCTFREE 90 expanded them to 147Mb each </li></ul></ul></ul>What Tests?
    14. 14. What Tests? <ul><li>The Test Process will be much easier if you have </li></ul><ul><ul><li>Enough Time </li></ul></ul><ul><ul><li>Appropriate Hardware </li></ul></ul><ul><ul><li>A Dedicated Assistant </li></ul></ul><ul><ul><li>A Pleasant Working Environment </li></ul></ul><ul><li>Two out of Four ain’t bad … </li></ul>
    15. 15. <ul><li>Intel Single-CPU PC – Tulip PC </li></ul><ul><ul><li>White Box Linux – Kernel 2.6.9 </li></ul></ul><ul><ul><li>1 x 550MHz Pentium 3 </li></ul></ul><ul><ul><li>768Mb RAM </li></ul></ul><ul><ul><li>Single 20Gb IDE </li></ul></ul><ul><li>Intel SMP Server – Intel ISP4400 (SRKA4) </li></ul><ul><ul><li>White Box Linux – Kernel 2.6.9 </li></ul></ul><ul><ul><li>4 x 700MHz Pentium 3 Xeon </li></ul></ul><ul><ul><li>3.5Gb RAM </li></ul></ul><ul><ul><li>4 x Seagate Cheetah U-160 SCSI </li></ul></ul><ul><ul><ul><li>Software RAID-0 (256Kb stripe) </li></ul></ul></ul><ul><ul><li>Separate system/software disk </li></ul></ul><ul><ul><li>Enable/Disable CPUs by editing grub.conf </li></ul></ul><ul><ul><li>£300 on eBay including all HDD and shipping </li></ul></ul>What Tests?
    16. 16. <ul><li>Enterprise SMP server – Sun E10K </li></ul><ul><ul><li>Solaris 8 </li></ul></ul><ul><ul><li>12 x 400MHz SPARC </li></ul></ul><ul><ul><li>12Gb RAM </li></ul></ul><ul><ul><li>EMC Symmetrix 8730 via Brocade SAN </li></ul></ul><ul><ul><li>5 x Hard Disk Slices (Hypers) in RAID 1+0 (960Kb stripe) </li></ul></ul><ul><ul><li>Enable/Disable CPUs using psradm </li></ul></ul><ul><li>Yes, really! </li></ul><ul><ul><li>We had some spare kit kicking around. (Thanks, Mike) </li></ul></ul><ul><li>DBA Lessons </li></ul><ul><ul><li>#1 – Always be nice to System and Storage Administrators </li></ul></ul><ul><ul><li>#2 – Work for companies with a lot of money </li></ul></ul>What Tests?
    17. 17. Test Scripts and Tools <ul><li>Introduction </li></ul><ul><li>What is the Magic of ‘2’? </li></ul><ul><li>What Tests? </li></ul><ul><li>Test Scripts and Tools </li></ul><ul><li>Test Results </li></ul><ul><li>When is a Conclusion … </li></ul>
    18. 18. Test Scripts and Tools <ul><li>init.ora </li></ul><ul><ul><li>Disabled parallel_adaptive_multi_user </li></ul></ul><ul><ul><li>Set parallel_max_servers to 512 </li></ul></ul><ul><ul><ul><li>I forgot to increase this a couple of times </li></ul></ul></ul><ul><ul><li>A stupid mistake in the paper (and an important lesson) </li></ul></ul><ul><ul><ul><li>parallel_max_servers=512 keeps defaulting to 385? </li></ul></ul></ul><ul><ul><ul><li>processes=400! </li></ul></ul></ul><ul><li>Setup scripts </li></ul><ul><ul><li>To be able to recreate environment easily </li></ul></ul><ul><ul><li>setup1.sql – Tablespaces, user account and privs </li></ul></ul><ul><ul><li>setup2.sql – Create two 8 million row / 11Gb tables </li></ul></ul><ul><ul><li>setup3.sql – Create eight 1 million row / 147Mb tables. </li></ul></ul>
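The parallel_max_servers surprise on this slide is consistent with the instance capping the parameter relative to processes. The sketch below reproduces the two numbers from the slide. The headroom of 15 is an assumption inferred from them (400 - 385), not a value taken from the presentation, so check the documentation for your own Oracle version.

```python
# Assumption: headroom inferred from the slide's numbers (processes=400 -> 385).
ASSUMED_HEADROOM = 15


def effective_parallel_max_servers(requested: int, processes: int,
                                   headroom: int = ASSUMED_HEADROOM) -> int:
    """Value the instance would end up with if the request is capped by processes."""
    return max(0, min(requested, processes - headroom))


if __name__ == "__main__":
    # The case from the slide: 512 requested, processes=400
    print(effective_parallel_max_servers(512, 400))
    # With processes raised well above the request, nothing is capped
    print(effective_parallel_max_servers(512, 600))
```

The practical lesson is to check the effective value after startup rather than trusting what was written in init.ora.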
    19. 19. Test Scripts and Tools <ul><li>Test scripts </li></ul><ul><ul><li>To run selected SQL statements consistently across a range of DOPs, unattended. </li></ul></ul><ul><ul><li>rolling.sh – FTS and HJ/Sort against the big tables </li></ul></ul><ul><ul><li>session.sh – HJ/Sort of one big table and one of the smaller tables, accepting a session parameter so that multiple copies can run concurrently </li></ul></ul><ul><ul><li>multi.sh – Harness script that runs session.sh for a given number of users </li></ul></ul>
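The harness idea behind multi.sh (start one copy of session.sh per user, each with its own session number, then wait for all of them) can be sketched as follows. This is not the author's script: the function name is hypothetical, and the demo uses a stand-in Python command so that it runs without an Oracle environment.

```python
import subprocess
import sys
from typing import List, Sequence


def run_concurrent_sessions(base_cmd: Sequence[str], users: int) -> List[int]:
    """Start `users` copies of base_cmd concurrently and return their exit codes.

    Each copy gets its session number (1..users) appended as the last argument.
    """
    procs = [
        subprocess.Popen(list(base_cmd) + [str(n)])
        for n in range(1, users + 1)
    ]
    # Wait for every session to finish before reporting.
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Stand-in for the real per-session script, e.g. ["./session.sh"] (assumed path)
    demo = [sys.executable, "-c", "import sys; print('session', sys.argv[1])"]
    print(run_concurrent_sessions(demo, 3))
```

Launching all the processes before waiting on any of them is what makes the sessions overlap. Waiting inside the launch loop would run them one after another and defeat the purpose of a multi-user test.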
    20. 20. Test Scripts and Tools <ul><li>Information Collection </li></ul><ul><ul><li>Simple log file </li></ul></ul><ul><ul><ul><li>SQL statements </li></ul></ul></ul><ul><ul><ul><li>Output </li></ul></ul></ul><ul><ul><ul><li>Timings </li></ul></ul></ul><ul><ul><ul><li>Autotrace </li></ul></ul></ul><ul><ul><ul><li>v$pq_tqstat query after each statement </li></ul></ul></ul><ul><ul><li>10046 Trace File </li></ul></ul><ul><ul><ul><li>Consolidated version, using client_id and trcsess </li></ul></ul></ul><ul><ul><ul><li>tkprof output too </li></ul></ul></ul><ul><ul><ul><li>Watch the overhead in disk space and trcsess run time! </li></ul></ul></ul><ul><ul><li>System Statistics </li></ul></ul>
    21. 21. Test Scripts and Tools <ul><li>Operating System Statistics </li></ul><ul><ul><li>Resource Usage </li></ul></ul><ul><ul><li>Bottlenecks </li></ul></ul><ul><ul><li>Long-running tests – likely to be a lot of data! </li></ul></ul><ul><li>ORCA/orcallator </li></ul><ul><ul><li>http://www.orcaware.com/orca </li></ul></ul><ul><ul><li>Go for the latest development tarball, which includes </li></ul></ul><ul><ul><ul><li>procallator for Linux statistics collection </li></ul></ul></ul><ul><ul><li>Easy configuration to generate HTML output </li></ul></ul><ul><ul><li>Pretty graphs! </li></ul></ul><ul><ul><li>Lots of them in the paper, but not here. </li></ul></ul>
    22. 22. Test Results <ul><li>Introduction </li></ul><ul><li>What is the Magic of ‘2’? </li></ul><ul><li>What Tests? </li></ul><ul><li>Test Scripts and Tools </li></ul><ul><li>Test Results </li></ul><ul><li>When is a Conclusion … </li></ul>
    23. 23. PC – 1 CPU – 1.3Gb
    24. 24. ISP4400 – 1-4 CPUs – FTS 11Gb
    25. 25. ISP4400 – 1-4 CPUs – HJ 22Gb
    26. 26. E10K – 1-12 CPUs – FTS 11Gb
    27. 27. E10K – 1-12 CPUs – HJ 22Gb
    28. 28. Multi-user Tests <ul><li>First attempt </li></ul><ul><ul><li>Hash Join/Sort statement only </li></ul></ul><ul><ul><li>170Mb Tables – 128,000 rows (PCTFREE 90) </li></ul></ul><ul><ul><li>Between 1 and 12 concurrent users, noparallel to DOP 4 </li></ul></ul><ul><ul><li>Showed how quickly PX response drops off with multiple users </li></ul></ul><ul><ul><li>Then I noticed something strange in the V$PQ_TQSTAT output </li></ul></ul><ul><ul><li>Slaves weren’t doing much work. </li></ul></ul><ul><li>What’s that sound I can hear? </li></ul><ul><ul><li>PCTFREE 90 - lots of disk I/O (largely empty blocks) </li></ul></ul><ul><ul><li>Very small data volumes feeding into later stages of the plan! </li></ul></ul><ul><ul><li>Mmmmm …. Perhaps that doesn’t test the CPUs too well </li></ul></ul>
    29. 29. Multi-user Tests
    30. 30. Doh! <ul><li>If the CPUs weren’t working hard enough on the multi-user tests, then … </li></ul><ul><li>I should re-run the Single User/Volume Tests </li></ul>
    31. 31. Single User Volume Tests II
    32. 32. When is a Conclusion … <ul><li>Introduction </li></ul><ul><li>What is the Magic of ‘2’? </li></ul><ul><li>What Tests? </li></ul><ul><li>Test Scripts and Tools </li></ul><ul><li>Test Results </li></ul><ul><li>When is a Conclusion … </li></ul>
    33. 33. <ul><li>… not a Conclusion? </li></ul><ul><ul><li>When it contains lots of mights, maybes and coulds? </li></ul></ul><ul><ul><li>When you’ve been testing the wrong thing? </li></ul></ul><ul><li>IF you’re the only user of the server and it has more than one CPU and enough disks then </li></ul><ul><ul><li>You should definitely give PX at a DOP of 2 a try </li></ul></ul><ul><ul><li>Benefit from the direct path I/O, not the parallelism? </li></ul></ul><ul><ul><ul><li>_serial_direct_read=true </li></ul></ul></ul><ul><li>Benefits diminish rapidly </li></ul><ul><ul><li>If using an unsuitable disk configuration, like these tests </li></ul></ul><ul><ul><li>Then again, I think a lot of people are </li></ul></ul>When is a Conclusion …
    34. 34. <ul><li>The only way to know for sure is to test your SQL, with your data, across a range of DOPs </li></ul><ul><ul><li>Then choose something below the apparent optimum? </li></ul></ul><ul><li>Parallel Execution loves hardware </li></ul><ul><ul><li>But it’s not just about having loads of kit </li></ul></ul><ul><ul><li>You need to have the right balance of CPU, Memory and I/O bandwidth </li></ul></ul><ul><ul><li>Bottlenecks will become apparent more quickly </li></ul></ul><ul><li>Don’t use it for online </li></ul><ul><ul><li>Unless it’s a handful of users </li></ul></ul><ul><ul><li>With a predictable maximum number of concurrent activities </li></ul></ul><ul><ul><li>Set parallel_adaptive_multi_user to TRUE? (10g default) </li></ul></ul><ul><ul><li>You must explain it to your users! </li></ul></ul>When is a Conclusion …
    35. 35. <ul><li>More things to try </li></ul><ul><ul><li>Bigger stripe widths and filesystem options (DONE) </li></ul></ul><ul><ul><li>Different extent and block sizes (DONE) </li></ul></ul><ul><ul><li>Disk-separated data files and Hash Partitioned Tables </li></ul></ul><ul><ul><li>Hardware RAID </li></ul></ul><ul><ul><li>Different Automatic PGA Management Settings </li></ul></ul><ul><ul><li>Oracle’s Default PX Parameter Values </li></ul></ul><ul><ul><li>Different SQL </li></ul></ul><ul><li>What have I started?!? </li></ul><ul><ul><li>What price an old EMC Symmetrix on eBay? </li></ul></ul><ul><ul><li>Do you think Scottish Power do 3-phase power for domestic customers? </li></ul></ul><ul><ul><li>How will I explain the noise to Housemates and Partner? </li></ul></ul>When is a Conclusion …
    36. 36. <ul><li>The scripts are there </li></ul><ul><ul><li>http://oracledoug.com/px_slaves.doc </li></ul></ul><ul><ul><li>Tailor them to your needs. Improve them! </li></ul></ul><ul><ul><li>Let me know your results – I’m interested. </li></ul></ul><ul><ul><li>Including details of your environment </li></ul></ul><ul><ul><li>Data creation scripts </li></ul></ul><ul><ul><li>Your SQL </li></ul></ul>When is a Conclusion …
    37. 37. How Many Slaves? Parallel Execution and the Magic of 2 Doug Burns [email_address] http://oracledoug.com