High Performance PL/SQL



Presentation given at RMOUG , Denver CO, Feb 17-18 2010 and Oracle Sydney meetup Feb 2011

  • This presentation contains a wide variety of PL/SQL performance best practices and tuning techniques, but we shouldn't just pick one of these at random! Instead, our first step should be to identify the most resource-intensive lines of PL/SQL code and start by optimizing that code. To do this, we use the PL/SQL profiler, implemented in the package DBMS_PROFILER. When we surround a program call with START_PROFILER and STOP_PROFILER calls, Oracle collates execution statistics on a line-by-line basis.
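A minimal profiling run might look like the following sketch. MY_PROCEDURE is a placeholder for the code under test, and the profiler tables are assumed to have been created beforehand with Oracle's proftab.sql script.

```sql
-- Sketch of a DBMS_PROFILER run (MY_PROCEDURE is illustrative)
DECLARE
  l_result BINARY_INTEGER;
BEGIN
  l_result := DBMS_PROFILER.start_profiler(run_comment => 'nightly batch run');
  my_procedure;                             -- the code we want to profile
  l_result := DBMS_PROFILER.stop_profiler;  -- stop and flush the statistics
END;
/
```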
  • The profiling data is stored in a collection of tables prefixed with PLSQL_PROFILER. The above is a query that reports the most expensive lines of code (in terms of execution time) in the profiling run.
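Such a report query might be sketched as below; the join to ALL_SOURCE to pull in the source text, and the division of TOTAL_TIME to convert it to seconds, are assumptions to adjust for your environment.

```sql
-- Sketch: ten most expensive lines from the most recent profiler run
SELECT * FROM (
  SELECT u.unit_name, d.line#, s.text,
         ROUND(d.total_time / 1e9, 3) AS time_secs,  -- assumed nanosecond units
         d.total_occur
    FROM plsql_profiler_data  d
    JOIN plsql_profiler_units u
      ON u.runid = d.runid AND u.unit_number = d.unit_number
    LEFT JOIN all_source s
      ON s.owner = u.unit_owner AND s.name = u.unit_name
     AND s.type  = u.unit_type  AND s.line = d.line#
   WHERE d.runid = (SELECT MAX(runid) FROM plsql_profiler_runs)
   ORDER BY d.total_time DESC)
 WHERE ROWNUM <= 10;
```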
  • Much of the time, identifying the most expensive lines of code is sufficient to discover hot spots and tuning opportunities. But on other occasions you need to identify expensive subroutines, or identify the calling routine to understand the context in which a line of code is being executed. To help with these scenarios, Oracle introduced the hierarchical profiler in Oracle 11g. You access this profiler via the DBMS_HPROF package. The START_PROFILING and STOP_PROFILING procedures commence and terminate the profiling run. The output from the profiling session will be written to the external file identified in the START_PROFILING call. If you want to load this file into database tables for analysis, you can do so by using the ANALYZE procedure. In this example, we profile the NIGHTLY_BATCH procedure to an external file hprof_trace.trc that is created in the HPROF_DIR directory. We then load the trace file into the profiling tables using the ANALYZE procedure.
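The run described above might be sketched like this; HPROF_DIR is assumed to be an existing directory object on which the session has read/write privileges.

```sql
-- Sketch of an 11g hierarchical profiler run
BEGIN
  DBMS_HPROF.start_profiling(location => 'HPROF_DIR',
                             filename => 'hprof_trace.trc');
  nightly_batch;                 -- the routine being profiled
  DBMS_HPROF.stop_profiling;
END;
/

DECLARE
  l_runid NUMBER;
BEGIN
  -- Load the trace file into the analysis tables; ANALYZE returns a run id
  l_runid := DBMS_HPROF.analyze(location => 'HPROF_DIR',
                                filename => 'hprof_trace.trc');
  DBMS_OUTPUT.put_line('runid = ' || l_runid);
END;
/
```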
  • There are two ways to analyze the trace file. First, the plshprof command-line utility converts the trace file into an HTML report. We can view the report by pointing our browser at the hprof_report.html file generated by the preceding command. The figure above shows the report.
  • Personally, I find the HTML report a bit hard to interpret and prefer to issue SQL against the profiler tables. In the above example, a hierarchical self-join is issued that exposes the call tree.
  • Consider a scenario in which an application accepts input from the end user, reads some data in the database, decides what statement to execute next, retrieves a result, makes a decision, executes some SQL, and so on. If the application code is written entirely outside of the database, each of these steps would require a network round trip between the database and the application. The time taken to perform these network trips can easily dominate overall user response time. The right-hand diagram shows the sequence of interactions that would be required without a stored procedure approach. On the other hand, if a stored program is used to implement the funds transfer logic, only a single database interaction is required. The stored program takes responsibility for checking balances, withdrawal limits, and so on. The figure on the left illustrates stored procedure network traffic.
  • Network round trips can also become significant when an application is required to perform some kind of aggregate processing on large record sets in the database. If the application needs to (for example) retrieve millions of rows to calculate some sort of business metric that cannot easily be computed using native SQL (average time to complete an order, for example), a large number of round trips can result. In such a case, the network delay can again become the dominant factor in application response time. Performing the calculations in a stored program will reduce network overhead, which might reduce overall response time. The key determining factor will be the network latency.
  • The further the code is (in network terms) from the database, the more the network effects will be magnified. You can't get any closer to the database than being inside the database as PL/SQL.
  • For the majority of PL/SQL programs, the biggest gains will be obtained by tuning the SQL within the PL/SQL. Next most important is usually to ensure that PL/SQL handles data efficiently: by using array fetch and insert, by using the right types of collections, and by following best practices such as associative array lookups and NOCOPY parameters. Thirdly, one should look at the algorithmic efficiency of the PL/SQL code: loop structure, ordering of conditionals, avoiding recursion, and so on. Usually the least gain is obtained by tweaking parameters or data types. Two exceptions are: if PLSQL_OPTIMIZE_LEVEL is less than 2 (10g) or 3 (11g), then increasing it can sometimes lead to significant improvements; and when doing number crunching, efficient data types and native compilation are recommended.
  • Most parameters have little effect, but PLSQL_OPTIMIZE_LEVEL has a major effect on PL/SQL code, similar to the effects of an optimizing compiler in other languages. Level 0: no optimization. Level 1: minor optimizations, not much reorganization. Level 2 (the default): significant reorganization, including loop optimizations and automatic bulk collect. Level 3 (11g only): further optimizations, notably automatic in-lining of subroutines.
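The parameter can be set for the session (it takes effect at compilation time) or for a single unit; MY_PROCEDURE below is illustrative.

```sql
-- Raise the optimization level for subsequent compilations in this session
ALTER SESSION SET plsql_optimize_level = 3;

-- Or recompile one unit at a specific level
ALTER PROCEDURE my_procedure COMPILE plsql_optimize_level = 3;

-- Check the level each stored unit was compiled with
SELECT name, type, plsql_optimize_level
  FROM user_plsql_object_settings;
```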
  • This script shows the proportion of time routines spend in PL/SQL code compared to total execution time. The procedure shown spends only 3% of its time executing PL/SQL code: the vast bulk of time is spent by SQL embedded within the PL/SQL.
  • If you are a TOAD or SQL*Navigator user, consider the Xpert editions or the Development suites. These include SQL Optimizer, which can help you find and tune SQL.
  • PL/SQL can fetch rows from the database one at a time, all at once, or in batches. The more rows fetched in a batch, the lower the CPU and logical read overhead, but the greater the memory requirements. The fourth way is not to retrieve the rows at all; that is, can you do all your work in SQL without pulling the rows into PL/SQL?
  • Prior to 10g, this row-at-a-time pattern is inefficient: excessive loop iterations increase logical reads (rows in the same block are fetched separately). In other languages, this pattern also causes excessive network traffic, but luckily in PL/SQL this is not an issue.
  • Fetching all rows at once selects all data in a single operation. Large result sets might take longer as memory grows; other concurrent sessions may have limited memory for sorts, etc.; and out-of-memory errors are possible. If you are connected via MTS and have Automatic Shared Memory Management (ASMM) or Automatic Memory Management (AMM), you can actually use up all of the PGA and SGA memory.
  • Fetching in batches is considered best: there are never more than p_array_size elements in the collection, giving the best throughput with acceptable memory utilization.
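The batch pattern might be sketched as follows; the CUSTOMERS table and column names are illustrative.

```sql
-- Sketch of BULK COLLECT ... LIMIT: never more than l_array_size rows in memory
DECLARE
  CURSOR c_cust IS SELECT customer_name FROM customers;
  TYPE name_tab_t IS TABLE OF customers.customer_name%TYPE;
  l_names      name_tab_t;
  l_array_size PLS_INTEGER := 1000;
BEGIN
  OPEN c_cust;
  LOOP
    FETCH c_cust BULK COLLECT INTO l_names LIMIT l_array_size;
    EXIT WHEN l_names.COUNT = 0;   -- no more rows
    FOR i IN 1 .. l_names.COUNT LOOP
      NULL;                        -- process l_names(i) here
    END LOOP;
  END LOOP;
  CLOSE c_cust;
END;
/
```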
  • See http://guyharrison.squarespace.com/blog/2009/9/18/mts-amm-bulk-collect-trouble.html
  • Note that this example is also vulnerable to SQL injection. Bind variables enable SQLs that are essentially identical, differing only in parameter values, to be parsed only once and then executed many times. Using bind variables reduces parse overhead (a CPU-intensive operation) and also reduces contention for latches and mutexes that protect shared SQL structures in the shared pool. In most programming languages, we have to go to special effort to use bind variables, and sometimes the convoluted code that results can be tedious to write and hard to maintain. However, in PL/SQL, bind variables are employed automatically: every PL/SQL variable inside a SQL statement is effectively a bind variable and, therefore, PL/SQL programs rarely suffer from the parse overhead and associated latch/mutex contention issues that are all too common in languages such as PHP, Java, or C#. However, when we use dynamic SQL in PL/SQL, this automatic binding does not occur: dynamically constructed SQL in PL/SQL is just a string, and values concatenated into it are treated as literals unless we supply explicit bind placeholders.
  • This implementation defines the bind variable placeholder as :columnValue in the dynamic SQL string. The actual value is provided by the USING clause. Although we still request the parse every time this routine is executed, Oracle will quite possibly find a matching SQL in the shared pool, providing only that the same table and column names have been used in a previous execution.
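A sketch of the bound version follows. The COUNT_ROWS function name is illustrative, and the DBMS_ASSERT calls (which guard the identifiers that cannot be bound) are an addition of this sketch, not part of the original slide.

```sql
-- Sketch: dynamic SQL with a bind placeholder supplied via USING
CREATE OR REPLACE FUNCTION count_rows (p_table  VARCHAR2,
                                       p_column VARCHAR2,
                                       p_value  VARCHAR2)
  RETURN NUMBER
IS
  l_count NUMBER;
BEGIN
  -- Identifiers cannot be bound, so they are sanity-checked instead;
  -- the comparison value is bound through :columnValue
  EXECUTE IMMEDIATE
    'SELECT COUNT(*) FROM ' || DBMS_ASSERT.simple_sql_name(p_table) ||
    ' WHERE ' || DBMS_ASSERT.simple_sql_name(p_column) || ' = :columnValue'
    INTO l_count
    USING p_value;
  RETURN l_count;
END;
/
```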
  • As well as this performance gain, we reduce the chance of mutex contention if other sessions are trying to execute similar routines.
  • Users of other programming languages might be familiar with the concept of passing a parameter by value as opposed to by reference. When we pass a parameter by value, we create a copy of the parameter for the subroutine to use. When we pass the parameter by reference, the subroutine uses the actual variable passed; any changes made to the variable in the subroutine are visible to the calling routine. The NOCOPY directive instructs a PL/SQL function or procedure to use the parameter variable directly, by reference, rather than making a copy. This is an important optimization when passing large PL/SQL collections into a subroutine, because otherwise the process of creating a copy can consume significant resources.
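A minimal sketch of the directive (package and type names are illustrative):

```sql
-- Sketch: NOCOPY passes a large collection by reference.
-- Without it, an IN OUT parameter is copied on entry and again on exit.
CREATE OR REPLACE PACKAGE demo_nocopy IS
  TYPE num_tab_t IS TABLE OF NUMBER INDEX BY PLS_INTEGER;
  PROCEDURE process (p_data IN OUT NOCOPY num_tab_t);
END;
/
CREATE OR REPLACE PACKAGE BODY demo_nocopy IS
  PROCEDURE process (p_data IN OUT NOCOPY num_tab_t) IS
  BEGIN
    FOR i IN 1 .. p_data.COUNT LOOP
      p_data(i) := p_data(i) * 2;  -- operates directly on the caller's collection
    END LOOP;
  END;
END;
/
```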
  • Prior to Oracle 9.2, when seeking a non-numeric value in such a cache, we might use two collections: one of which contained keys and the other of which contained values. For instance, in the following example, we scan through the G_CUST_NAMES table looking for a specific customer name and date of birth. If we find a match, we look in the corresponding index of the G_CUST_IDS table to find the CUSTOMER_ID.
  • Associative arrays offer a more efficient solution. An associative array might be indexed by a non-numeric variable, so we can look up the matching customer name directly. And the code is simpler.
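The technique might be sketched as below; the CUSTOMERS table, its columns, and the key format are illustrative.

```sql
-- Sketch: a VARCHAR2-indexed associative array replaces the sequential scan
DECLARE
  TYPE cust_id_tab_t IS TABLE OF customers.customer_id%TYPE
    INDEX BY VARCHAR2(200);
  g_cust_ids cust_id_tab_t;
  l_id       customers.customer_id%TYPE;
BEGIN
  -- Load the cache, keyed on customer name plus date of birth
  FOR r IN (SELECT customer_id, customer_name, date_of_birth
              FROM customers) LOOP
    g_cust_ids(r.customer_name || ':' ||
               TO_CHAR(r.date_of_birth, 'YYYYMMDD')) := r.customer_id;
  END LOOP;

  -- Direct lookup: no scan required
  IF g_cust_ids.EXISTS('SMITH:19700101') THEN
    l_id := g_cust_ids('SMITH:19700101');
  END IF;
END;
/
```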
  • The LOOP–END LOOP clauses repeatedly execute statements within a loop. Because statements within a loop are executed many times, they can often consume a high proportion of overall execution time. A badly constructed loop can have a drastic effect on performance. The code above illustrates the principle of exiting a loop as early as possible. The code calculates the number of prime numbers less than the number provided as a parameter (P_NUM). It does this by attempting to divide the number by every number smaller than the parameter. I originally wrote this code many years ago when comparing Java and PL/SQL performance. In the first version of this code, I omitted the EXIT statement included as a comment in line 16. The program worked, but performed poorly because it kept looping when it did not need to. NB: PLSQL_OPTIMIZE_LEVEL cannot help here.
  • Loop invariant expressions are those that don't change with each iteration of the loop. For instance, in the following code, the expressions on lines 5 and 7 remain unchanged with each iteration of the loop that begins on line 3. Recalculating these expressions with each iteration of the loop is unnecessary and wastes CPU cycles.
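The general idea might be sketched as follows (variable names and the invariant expression are illustrative):

```sql
-- Sketch: hoisting a loop-invariant expression out of the loop
DECLARE
  l_total  NUMBER := 0;
  l_rate   NUMBER := 0.1;
  l_base   NUMBER := 100;
  l_factor NUMBER;
BEGIN
  -- SQRT(l_base) * (1 + l_rate) does not change per iteration,
  -- so compute it once instead of a million times
  l_factor := SQRT(l_base) * (1 + l_rate);
  FOR i IN 1 .. 1000000 LOOP
    l_total := l_total + i * l_factor;
  END LOOP;
END;
/
```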
  • A recursive routine is one that invokes itself. Recursive routines often offer elegant solutions to complex programming problems, but tend to consume large amounts of memory and to be less efficient than iterative (loop-based) alternatives.
  • Remember, the memory you use comes from a pool of memory shared between sessions. Other sessions might have reduced access to memory.
  • So far, we have used the Oracle NUMBER data type when performing numeric computation. The NUMBER type is extremely flexible and capable of storing both high-precision and high-magnitude numbers. However, this flexibility comes at a cost when performing numeric computations: certain numeric calculations will be faster if a less flexible data type is chosen. In particular, the PLS_INTEGER and SIMPLE_INTEGER data types perform faster than the NUMBER type for computation. Both are signed 32-bit integers, which means that they can store numbers between –2,147,483,648 and 2,147,483,647. SIMPLE_INTEGER is the same as PLS_INTEGER except that it cannot be NULL, and overflows (attempts to store numbers larger than 2,147,483,647, for instance) will not cause exceptions to be raised. SIMPLE_INTEGER can offer a performance advantage when the PL/SQL package is compiled to native code.
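Declaring the right type for a tight computational loop is a one-line change, sketched here:

```sql
-- Sketch: integer types for number crunching
DECLARE
  l_num    NUMBER         := 0;
  l_pls    PLS_INTEGER    := 0;
  l_simple SIMPLE_INTEGER := 0;  -- NOT NULL; wraps instead of raising on overflow
BEGIN
  FOR i IN 1 .. 10000000 LOOP
    l_simple := l_simple + 1;    -- fastest of the three, especially when
                                 -- the unit is compiled to native code
  END LOOP;
END;
/
```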
  • Oracle enables you to create stored procedures in the Java language. Java stored procedures can outperform PL/SQL for number crunching, though the advantages of Java for computation have been steadily decreasing with each release of Oracle. When Java was first introduced, performance gains of anywhere between 10 and 100 times were achievable when rewriting computationally intensive PL/SQL routines in Java. However, improvements in PL/SQL language efficiency, including some of the optimizations outlined in this chapter, have closed the gap. As with all computational optimizations, using Java to optimize number-crunching operations is not generally advisable for PL/SQL programs that are database-intensive. Efforts to optimize math operations for a PL/SQL program that does mainly database operations are probably misdirected. Furthermore, a lot of the advantages that Java had over PL/SQL in previous releases can be overcome in Oracle 11g by using efficient data types (SIMPLE_INTEGER, for example) and native compilation.
  • This example illustrates a drawback of function caching: the query is not strictly deterministic, and the cache should be flushed on a daily basis. Oracle 11g introduced the result set cache, which allows entire result sets to be cached in memory. SQL statements that perform expensive operations on relatively static tables can benefit tremendously from the result set cache. The function cache is a related facility that can benefit PL/SQL routines or SQL statements that call PL/SQL functions. Oracle 11g can store the results of a PL/SQL function in memory and, if the function is expensive to resolve, can retrieve the results for the function from memory more efficiently than by re-executing the function. You might want to use the function cache in the following circumstances: ❏ You have a computationally expensive function that is deterministic: it will always return the same results given the same inputs. ❏ You have a database access routine encapsulated in a function that accesses tables that are infrequently updated.
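A cached lookup function might be sketched as follows; the TRANSACTIONS table and function name are illustrative, and RELIES_ON is the 11g Release 1 syntax for declaring the dependency that invalidates the cache.

```sql
-- Sketch: an 11g function cache on a table-lookup function
CREATE OR REPLACE FUNCTION get_txn_total (p_start DATE, p_end DATE)
  RETURN NUMBER
  RESULT_CACHE RELIES_ON (transactions)  -- cache invalidated when the table changes
IS
  l_total NUMBER;
BEGIN
  SELECT SUM(amount)
    INTO l_total
    FROM transactions
   WHERE txn_date BETWEEN p_start AND p_end;
  RETURN l_total;
END;
/
```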
  • Einstein’s famous equation is encapsulated in the MASS_TO_ENERGY subroutine in lines 1–9; this subroutine is called multiple times from within the EMC2B procedure at line 17. This encapsulation represents good programming practice, especially if Einstein’s equation needs to be called from other routines. (Perhaps this package will be utilized in Larry Ellison’s upcoming intergalactic yacht.) However, the subroutine calls add overhead, and from a performance point of view, it would probably be better to include the equation directly within the calling routine, like this:
  • With 11g in-lining, you can write your code using modularity and encapsulation without paying a performance penalty, because Oracle can automatically move relevant subroutines in-line to improve performance. The optimizer will perform some in-lining automatically if PLSQL_OPTIMIZE_LEVEL=3. If you want to perform in-lining when PLSQL_OPTIMIZE_LEVEL=2 (the default in Oracle 11g), or you want to increase the likelihood of in-lining when PLSQL_OPTIMIZE_LEVEL=3, you can use the PRAGMA INLINE directive.
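The directive precedes the call it applies to, as in this sketch of the EMC2B example (the body shown here is a reconstruction for illustration, not the slide's exact code):

```sql
-- Sketch: requesting in-lining of a specific call at PLSQL_OPTIMIZE_LEVEL = 2
CREATE OR REPLACE PROCEDURE emc2b IS
  e NUMBER;
  FUNCTION mass_to_energy (m NUMBER) RETURN NUMBER IS
  BEGIN
    RETURN m * POWER(299792458, 2);  -- E = mc^2
  END;
BEGIN
  FOR i IN 1 .. 1000000 LOOP
    PRAGMA INLINE (mass_to_energy, 'YES');  -- applies to the next statement
    e := mass_to_energy(i);
  END LOOP;
END;
/
```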
  • Some of these and other topics are covered in Chapter 12 of my book “Oracle Performance Survival Guide”.

    1. High Performance PL/SQL. Guy Harrison, Director of Development, Melbourne. www.guyharrison.net © 2010 Quest Software, Inc. ALL RIGHTS RESERVED
    2. Introductions • Buy Guy’s Book • Buy Quest Products
    3. Agenda • Measuring PL/SQL Performance • PL/SQL Performance advantages • SQL Code • PL/SQL Data handling • PL/SQL code structure • Compilation and datatype tweaks
    4. Measuring PL/SQL performance • DBMS_PROFILER is the best way to find PL/SQL “hot spots”
    5. Scripts at www.guyharrison.net
    6. Toad Profiler support
    7. SQL*Navigator profiler support
    8. 11g Hierarchical profiler: $ plshprof -output hprof demo1.trc
    9. Plshprof output
    10. DBMS_HPROF tables (scripts at www.guyharrison.net)
    11. When is PL/SQL faster? • PL/SQL routines most massively outperform other languages when network round trips are significant.
    12. Network traffic • Routines that process large numbers of rows and return simple aggregates are also candidates for a stored procedure approach
    13. Stored procedure alternative
    14. Network traffic example [bar chart, elapsed time (ms): Java client 313 (local host) / 1,703 (remote host); stored procedure 297 / 344]
    15. Aspects of PL/SQL performance
    16. PLSQL_OPTIMIZE_LEVEL: 0 = no optimization; 1 = minor optimizations (eliminate, but not reorganize); 2 (default) = significantly reorganize code (array fetch, optimize loops); 3 = 11g specific (in-lining, etc.)
    17. (image slide)
    18. It’s usually the SQL • Most PL/SQL routines spend most of their time executing SELECT statements and DML • SQL tuning is a big topic but: – Measure SQL overhead of PL/SQL routines first – Ensure best possible optimizer statistics – Consider adequacy of indexing – Learn how to use DBMS_XPLAN, SQL Trace, etc. – Exploit 10g/11g tuning facilities (if licensed) – Don’t issue SQL when you don’t need to
    19. SQL or PL/SQL? (cachedPlsql.sql at guyharrison.net/opsg)
    20. (image slide)
    21. (image slide)
    22. Three ways of retrieving rows • One at a time • In batches • All at once [diagram: memory requirements rise, and CPU and logical reads fall, as batch size grows]
    23. One at a time
    24. All at once
    25. In batches
    26. Array processing [chart: elapsed time vs. bulk collect size, no bulk collect vs. bulk collect without LIMIT; prior to 10g, or PLSQL_OPTIMIZE_LEVEL < 2]
    27. Array processing • PLSQL_OPTIMIZE_LEVEL > 1 causes transparent BULK COLLECT LIMIT 100 [chart: elapsed time (s) vs. array size, bulk collect without LIMIT vs. no bulk collect; 10g or higher with PLSQL_OPTIMIZE_LEVEL > 1]
    28. BULK COLLECT worst case scenario
    29. Array Insert • No array insert • Insert all in single array
    30. Array Insert Performance [elapsed (s): FOR loop 341.86; FORALL 25.06]
    31. Bind variables in Dynamic SQL • Using bind variables allows sharable SQL, reduces parse overhead and minimizes latch contention • Unlike other languages, PL/SQL uses bind variables transparently – EXCEPT when using Dynamic SQL
    32. Using bind variables
    33. Bind variable performance • 10,000 calls [elapsed time (s): no binds 7.84; bind variables 3.42]
    34. NOCOPY • The NOCOPY clause causes a parameter to be passed “by reference” rather than “by value”
    35. NOCOPY performance gains • 4,000 row, 10 column “table”; 4,000 lookups [elapsed time (s): without NOCOPY 864.96; with NOCOPY 0.28]
    36. Associative arrays • Traditionally, sequential scans of PL/SQL tables are used for caching database table data
    37. Associative arrays • Associative arrays allow for faster and simpler lookups
    38. Associative array performance • 10,000 random customer lookups with 55,000 customers [elapsed time (s): sequential scan 29.79; associative lookups 0.04]
    39. (image slide)
    40. Reduce unnecessary looping • Unnecessary loop iterations burn CPU [elapsed time (s): poorly formed loop 34.31; well formed loop 3.96]
    41. Remove loop invariant terms • Any term in a loop that does not vary should be extracted from the loop • PLSQL_OPTIMIZE_LEVEL > 1 does this automatically
    42. Loop invariant performance improvements [elapsed time (s): original loop 11.09; optimized loop 5.87; plsql_optimize_level=2 5.28]
    43. Recursion (see: recursion) • Recursive routines often offer elegant solutions* • However, deep recursion is memory-intensive and usually not scalable (* known in Australia as “smart-ass solutions”)
    44. Recursion memory overhead [chart: PGA memory (MB) vs. recursive depth, recursive vs. non-recursive]
    45. (image slide)
    46. Number crunching (1) [elapsed time (s), native / interpreted: NUMBER 17.64 / 20.09; PLS_INTEGER 3.83 / 7.06; SIMPLE_INTEGER 0.54 / 5.99]
    47. Number crunching (2) [elapsed time (s): PL/SQL interpreted, NUMBER data type 47.22; PL/SQL interpreted, PLS_INTEGER data type 14.48; PL/SQL compiled, SIMPLE_INTEGER datatype 0.74; Java stored procedure 0.11]
    48. 11g Function cache • Suits deterministic but expensive functions, OR expensive table lookups on non-volatile tables
    49. Function cache performance • 100 executions, random date ranges 1–30 days [elapsed time (s): no function cache 5.21; function cache 1.51]
    50. In-lining: modular design vs. manual in-lining
    51. Automatic in-lining • PLSQL_OPTIMIZE_LEVEL = 3 • OR: PRAGMA INLINE [elapsed time (s): no inlining 4.95; manual inlining 2.56; PRAGMA INLINE 2.54]
    52. Other topics: • Explicit vs. implicit cursors • RETURNING clause • Pipelined functions • Optimizing triggers • Short circuit evaluations • IF and CASE comparison ordering
    53. Thank You – Q&A (Augusta Ada King, Countess of Lovelace)