Top 10 Oracle SQL tuning tips



Design and develop with performance in mind
Establish a tuning environment
Index wisely
Reduce parsing
Take advantage of Cost Based Optimizer
Avoid accidental table scans
Optimize necessary table scans
Optimize joins
Use array processing
Consider PL/SQL for “tricky” SQL

  • All too often, SQL tuning is performed on an application designed and developed with virtually no consideration given to performance requirements. SQL tuning is often the best option for tuning such applications, but it is both more efficient and effective to design an application with performance in mind. Producing good performance by design requires that the following activities occur within the development life-cycle:
    – Explicit specification of the performance characteristics of the system at an early stage of development.
    – Focus on critical transactions and queries during logical and physical modeling. Simulation of these SQL statements on prototype designs can often reveal key performance weaknesses.
    – Effective simulation of the target environment. This can include simulating concurrent user loads and/or acquiring representative data volumes.
    – Continual measurement of critical transaction performance. If possible, this should be integrated with other Quality Assurance measures.
  • It’s not uncommon for SQL that seemed to be working well in the development environment to exhibit poor performance once it is released to a production system. A primary cause of these unpleasant surprises is an inadequate development or volume-testing environment. In particular, environments without realistic or representative data volumes are bound to lead to unrealistic SQL performance. The ideal tuning or development environment is one in which:
    – Data volumes are realistic, or at least proportional. With today’s increasingly large production databases it’s often not possible to duplicate production data volumes exactly. However, it should always be possible to construct a reasonable sub-set of the data. For instance, a 20% sample of a 1GB database may be adequate for performance tuning. At all costs, avoid the situation in which SQL developers are constructing and testing code against empty or almost empty tables - even the proverbial “SQL query from hell” will appear efficient in these environments.
    – Tuning facilities are available. Supply your volume and development environments with as many tuning tools as you have available. This could involve third-party SQL tuning tools, but will at least involve enabling the default Oracle tools. Make sure developers know how to use EXPLAIN PLAN, SQL trace and tkprof, and make sure that relevant database options are set to enable their effective use (see Chapter 7).
    – Documentation is available. This documentation should include the database design, index details, performance requirements and volume metrics. The SQL programmer needs all this information in order to produce efficient SQL.
  • Indexes exist primarily to improve the performance of SQL statements. In many cases, establishing the “best” indexes is the easiest path to high performance.
    – Use concatenated indexes. Try not to use two indexes when one would do. If searching for SURNAME and FIRSTNAME, don’t necessarily create separate indexes for each column. Instead, create a concatenated index on both SURNAME and FIRSTNAME. You can use the “leading” portions of a concatenated index on their own, so if you sometimes query on the SURNAME column without supplying the FIRSTNAME, then SURNAME should come first in the index.
    – Overindex to avoid a table lookup. You can sometimes improve SQL execution by “overindexing” - concatenating columns which appear in the SELECT clause, but not in the WHERE clause, to the index. For instance, imagine we are searching on SURNAME and FIRSTNAME in order to find EMPLOYEE_ID. Our concatenated index on SURNAME and FIRSTNAME will allow us to quickly locate the row containing the appropriate EMPLOYEE_ID, but we will need to access both the index and the table. If there is an index on SURNAME, FIRSTNAME and EMPLOYEE_ID, then the query can be satisfied using the index alone. This technique can be particularly useful when optimizing joins, since intermediate tables in a join are sometimes queried merely to obtain the join key for the next table.
    – Consider advanced indexing options. Oracle’s default B*-tree indexes are flexible and efficient and are suitable for the majority of situations. However, Oracle offers a number of alternative indexing schemes that can improve performance in specific situations. Index clusters allow rows from one or more tables to be stored in cluster key order. Clustering tables can result in a substantial improvement to join performance, but table scans of individual tables in the cluster can be severely degraded, so index clusters are usually only recommended for tables which are always accessed together - and even then, alternatives such as denormalization should be considered. In hash clusters, the key values are translated mathematically to a hash value, and rows are stored in the cluster based on this hash value. Locating a row when the hash key is known may require only a single I/O, rather than the 2-3 I/Os required by an index lookup. However, range scans of the hash key cannot be performed. Furthermore, if the cluster is poorly configured, or if the size of the cluster changes, then the hash keys can overflow or the cluster can become sparsely populated; in the first case, hash key retrieval can degrade, and in the second, table scans become less efficient. Bit-mapped indexes suit queries made against multiple columns which each have relatively few distinct values. They are more compact than a corresponding concatenated index and, unlike the concatenated index, they can support queries in which the columns appear in any combination. However, bit-mapped indexes are not suitable for tables with high modification rates, since locking of bit-mapped indexes occurs at the block, rather than the row, level. In an index-organized table, all table data is stored within a B*-tree index structure. This can improve access to data via the primary key, and reduces the redundancy of storing key values both in the index and in the table. Index-organized tables can be configured with infrequently accessed columns located in an overflow segment, which keeps the index structure relatively small and efficient.
    – Make sure your query uses the best index. Novice SQL programmers are often satisfied providing that the execution plan for their SQL statement uses any index. However, there will sometimes be a choice of indexed retrievals, and the Oracle optimizer - especially the older Rule Based Optimizer - will not always choose the best index. Make sure that the indexes being selected by Oracle are the most appropriate, and use hints (discussed below) to change the index if necessary.
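The concatenated-index and overindexing advice above can be sketched as follows. This is an illustrative example only: the EMPLOYEES table and the index names are hypothetical.

```sql
-- Hypothetical schema: EMPLOYEES(surname, firstname, employee_id, ...)
-- One concatenated index serves both the two-column search and the
-- surname-only search (SURNAME is the leading column):
CREATE INDEX emp_name_ix ON employees (surname, firstname);

-- "Overindexed" variant: adding EMPLOYEE_ID lets the query below be
-- satisfied from the index alone, with no table access at all:
CREATE INDEX emp_name_id_ix ON employees (surname, firstname, employee_id);

SELECT employee_id
  FROM employees
 WHERE surname = 'SMITH'
   AND firstname = 'JOHN';
```

With EMP_NAME_ID_IX in place, the execution plan for the query should show an index-only access with no corresponding TABLE ACCESS step.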
  • “Parsing” an SQL statement is the process of validating the SQL and determining the optimal execution plan. For SQL which has low I/O requirements but is frequently executed (for example, the SQL generated by OLTP-type applications), reducing the overhead of SQL parsing is very important. When an Oracle session needs to parse an SQL statement, it first looks for an identical SQL statement in the Oracle shared pool. If a matching statement cannot be found, then Oracle will determine the optimal execution plan for the statement and store the parsed representation in the shared pool. The process of parsing SQL is CPU intensive; when I/O is well-tuned, the overhead of parsing can be a significant portion of the total cost of executing the SQL. There are a number of ways of reducing this overhead:
    – Use bind variables. Bind variables allow the variable part of a query to be represented by “pointers” to program variables. If you use bind variables, the text of the SQL statement will not change from execution to execution, and Oracle will usually find a match in the shared pool, dramatically reducing parse overhead.
    – Re-use cursors. Cursors (or context areas) are areas of memory that store the parsed representation of SQL statements. If you execute the same SQL statement more than once, then you can re-open an existing cursor and avoid issuing a parse call altogether. The mechanism of re-using cursors varies across development tools and programming languages; Appendix C contains guidelines for specific tools.
    – Use a cursor cache. If your development tool makes it hard or impossible to re-use cursors, you can instruct Oracle to create a cursor “cache” for each session using the SESSION_CACHED_CURSORS init.ora parameter. If SESSION_CACHED_CURSORS is greater than 0, then Oracle will store that number of recently executed cursors in a cache. If an SQL statement is re-executed, it may be found in the cache and a parse call avoided.
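As a minimal sketch of the bind-variable point, the CUSTOMERS table and the :cust_id variable here are illustrative:

```sql
-- Literal values make each statement textually unique, so each one
-- incurs its own parse:
SELECT * FROM customers WHERE customer_id = 1001;
SELECT * FROM customers WHERE customer_id = 1002;

-- With a bind variable, the statement text never changes, so Oracle
-- finds the parsed form in the shared pool on re-execution:
SELECT * FROM customers WHERE customer_id = :cust_id;
```

For the cursor cache, a modest init.ora setting such as SESSION_CACHED_CURSORS = 20 is typically a reasonable starting point.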
  • The component of the Oracle software that determines the execution plan for an SQL statement is known as the optimizer. Oracle supports two approaches to query optimization. The rule based optimizer determines the execution plan based on a set of rules which rank various access paths. For instance, an index-based retrieval has a lower rank than a full table scan, so the rule based optimizer will use indexes wherever possible. The cost based optimizer determines the execution plan based on an estimate of the computer resources (“the cost”) required to satisfy various access methods. It uses statistics, including the number of rows in a table and the number of distinct values in indexes, to determine this optimum plan. Early experiences with the cost based optimizer in Oracle7 were often disappointing and gave it a poor reputation in some quarters. However, the cost based optimizer has been improving in each release, while the rule based optimizer is virtually unchanged since Oracle 7.0. Many advanced SQL access methods, such as star and hash joins, are only available when you use the cost based optimizer. The cost based optimizer is the best choice for almost all new projects, and converting from rule to cost based optimization will be worthwhile for many existing projects. Consider the following guidelines for getting the most from the cost based optimizer:
    – Optimizer mode. The default mode of the cost based optimizer (OPTIMIZER_MODE=CHOOSE) will attempt to optimize throughput - the time taken to retrieve all rows - and will often favor full table scans over index lookups. When converting to cost based optimization, many users are disappointed to find that previously well-tuned index lookups change to long-running table scans. To avoid this, set OPTIMIZER_MODE=FIRST_ROWS in init.ora or issue ALTER SESSION SET OPTIMIZER_GOAL=FIRST_ROWS in your code. This instructs the cost based optimizer to minimize the time taken to retrieve the first row in your result set and encourages the use of indexes.
    – Hints. No matter how sophisticated the cost based optimizer becomes, there will still be occasions when you need to modify its execution plan. SQL “hints” are usually the best way of doing this. Using hints, you can instruct the optimizer to pursue your preferred access paths (such as a preferred index), use the parallel query option, select a join order, and so on. Hints are entered as comments following the first word in an SQL statement; the plus sign “+” in the comment lets Oracle know that the comment contains a hint. Hints are fully documented in Appendix A. In the following example, a hint instructs the optimizer to use the CUST_I2 index: SELECT /*+ INDEX(CUSTOMERS CUST_I2) */ * FROM CUSTOMERS WHERE NAME=:cust_name
    – Analyze your tables. The Cost Based Optimizer’s execution plans are calculated using table statistics collected by the ANALYZE command. Make sure you analyze your tables regularly, that you analyze all your tables, and that you analyze them at peak volumes (for instance, don’t analyze a table just before it is about to be loaded by a batch job). For small to medium tables, use ANALYZE TABLE table_name COMPUTE STATISTICS; for larger tables, take a sample, such as ANALYZE TABLE table_name ESTIMATE STATISTICS SAMPLE 20 PERCENT.
    – Use histograms. Prior to Oracle 7.3, the Cost Based Optimizer would have available the number of distinct values in a column, but not the distribution of data within the column. This meant that it might decline to use an index on a column with only a few distinct values, even if the particular value in question was very rare and would benefit from an index lookup. Histograms, introduced in Oracle 7.3, allow column distribution data to be collected and allow the Cost Based Optimizer to make better decisions. You create histograms with the FOR COLUMNS clause of the ANALYZE command (for instance, ANALYZE TABLE table_name COMPUTE STATISTICS FOR ALL INDEXED COLUMNS). Note that you can’t take advantage of histograms if you are using bind variables (which we discussed earlier).
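Put together, the statistics-gathering commands described above might look like this in practice; the CUSTOMERS and SALES table names are placeholders:

```sql
-- Full statistics for a small to medium table:
ANALYZE TABLE customers COMPUTE STATISTICS;

-- Sampled statistics for a large table:
ANALYZE TABLE sales ESTIMATE STATISTICS SAMPLE 20 PERCENT;

-- Histograms on all indexed columns, so the optimizer can see
-- skewed data distributions:
ANALYZE TABLE sales COMPUTE STATISTICS FOR ALL INDEXED COLUMNS;
```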
  • One of the most fundamental SQL tuning problems is the “accidental” table scan. Accidental table scans usually occur when the SQL programmer tries to perform a search on an indexed column that can’t be supported by an index. This can occur when:
    – Using != (not equal to). Even if the not-equals condition satisfies only a small number of rows, Oracle will not use an index to satisfy it. Often, you can re-code these queries using range (>) or IN conditions, which can be supported by index lookups.
    – Searching for NULLs. Oracle won’t use an index to find null values, since null values are not usually stored in an index (the exception is a concatenated index entry where only some of the values are NULL). If you’re planning to search for values which are logically missing, consider changing the column to NOT NULL with a DEFAULT clause. For instance, you could set a default value of “UNKNOWN” and use the index to find these values.
    – Using functions on indexed columns. Any function or operation on an indexed column will prevent Oracle from using an index on that column. For instance, Oracle can’t use an index to find SUBSTR(SURNAME,1,4)=’SMIT’. Instead of manipulating the column, try to manipulate the search condition; in this example, a better formulation would be SURNAME LIKE ‘SMIT%’. In Oracle8i you can create functional indexes - indexes created on functions - providing that the function always returns the same result when provided with the same inputs. This allows you, for instance, to create an index on UPPER(SURNAME).
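The rewrites suggested above can be sketched as follows; the ORDERS and EMPLOYEES tables and their columns are hypothetical:

```sql
-- != cannot use an index; an IN list of the remaining values can:
-- Instead of: WHERE status != 'CLOSED'
SELECT * FROM orders WHERE status IN ('OPEN', 'PENDING');

-- Manipulate the search value, not the indexed column:
-- Instead of: WHERE SUBSTR(surname, 1, 4) = 'SMIT'
SELECT * FROM employees WHERE surname LIKE 'SMIT%';

-- Oracle8i functional index supporting case-insensitive searches:
CREATE INDEX emp_upper_surname_ix ON employees (UPPER(surname));
```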
  • In many cases, avoiding a full table scan by using the best of all possible indexes is your aim. However, it’s often the case that a full table scan cannot be avoided. In these situations, consider some of the following techniques to improve table scan performance:
    – Use the Parallel Query Option. Oracle’s Parallel Query Option is the most effective - although most resource intensive - way of improving the performance of full table scans. Parallel Query allocates multiple processes to an SQL statement that is based at least partially on a full table scan. The table is partitioned into distinct sets of blocks, and each process works on a different set of data. Further processes may be allocated - or the original processes recycled - to perform joins, sorts and other operations. This approach can reduce execution time dramatically if the hardware and database layout are suitable. In particular, the host computer should have multiple CPUs and/or the database should be spread across more than one disk device. You can enable Parallel Query with a PARALLEL hint, or make it the default for a table with the PARALLEL table clause.
    – Reduce the size of the table. The performance of a full table scan will generally be proportional to the size of the table to be scanned, and there are ways of reducing that size quite substantially. Reduce PCTFREE: the PCTFREE table setting reserves a certain percentage of space in each block to allow for updates that increase the length of a row. By default, PCTFREE is set to 10%. If your table is rarely updated, or if the updates rarely increase the length of the row, you can reduce PCTFREE and hence reduce the overall size of the table. Increase PCTUSED: the PCTUSED table setting determines at what point blocks that have previously hit PCTFREE will again become eligible for inserts. The default value is 40%, which means that after hitting PCTFREE, a block will only become eligible for new rows when deletes reduce the amount of used space to 40%. If you increase PCTUSED, rows will be inserted into the table at an earlier time, blocks will be fuller on average and the table will be smaller. There may be a negative effect on INSERT performance - you’ll have to assess the trade-off between scan and insert performance. Relocate long or infrequently used columns: if you have LONG (or big VARCHAR2) columns in the table which are not frequently accessed and never accessed via a full table scan (perhaps a bitmap image or embedded document), consider relocating them to a separate table. By relocating these columns you can substantially reduce the table’s size and hence improve full table scan performance. Note that Oracle8 LOB types will almost always be stored outside of the core table data anyway.
    – The CACHE hint. Normally, rows retrieved by most full table scans are flushed almost immediately from Oracle’s cache. This is sensible, since otherwise full table scans could completely saturate the cache and push out rows retrieved by index lookups. However, it does mean that subsequent scans of the same table are unlikely to find a match in the cache and will therefore incur a high physical I/O rate. You can encourage Oracle to keep these rows within the cache by using the CACHE hint or the CACHE table setting. Oracle will then place the rows retrieved at the Most Recently Used end of the LRU chain, and they will persist in the cache for a much longer period of time.
    – Use partitioning. If the number of rows you want to retrieve from a table is greater than an index lookup could effectively retrieve, but still only a fraction of the table itself (say between 10% and 40% of the total), consider partitioning the table. For instance, suppose that a SALES table contains all sales records for the past 4 years and you frequently need to scan all sales records for the current financial year in order to calculate year-to-date totals. The proportion of rows scanned is far greater than an index lookup would comfortably support, but is still only a fraction of the total table. If you partition the table by financial year, you can restrict processing to only those records that match the appropriate financial year. This could potentially reduce scan time by 75% or more. Partitioning is discussed in detail in Chapter 13.
    – Consider the fast full index scan. If a query needs to access all or most of the rows in a table, but only a subset of the columns, you can consider using a fast full index scan to retrieve the rows. To do this, create an index which contains all the columns included in the SELECT and WHERE clauses of the query. If these columns comprise only a small subset of the entire table, then the index will be substantially smaller, and Oracle will be able to scan it more quickly than it could scan the table. There will, of course, be an overhead involved in maintaining the index that will affect the performance of INSERT, UPDATE and DELETE statements.
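As a sketch of the hints mentioned above, assuming a hypothetical SALES schema:

```sql
-- Parallel full scan with a degree of 4 (requires suitable hardware
-- and Parallel Query configuration):
SELECT /*+ FULL(s) PARALLEL(s, 4) */ SUM(sale_amount)
  FROM sales s;

-- Encourage caching of a small, frequently scanned lookup table:
SELECT /*+ FULL(r) CACHE(r) */ *
  FROM sales_regions r;

-- Tighter space settings for a rarely updated table (these settings
-- affect blocks used after the change, not existing ones):
ALTER TABLE sales PCTFREE 2 PCTUSED 80;
```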
  • Determining the optimal join order and method is a critical consideration when optimizing the performance of SQL that involves multi-table joins.
    – Join method. Oracle supports three join methods. In a nested loops join, Oracle performs a search of the inner table for each row found in the outer table. This type of access is most often seen when there is an index on the inner table - since, otherwise, multiple “nested” table scans may result. When performing a sort merge join, Oracle must sort each table (or result set) by the value of the join columns; once sorted, the two sets of data are merged, much as you might merge two sorted piles of numbered pages. When performing a hash join, Oracle builds a hash table for the smaller of the two tables; this hash table is then used to find matching rows in a somewhat similar fashion to the way an index is used in a nested loops join. The nested loops method suits SQL that joins together subsets of table data where there is an index to support the join. When larger amounts of data must be joined and/or there is no index, use the sort merge or hash join method. Hash join usually out-performs sort merge, but will only be used if Cost Based Optimization is in effect or if a hint is used.
    – Join order. Determining the best join order can be a hit-and-miss affair. The Cost Based Optimizer will usually pick a good join order, but if it doesn’t, you can use the ORDERED hint to force the tables to be joined in the exact order in which they appear in the FROM clause. In general, it is better to eliminate rows earlier rather than later in the join process, so if a table has a restrictive WHERE clause condition you should favor it earlier in the join order.
    – Special joins. Oracle provides a number of optimizations for “special” join types. Some of these optimizations will be performed automatically if you are using the Cost Based Optimizer, but all can be invoked in either optimizer by use of hints. The star join algorithm optimizes the join of a single massive “fact” table to multiple smaller “dimension” tables; this optimization can be invoked with the STAR hint. A further optimization rewrites the SQL to take advantage of bitmap indexes that might exist on the fact table; it can be invoked with the STAR_TRANSFORMATION hint. The “anti-join” is usually expressed as a subquery using the NOT IN clause. Queries of this type can perform badly under the rule based optimizer, but can run efficiently if the init.ora parameter ALWAYS_ANTI_JOIN is set to HASH or if a HASH_AJ hint is added to the subquery. The “semi-join” is usually expressed as a subquery using the EXISTS clause. Such a query may perform poorly if there is no index supporting the subquery, but can run efficiently if the init.ora parameter ALWAYS_SEMI_JOIN is set to HASH or if a HASH_SJ hint is added to the subquery. Hierarchical queries using the CONNECT BY operator will degrade rapidly as table volumes increase unless there is an index to support the CONNECT BY join condition.
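The join-order and anti-join techniques above might be sketched like this; the CUSTOMERS and SALES tables and their columns are hypothetical:

```sql
-- Force the FROM-clause join order and a hash join into SALES:
SELECT /*+ ORDERED USE_HASH(s) */ c.name, SUM(s.sale_amount)
  FROM customers c, sales s
 WHERE s.customer_id = c.customer_id
   AND c.region = 'WEST'
 GROUP BY c.name;

-- Hash anti-join for a NOT IN subquery (customers with no sales):
SELECT c.name
  FROM customers c
 WHERE c.customer_id NOT IN
       (SELECT /*+ HASH_AJ */ s.customer_id FROM sales s);
```

The ORDERED hint works best when the table with the most restrictive WHERE condition (here, CUSTOMERS) is listed first in the FROM clause.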
  • Semi joins allow Oracle to efficiently optimize queries which include an EXISTS subquery, but which don’t have an index to support efficient execution of the subquery. Typically, these statements would perform very badly in Oracle7, because the EXISTS clause causes the subquery to be executed once for each row returned by the parent query. If the subquery involves a table scan or inefficient index scan then performance would be expected to be poor. In Oracle7, the solution was either to create a supporting index, or re-code the statement as a join or as a subquery using the IN operator.
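A semi-join can be requested with a hint in the subquery; again, the schema here is illustrative:

```sql
-- Without a supporting index, this EXISTS subquery could be
-- re-executed for every candidate CUSTOMERS row. The HASH_SJ hint
-- asks Oracle to perform a single hash semi-join instead:
SELECT c.*
  FROM customers c
 WHERE EXISTS
       (SELECT /*+ HASH_SJ */ 1
          FROM sales s
         WHERE s.customer_id = c.customer_id);
```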
  • Array processing refers to Oracle’s ability to insert or select more than one row in a single operation. For SQL which deals with multiple rows of data, array processing usually reduces execution time by 50% or more (more still if you’re working across the network). In some application environments, array processing is implemented automatically and you won’t have to do anything to enable this feature. In other environments, array processing may be the responsibility of the programmer; Appendix C outlines how array processing can be activated in some popular development tools. On the principle that “if some is good, more must be better”, many programmers implement huge arrays. This can be overkill and may even reduce performance by increasing the memory requirements of the program. Most of the gains of array processing are achieved by increasing the array size from 1 to about 20; further increases yield diminishing returns, and you won’t normally see much improvement when increasing the array size beyond 100.
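In PL/SQL, array fetching can be illustrated with Oracle 8i's BULK COLLECT; the SALES table and its column are hypothetical:

```sql
DECLARE
   TYPE amount_tab IS TABLE OF sales.sale_amount%TYPE;
   amounts amount_tab;
   CURSOR sales_csr IS SELECT sale_amount FROM sales;
BEGIN
   OPEN sales_csr;
   LOOP
      -- Fetch up to 100 rows per call instead of one row at a time:
      FETCH sales_csr BULK COLLECT INTO amounts LIMIT 100;
      EXIT WHEN amounts.COUNT = 0;
      -- ... process the batch of rows here ...
   END LOOP;
   CLOSE sales_csr;
END;
/
```

Note that the exit test uses amounts.COUNT rather than %NOTFOUND, so the final, partially filled batch is still processed.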
  • SQL is a non-procedural language that is admirably suited to most data retrieval and manipulation tasks. However, there are many circumstances in which a procedural approach will yield better results. In these circumstances, the PL/SQL language (or possibly Java) can be used in place of standard SQL. Although it’s not possible to exhaustively categorize all the situations in which PL/SQL can be used in place of standard SQL, PL/SQL is a valid alternative when:
    – There is little or no requirement to return large quantities of data - for instance, UPDATE transactions, or retrieving only a single value or row.
    – Standard SQL requires more resources than seems logically necessary and no combination of hints seems to work. This is particularly likely if there are implicit characteristics of the data which the optimizer cannot “understand”, or where the SQL is particularly complex.
    – You have a clear idea of how the data should be retrieved and processed, but can’t implement your algorithm using standard SQL.
    Some of the specific circumstances in which PL/SQL was found to improve performance within this book were: determining 2nd-highest values; performing range lookups for tables that have a LOW_VALUE and HIGH_VALUE column; and performing correlated updates, where the same table is referenced within the WHERE and SET clauses of an UPDATE statement with a subquery within the SET clause. PL/SQL triggers can also be invaluable when implementing de-normalization.
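The 2nd-highest-value case can be sketched procedurally; the EMPLOYEES table and SALARY column are hypothetical:

```sql
-- Second-highest salary, fetched procedurally: the cursor reads only
-- two rows from the sorted result, rather than the self-join or
-- repeated scan a pure SQL formulation might require.
DECLARE
   CURSOR sal_csr IS
      SELECT DISTINCT salary
        FROM employees
       ORDER BY salary DESC;
   v_salary employees.salary%TYPE;
BEGIN
   OPEN sal_csr;
   FETCH sal_csr INTO v_salary;   -- highest
   FETCH sal_csr INTO v_salary;   -- second highest
   CLOSE sal_csr;
   DBMS_OUTPUT.PUT_LINE('Second highest salary: ' || v_salary);
END;
/
```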

    1. Oracle SQL tuning
    2. Top 10 Oracle SQL tuning tips
       1. Design and develop with performance in mind
       2. Establish a tuning environment
       3. Index wisely
       4. Reduce parsing
       5. Take advantage of Cost Based Optimizer
       6. Avoid accidental table scans
       7. Optimize necessary table scans
       8. Optimize joins
       9. Use array processing
       10. Consider PL/SQL for “tricky” SQL
    3. Hint #1: Design and develop with performance in mind
       • Explicitly identify performance targets
       • Focus on critical transactions
         – Test the SQL for these transactions against simulations of production data
       • Measure performance as early as possible
       • Consider prototyping critical portions of the application
       • Consider de-normalization and other performance-by-design features early on
    4. Hint #2: Establish a tuning and development environment
       • A significant portion of SQL that performs poorly in production was originally crafted against empty or nearly empty tables.
       • Make sure you establish a reasonable sub-set of production data for use during development and tuning of SQL.
       • Make sure your developers understand EXPLAIN PLAN and tkprof, or equip them with commercial tuning tools.
    5. Understanding SQL tuning tools
       • The foundation tools for SQL tuning are:
         – The EXPLAIN PLAN command
         – The SQL Trace facility
         – The tkprof trace file formatter
       • Effective SQL tuning requires either familiarity with these tools or the use of commercial alternatives such as SQLab
    6. EXPLAIN PLAN
       • EXPLAIN PLAN reveals the execution plan for an SQL statement.
       • The execution plan shows the exact sequence of steps that the Oracle optimizer has chosen to process the SQL.
       • The execution plan is stored in an Oracle table called the “plan table”.
       • Suitably formatted queries can be used to extract the execution plan from the plan table.
    7. A simple EXPLAIN PLAN

       SQL> EXPLAIN PLAN FOR
         2  select count(*) from sales where product_id=1;

       Explained.

       SQL> SELECT RTRIM (LPAD (' ', 2 * LEVEL) ||
         2         RTRIM (operation) || ' ' || RTRIM (options) ||
         3         ' ' || object_name) query_plan
         4    FROM plan_table
         5  CONNECT BY PRIOR id = parent_id
         6  START WITH id = 0;

       QUERY_PLAN
       --------------------------------------------
       SELECT STATEMENT
         SORT AGGREGATE
           TABLE ACCESS FULL SALES
    8. Interpreting EXPLAIN PLAN
       • The more heavily indented an access path is, the earlier it is executed.
       • If two steps are indented at the same level, the uppermost step is executed first.
       • Some access paths are “joined” - such as an index access that is followed by a table lookup.
    10. SQL_TRACE and tkprof
       • ALTER SESSION SET SQL_TRACE = TRUE causes a trace of SQL execution to be generated.
       • The tkprof utility formats the resulting output.
       • Tkprof output contains a breakdown of execution statistics, the execution plan, and the rows returned for each step. These statistics are not available from any other source.
       • Tkprof is the most powerful tool, but requires a significant learning curve.
    11. Tkprof output

       call     count    cpu  elapsed    disk    query  current   rows
       ------- ------ ------ -------- ------- -------- -------- ------
       Parse        1   0.02     0.01       0        0        0      0
       Execute      1   0.00     0.00       0        0        0      0
       Fetch       20 141.10   141.65    1237  1450011   386332     99
       ------- ------ ------ -------- ------- -------- -------- ------
       total       22 141.12   141.66    1237  1450011   386332     99

       Rows     Execution Plan
       -------  ---------------------------------------------------
             0  SELECT STATEMENT GOAL: CHOOSE
            99   FILTER
         96681    TABLE ACCESS GOAL: ANALYZED (FULL) OF 'CUSTOMERS'
         96582    TABLE ACCESS GOAL: ANALYZED (FULL) OF 'EMPLOYEES'
    12. Using SQLab
       • Because EXPLAIN PLAN and tkprof are unwieldy and hard to interpret, third-party tools that automate the process and provide expert advice improve SQL tuning efficiency.
       • The Quest SQLab product:
         – Identifies SQL in your database that could benefit from tuning
         – Provides a sophisticated tuning environment to examine, compare and evaluate execution plans
         – Incorporates an expert system to advise on indexing and SQL statement changes
    13. SQLab SQL tuning lab
       – Displays the execution plan in a variety of intuitive ways
       – Provides easy access to statistics and other useful data
       – Models changes to SQL and immediately shows the results
    14. SQLab expert advice
       – SQLab provides specific advice on how to tune an SQL statement
    15. SQLab SQL trace integration
       – SQLab can also retrieve the execution statistics that are otherwise only available through tkprof
    16. Hint #3: Index wisely
       • Index to support selective WHERE clauses and join conditions
       • Use concatenated indexes where appropriate
       • Consider overindexing to avoid table lookups
       • Consider advanced indexing options
         – Hash clusters
         – Bit-mapped indexes
         – Index-only tables
    17. Effect of adding columns to a concatenated index
       – Novice SQL programmers are often satisfied if the execution plan shows an index
       – Make sure the index has all the columns required
       [Chart: logical I/O by indexing strategy - index on Surname+firstname+dob+phoneno: 4; index on Surname+firstname+DOB: 6; index on Surname+firstname: 20; merge of 3 indexes: 40; Surname index only: 700]
    18. Bit-map indexes
       – Contrary to widespread belief, can be effective when there are many distinct column values
       – Not suitable for OLTP however
       [Chart: elapsed time (s, log scale) vs. number of distinct values (1 to 1,000,000) for bitmap index, B*-tree index and full table scan]
    19. 19. Hint #4: Reduce parsing  Use bind variables – Bind variables are key to application scalability – If necessary in 8.1.6+, set cursor CURSOR_SHARING to FORCE  Reuse cursors in your application code – How to do this depends on your development language  Use a cursor cache – Setting SESSION_CACHED_CURSORS (to 20 or so) can help applications that are not re-using cursors
    20. Hint #5: Take advantage of the Cost Based Optimizer  The older rule-based optimizer is inferior in almost every respect to the modern cost-based optimizer  Using the cost-based optimizer effectively involves: – Regularly collecting table statistics using the ANALYZE command or the DBMS_STATS package – Understanding hints and how they can be used to influence SQL statement execution – Choosing the appropriate optimizer mode: FIRST_ROWS is best for OLTP applications; ALL_ROWS suits reporting and OLAP jobs
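As a sketch (the schema and table names are hypothetical), gathering statistics with DBMS_STATS and steering a single statement with a hint might look like:

```sql
-- Give the cost based optimizer accurate input; CASCADE also
-- gathers statistics on the table's indexes
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'APP',
    tabname => 'CUSTOMERS',
    cascade => TRUE);
END;
/

-- Ask for a response-time-oriented plan for one OLTP query
SELECT /*+ FIRST_ROWS */ *
  FROM customers
 WHERE region = 'WEST';
```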
    21. Hint #6: Avoid accidental table scans  Table scans that occur unintentionally are a major source of poorly performing SQL. Causes include: – Missing index – Using "!=", "<>" or NOT • Use inclusive range conditions or IN lists – Looking for values that are NULL • Use NOT NULL columns with a default value – Using functions on indexed columns • Use "functional" indexes in Oracle8i
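For instance (hypothetical table), a function applied to an indexed column defeats an ordinary index, but an Oracle8i functional index restores index access:

```sql
-- UPPER(surname) prevents use of a plain index on surname,
-- so this query falls back to a full table scan
SELECT * FROM customers WHERE UPPER(surname) = 'SMITH';

-- Oracle8i functional index: the optimizer can now use an index
-- for the same predicate
CREATE INDEX cust_upper_surname_ix ON customers (UPPER(surname));
```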
    22. Hint #7: Optimize necessary table scans  There are many occasions where a table scan is the only option. If so: – Consider the parallel query option – Try to reduce the size of the table • Adjust PCTFREE and PCTUSED • Relocate infrequently used long columns or BLOBs • Rebuild when necessary to reduce the high-water mark – Improve the caching of the table • Use the CACHE hint or table property • Implement KEEP and RECYCLE pools – Partition the table (if you really seek a large subset of data) – Consider the fast full index scan
    23. Fast full index scan performance – Use when you must read every row, but not every column – Counting the rows in a table is a perfect example. [Chart: elapsed time (s) – parallel fast full index scan: 2.44; fast full index scan: 4.94; parallel table scan: 5.23; full table scan: 12.53; full index scan: 17.76; index range scan (RULE): 19.83]
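A sketch of requesting the fast full index scan (the table, index name and parallel degree are hypothetical). COUNT(*) needs every row but no table columns, so scanning the smaller index is cheaper than scanning the table:

```sql
-- Fast full scan of the primary key index instead of the table
SELECT /*+ INDEX_FFS(c cust_pk) */ COUNT(*)
  FROM customers c;

-- The same scan, parallelized across four slave processes
SELECT /*+ INDEX_FFS(c cust_pk) PARALLEL_INDEX(c, cust_pk, 4) */ COUNT(*)
  FROM customers c;
```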
    24. Hint #8: Optimize joins  Pick the best join method – Nested loops joins are best for indexed joins of subsets – Hash joins are usually the best choice for "big" joins  Pick the best join order – Pick the best "driving" table – Eliminate rows as early as possible in the join order  Optimize "special" joins when appropriate – STAR joins for data-warehousing applications – STAR_TRANSFORMATION if you have bitmap indexes – ANTI-JOIN methods for NOT IN sub-queries – SEMI-JOIN methods for EXISTS sub-queries – Properly index CONNECT BY hierarchical queries
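As an illustration (tables and columns are hypothetical), the join-order and join-method advice can be expressed with hints: ORDERED fixes the driving table as the first table in the FROM clause, and USE_HASH requests a hash join for a "big" join:

```sql
-- orders drives the join; customers is hash-joined to it
SELECT /*+ ORDERED USE_HASH(c) */ c.contact_surname, o.order_total
  FROM orders o, customers c
 WHERE o.customer_id = c.customer_id
   AND o.order_date > SYSDATE - 7;
```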
    25. Oracle8 semi-joins – Optimizes queries using EXISTS where there is no supporting index (here, no index on employees): select * from customers c where exists (select 1 from employees e where e.surname=c.contact_surname and e.firstname=c.contact_firstname and e.date_of_birth=c.date_of_birth)
    26. Oracle8 semi-joins  Without the semi-join or a supporting index, queries like the one on the preceding slide will perform very badly.  Oracle will perform a table scan of the inner table for each row retrieved from the outer table  If customers has 100,000 rows and employees 800 rows, then 80 MILLION rows will be processed!  In Oracle7, you should create the index or use an IN-based subquery  In Oracle8, the semi-join facility allows the query to be resolved by a sort-merge or hash join.
    27. To use semi-joins  Set ALWAYS_SEMI_JOIN=HASH or MERGE in INIT.ORA, OR  Use a MERGE_SJ or HASH_SJ hint in the subquery of the SQL statement: SELECT * FROM customers c WHERE EXISTS (select /*+merge_sj*/ 1 from employees e where ….)
    28. Oracle8 semi-joins  The performance improvements are impressive (note the logarithmic scale). [Chart: elapsed time (s), log scale – IN-based subquery: 6.69; EXISTS with merge semi-join: 6.83; EXISTS, no semi-join but with index: 31.01; EXISTS, no semi-join or indexes: 1,343.19]
    29. Star Join improvements  A STAR join involves a large "FACT" table being joined to a number of smaller "dimension" tables
    30. Star Join improvements  The Oracle7 star join algorithm works well when there is a concatenated index on all the FACT table columns  But when there are a large number of dimensions, creating concatenated indexes for all possible queries is impossible.  Oracle8's "star transformation" involves rewording the query so that it can be supported by combinations of bitmap indexes.  Since bitmap indexes can be efficiently combined, a single bitmap index on each column can support all possible queries.
    31. To enable the star transformation  Create bitmap indexes on each of the FACT table columns that are used in star queries  Make sure that STAR_TRANSFORMATION_ENABLED is TRUE, either by changing init.ora or using an ALTER SESSION statement.  Use the STAR_TRANSFORMATION hint if necessary.
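Putting slide 31 together as a sketch (the fact table sales and its columns are hypothetical):

```sql
-- One bitmap index per FACT foreign-key column used in star queries
CREATE BITMAP INDEX sales_prod_bx  ON sales (product_id);
CREATE BITMAP INDEX sales_time_bx  ON sales (time_id);
CREATE BITMAP INDEX sales_store_bx ON sales (store_id);

ALTER SESSION SET STAR_TRANSFORMATION_ENABLED = TRUE;

-- Force the transformation for this query if the optimizer
-- does not choose it on its own
SELECT /*+ STAR_TRANSFORMATION */ p.category, SUM(s.amount)
  FROM sales s, products p
 WHERE s.product_id = p.product_id
   AND p.category = 'TOYS'
 GROUP BY p.category;
```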
    32. Drawbacks of Star transformation  Bitmap indexes reduce concurrency (row-level locking may break down).  But remember that a large number of distinct column values may not matter
    33. Star transformation performance  When there is no suitable concatenated index, the Star transformation results in a significant improvement. [Chart: elapsed time (s) for Star vs. Star transformation, with and without a suitable concatenated index – without a suitable concatenated index the plain Star join took 9.94 s, while the other combinations all ran in under 0.4 s]
    34. Hint #9: Use ARRAY processing – Retrieve or insert rows in batches, rather than one at a time – Methods of doing this are language specific. [Chart: elapsed time vs. array size – elapsed time drops sharply as the array size increases, with diminishing returns at larger sizes]
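In PL/SQL (available from Oracle8i, per slide 36), array processing is done with BULK COLLECT and FORALL. A minimal sketch against a hypothetical customers table:

```sql
DECLARE
  TYPE id_tab IS TABLE OF customers.customer_id%TYPE;
  ids id_tab;
BEGIN
  -- Fetch the whole batch in one operation instead of row at a time
  SELECT customer_id BULK COLLECT INTO ids
    FROM customers
   WHERE status = 'NEW';

  -- Apply the updates as a single array-bound statement
  FORALL i IN 1 .. ids.COUNT
    UPDATE customers
       SET status = 'PROCESSED'
     WHERE customer_id = ids(i);
END;
/
```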
    35. Hint #10: Consider PL/SQL for "tricky" SQL  With SQL you specify the data you want, not how to get it. Sometimes you need to specifically dictate your retrieval algorithms.  For example: – Getting the second highest value – Doing lookups on a low-high lookup table – Correlated updates – SQL with multiple complex correlated subqueries – SQL that seems too hard to optimize unless it is broken into multiple queries linked in PL/SQL
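For the first example above, a sketch (hypothetical employees table): PL/SQL lets you dictate the retrieval algorithm directly by fetching exactly two rows from a descending cursor:

```sql
-- "Second highest salary" without wrestling with nested subqueries
DECLARE
  CURSOR sal_cur IS
    SELECT salary FROM employees ORDER BY salary DESC;
  v_salary employees.salary%TYPE;
BEGIN
  OPEN sal_cur;
  FETCH sal_cur INTO v_salary;  -- highest
  FETCH sal_cur INTO v_salary;  -- second highest
  CLOSE sal_cur;
  DBMS_OUTPUT.PUT_LINE('Second highest salary: ' || v_salary);
END;
/
```

(Ties would need distinct salaries handled separately; this sketches the technique, not production code.)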
    36. Oracle8i PL/SQL Improvements – Array processing – NOCOPY – Temporary tables – The profiler – Dynamic SQL
    37. Bonus hint: When your SQL is tuned, look to your Oracle configuration  When SQL is inefficient, there is limited benefit in investing in Oracle server or operating system tuning.  However, once SQL is tuned, the limiting factor for performance will be Oracle and operating system configuration.  In particular, check for internal Oracle contention, which typically shows up as latch contention or unusual wait conditions (buffer busy, free buffer, etc.)
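One way to start that check, as a sketch: the wait events mentioned above are visible in the V$SYSTEM_EVENT view:

```sql
-- Symptoms of internal contention, ordered by total time waited
SELECT event, total_waits, time_waited
  FROM v$system_event
 WHERE event IN ('latch free', 'buffer busy waits', 'free buffer waits')
 ORDER BY time_waited DESC;
```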