Exploring Oracle Database Performance Tuning Best Practices for DBAs and Developers



  1. 1. Exploring Oracle Database Performance Tuning Best Practices for DBAs and Developers
  2. 2. About Me: • 38 years old • Married + 3 • 15 years as a DBA, consultant, instructor, architect • CEO @ DBcs Ltd. • Was CTO @ John Bryce Israel • Oracle Certified Professional • Microsoft SQL Server Certified Professional
  3. 3. Agenda • Oracle Database Architecture Overview • The connection between SQL tuning & Instance tuning • The connection between database & operating system • Common bottlenecks - Drill down • How do you identify the source of the problem? • Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache • Solutions: where do you start and in what order do you work? • Introduction to SQL and Application Tuning • The Oracle Optimizer: • Rule Based Optimization (overview) • Cost Based Optimization • The Different Modes of the Cost Based Optimizer • Execution Plans • Data Access Methods • Indexes – Types, Classifications, Advantages & Disadvantages • Sort Usage Guidelines • When and What to Tune? • Clustering factor • Data Types are Important • Integrity Constraints are Important • Reasons for Inefficient SQL Performance • Using Bind Variables • Restructuring SQL Statements • Shared SQL and Cursors • Advanced SQL and Application Topics
  4. 4. "You have to be constantly evolving and in some cases DBAs/Programmers don’t do that because they know how they did it years ago and they want to keep doing it that way..."
  5. 5. Quote from Thomas Kyte's book: "If you want a 10 step guide to tuning a query, buy a piece of software. You are not needed in this process, anyone can put a query in, get a query out and run it to see if it is faster. There are tons of these tools on the market. They work using rules (heuristics) and can tune maybe 1% of the problem queries out there. They APPEAR to be able to tune a much larger percent but that is only because the people using these tools never look at the outcome -- hence they continue to make the same basic mistakes over and over and over. If you want to really be able to tune the other 99% of the queries out there, knowledge of lots of stuff -- physical storage mechanisms, access paths, how the optimizer works -- that's the only way. ... Think about it for a moment. If there were a 10 step or even 1,000,000 step process by which any query can be tuned (or even X% of queries for that matter), we would write a program to do it. Oh don't get me wrong, there are many programs that actually try to do this - Oracle Enterprise Manager with its tuning pack, SQL Navigator and others. What they do is primarily recommend indexing schemes to tune a query, suggest materialized views, offer to add hints to the query to try other access plans. They show you different query plans for the same statement and allow you to pick one. They offer "rules of thumb" (what I generally call ROT since the acronym and the word it maps to are so appropriate for each other) SQL optimizations - which if they were universally applicable - the optimizer would do it as a matter of fact. In fact, the cost based optimizer does that already - it rewrites our queries all of the time. These tuning tools use a very limited set of rules that sometimes can suggest that index or set of indexes you really should have thought of during your design."
  6. 6. Oracle Database Architecture Overview
  7. 7. Oracle Database Architecture Overview
  8. 8. Oracle Database Memory Structures: Overview [Diagram: server and background processes with their aggregated PGA, and the SGA containing the shared pool, database buffer cache, redo log buffer, large pool, Java pool, and Streams pool]
  9. 9. Database Buffer Cache • Is a part of the SGA • Holds copies of data blocks that are read from data files • Is shared by all concurrent processes [Diagram: server process, database buffer cache in the SGA, database writer process (DBWn), data files]
  10. 10. Redo Log Buffer • Is a circular buffer in the SGA (based on the number of CPUs) • Contains redo entries that have the information to redo changes made by operations, such as DML and DDL [Diagram: server process, redo log buffer in the SGA, log writer process (LGWR), redo log files]
  11. 11. Shared Pool • Is part of the SGA • Contains: • Library cache (shared parts of SQL and PL/SQL statements) • Data dictionary cache (row cache) • Result cache (SQL queries, PL/SQL functions) • Control structures (locks) [Diagram: server process and the shared pool with its library cache, data dictionary cache, result cache, and control structures]
  12. 12. Processing a DML Statement: Example [Diagram: numbered steps 1 to 5 linking the user process, server process, shared pool (library cache), database buffer cache, redo log buffer, DBWn, and the database's data files, control files, and redo log files]
  13. 13. COMMIT Processing: Example [Diagram: numbered steps 1 to 3 linking the user process, server process, redo log buffer, LGWR, DBWn, shared pool (library cache), database buffer cache, and the database's data files, control files, and redo log files]
  14. 14. Program Global Area (PGA) • PGA is a memory area that contains: • Session information • Cursor information • SQL execution work areas (sort area, hash join area, bitmap merge area, bitmap create area) • Work area size influences SQL performance. • Work areas can be automatically or manually managed. [Diagram: server process PGA made up of stack space and the User Global Area (UGA), which holds user session data, cursor status, and the SQL area]
  15. 15. Background Process Roles [Diagram: the SGA (database buffer cache, shared pool, redo log buffer) surrounded by the background processes PMON, SMON, ARCn, DBWn, LGWR, CKPT, MMON, CJQ0, QMNn, RCBG, and MMAN]
  16. 16. Automatic Shared Memory Management • Which size to choose? • SGA_TARGET + STATISTICS_LEVEL • Automatically tuned SGA components [Diagram: SGA with fixed SGA, redo log buffer, database buffer cache, shared pool, large pool, Java pool, and Streams pool]
  17. 17. Automated SQL Execution Memory Management • Which size to choose? • PGA_AGGREGATE_TARGET [Diagram: aggregated PGA of the server and background processes]
  18. 18. Automatic Memory Management • Sizing of each memory component is vital for SQL execution performance. • It is difficult to manually size each component. • Automatic memory management automates memory allocation of each SGA component and aggregated PGA. • MEMORY_TARGET + STATISTICS_LEVEL [Diagram: MMAN distributing memory among buffer cache, large pool, shared pool, Java pool, Streams pool, other SGA, private SQL areas, untunable PGA, and free memory]
  19. 19. The connection between SQL tuning & Instance tuning
  20. 20. Database tuning is the process of tuning the actual database, which encompasses the allocated memory, disk usage, CPU, I/O, and underlying database processes. Tuning a database also involves the management and manipulation of the database structure itself, such as the design and layout of tables and indexes. Additionally, database tuning often involves the modification of the database architecture in order to optimize the use of the hardware resources available. There are many other considerations when tuning a database, but these tasks are normally accomplished by the database administrator. The objective of database tuning is to ensure that the database has been designed in a way that best accommodates expected activity within the database. The connection between SQL tuning & Instance tuning
  21. 21. SQL tuning is the process of tuning the SQL statements that access the database. These SQL statements include database queries and transactional operations such as inserts, updates, and deletes. The objective of SQL statement tuning is to formulate statements that most effectively access the database in its current state, taking advantage of database and system resources and indexes. The connection between SQL tuning & Instance tuning
  22. 22. Both database tuning and SQL statement tuning must be performed to achieve optimal results when accessing the database. A poorly tuned database can waste the effort spent on SQL tuning, and vice versa. Ideally, it is best to first tune the database, then ensure that indexes exist where needed, and then tune the SQL code. The connection between SQL tuning & Instance tuning
  23. 23. The connection between database & operating system
  24. 24. Question: We are in the process of adopting Oracle and we have many choices of operating system platforms. Which OS is best for Oracle, and how do I compare operating system environments for Oracle databases? Answer: That's a very common question. Oracle dominates the database world in part because it runs on over 60 platforms, everything from a mainframe to a Mac. Oracle chose Solaris as its preferred OS in 2005, and later decided to work on its own Linux distro, making an Oracle Linux OS that is custom-tailored to the needs of a typical database. Oracle leverages the advantages of all OS platforms with an independent OSI, customized to each platform. As to which UNIX dialect is "best", it's often related to the server environment. For example, svmon is only available on IBM AIX. . . . Some operating systems are better at managing large volumes of data, such as SUSE, who developed a special kernel just for Oracle. The connection between database & operating system
  25. 25. Data integrity features (T10 Protection Information ) Protection Information enables applications or kernel subsystems to attach metadata to I/O operations, allowing devices that support PI to verify the integrity before passing them further down the stack and physically committing them to disk. Data Integrity Extensions or DIX is a hardware feature that enables exchange of protection metadata between host operating system and HBA and helps to avoid corrupt data from being written, allowing a full end-to-end data integrity check. The connection between database & operating system
  26. 26. Zero downtime updates Make updates to the Linux Operating System (OS) kernel, while it is running, without a reboot or any interruption. Only Oracle Linux offers this unique capability, making it possible to keep up with important Linux kernel updates without burdening you with the operational cost and disruption of rebooting for every update to the kernel. Ksplice allows system administrators to deliver valuable patches for both the Unbreakable Enterprise Kernel as well as the Red Hat compatible kernel with lower costs, less downtime, increased security, and greater flexibility and control. The connection between database & operating system
  27. 27. Btrfs File System Btrfs (B-tree file system) is the “next generation file system” for Linux. Pronounced as “Butter FS” or “B-tree FS”, it is a GPL licensed file system first developed by Oracle’s Chris Mason in 2007. Btrfs provides a number of features that make it a very attractive file system solution for local disk storage. Btrfs is designed for: • Large files and file systems from the ground up • Simplified administration • Integrated RAID and volume management • Snapshots • Checksums for data and meta-data The connection between database & operating system
  28. 28. 10 common performance issues Common bottlenecks - Drill down
  29. 29. "Not every suggestion is a good suggestion, even if it's from the software provider himself." Aaron Shilo Common bottlenecks - Drill down Once upon a time, Oracle Support had a note called Script: Lists All Indexes that Benefit from a Rebuild (Doc ID 122008.1) which, let's just say, I didn't view in a particularly positive light :-) Mainly because it gave dubious advice, which included that indexes should be rebuilt if: • Deleted entries represent 20% or more of current entries • The index depth is more than 4 levels. It then detailed a script that ran a Validate Structure across all indexes in the database that didn't belong in either the SYS or SYSTEM schema. This script basically read through and sequentially locked all tables (maybe multiple times) in the database in order to list indexes that might not actually need a rebuild while potentially missing out on some that do. I could write a script that achieved the same result with far less overhead. For example, SELECT index_name FROM DBA_INDEXES where index_name like 'A%' and owner not in ('SYS', 'SYSTEM') would achieve a very similar result. Posted by Richard Foote in Doc 122008.1, Doc 989093.1, Index Rebuild, Oracle Indexes
  30. 30. Bad connection management • The application connects and disconnects for each database interaction. • This problem is common with stateless middleware in application servers. • It has over two orders of magnitude impact on performance, and is totally unscalable. Common bottlenecks - Drill down
  31. 31. Bad connection management solution. • Turn on Connection Pooling • Database Resident Connection Pool (DRCP) in Oracle Database 11g • Fallback when there is no application tier connection pooling • Also useful for sharing connections across middle tier hosts • Supported only for OCI and PHP applications • Scales to tens of thousands of database connections even on a commodity box • Enable with dbms_connection_pool.start_pool • Connect String • Easy Connect: //localhost:1521/oowlab:POOLED • TNS Connect String: (SERVER=POOLED) Common bottlenecks - Drill down
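The idea behind connection pooling can be sketched outside Oracle. The following is a minimal illustration only, not DRCP or any Oracle API: the `SimplePool` class is hypothetical, and an in-memory SQLite database from the Python standard library stands in for the real database so that the example runs anywhere. It shows ten units of work sharing two connections that are opened once, instead of connecting and disconnecting for each interaction.

```python
import queue
import sqlite3
from contextlib import contextmanager


class SimplePool:
    """Hypothetical fixed-size pool: connections are opened once and reused."""

    def __init__(self, size, connect):
        self._idle = queue.Queue()
        self.opened = 0  # how many physical connections were ever created
        for _ in range(size):
            self._idle.put(connect())
            self.opened += 1

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back instead of closing it


def connect():
    # Assumption: SQLite stands in for the real database connection here.
    return sqlite3.connect(":memory:")


pool = SimplePool(size=2, connect=connect)

results = []
for n in range(10):
    with pool.connection() as conn:
        results.append(conn.execute("SELECT ? * 2", (n,)).fetchone()[0])

print(pool.opened, results)
```

With a real application server the pool would normally come from the driver or the middle tier; the point of the sketch is only that the number of physical connections stays at the pool size no matter how many requests are served.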
  32. 32. Bad use of cursors and the shared pool • Not using cursors results in repeated parses. • If bind variables are not used, then there is hard parsing of all SQL statements. • This has an order of magnitude impact in performance, and it is totally unscalable. • Use cursors with bind variables that open the cursor and execute it many times. • Be suspicious of applications generating dynamic SQL. Common bottlenecks - Drill down
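The effect of bind variables on parsing can be illustrated with a toy model. This is a sketch under a loud assumption: the library cache is modelled as a plain set keyed by the exact SQL text, which is a simplification and not Oracle's real shared pool logic. The function name `simulate_parses` is made up for the example.

```python
from collections import Counter


def simulate_parses(statements):
    """Count hard and soft parses for a stream of SQL texts.

    Toy model (assumption): a text seen for the first time costs a hard
    parse; an identical text seen again is found in the cache (soft parse).
    """
    library_cache = set()
    counts = Counter()
    for sql in statements:
        if sql in library_cache:
            counts["soft"] += 1
        else:
            library_cache.add(sql)
            counts["hard"] += 1
    return counts


employee_ids = range(100, 200)

# Literal values make every statement text unique.
literal_sql = [
    f"SELECT last_name FROM employees WHERE employee_id = {i}"
    for i in employee_ids
]

# A bind placeholder keeps the text identical for every execution.
bind_sql = [
    "SELECT last_name FROM employees WHERE employee_id = :empno"
    for _ in employee_ids
]

literal_counts = simulate_parses(literal_sql)
bind_counts = simulate_parses(bind_sql)
print("literals:", literal_counts["hard"], "hard,", literal_counts["soft"], "soft")
print("binds:   ", bind_counts["hard"], "hard,", bind_counts["soft"], "soft")
```

In this model the literal version hard parses once per execution, while the bind version hard parses once in total, which is the behaviour the bullets above warn about.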
  33. 33. Bad use of cursors and the shared pool solution Bind Variables in Java • Instead of: String query = "SELECT EMPLOYEE_ID, LAST_NAME, SALARY FROM " + "EMPLOYEES WHERE EMPLOYEE_ID = " + generateNumber(MIN_EMPLOYEE_ID, MAX_EMPLOYEE_ID); pstmt = connection.prepareStatement(query); rs = pstmt.executeQuery(); • Change to: String query = "SELECT EMPLOYEE_ID, LAST_NAME, SALARY FROM " + "EMPLOYEES WHERE EMPLOYEE_ID = ?"; pstmt = connection.prepareStatement(query); pstmt.setInt(1, generateNumber(MIN_EMPLOYEE_ID, MAX_EMPLOYEE_ID)); rs = pstmt.executeQuery(); Common bottlenecks - Drill down
  34. 34. Bad use of cursors and the shared pool solution Bind Variables in OCI static char *MY_SELECT = "select employee_id, last_name, salary from employees where employee_id = :EMPNO"; OCIBind *bndp1; OCIStmt *stmthp; ub4 emp_id; OCIStmtPrepare2 (svchp, &stmthp, /* returned stmt handle */ errhp, /* error handle */ (const OraText *) MY_SELECT, strlen((char *) MY_SELECT), NULL, 0, /* tagging parameters:optional */ OCI_NTV_SYNTAX, OCI_DEFAULT); /* bind input parameters */ OCIBindByName(stmthp, &bndp1, errhp, (text *) ":EMPNO", -1, &(emp_id), sizeof(emp_id), SQLT_INT, NULL, NULL, NULL, 0, NULL, OCI_DEFAULT); Common bottlenecks - Drill down
  35. 35. Bad SQL • Bad SQL is SQL that uses more resources than appropriate for the application requirement. • This can be a decision support systems (DSS) query that runs for more than 24 hours, or a query from an online application that takes more than a minute. • You should investigate SQL that consumes significant system resources for potential improvement. • ADDM identifies high load SQL. • SQL Tuning Advisor can provide recommendations for improvement. Common bottlenecks - Drill down
  36. 36. Bad SQL partial solution SQL Tuning Advisor is SQL diagnostic software in the Oracle Database Tuning Pack. You can submit one or more SQL statements as input to the advisor and receive advice or recommendations for how to tune the statements, along with a rationale and expected benefit. sql tuning advisor.sql Common bottlenecks - Drill down
  37. 37. Bad SQL partial solution The SQL Access Advisor can automatically analyze the schema design for a given workload and recommend indexes, function-based indexes, partitions, and materialized views to create, retain, or drop as appropriate for the workload. For single statement scenarios, the advisor only recommends adjustments that affect the current statement. For complete business workloads, the advisor makes recommendations after considering the impact on the entire workload. sql access advisor.sql Common bottlenecks - Drill down
  38. 38. Use of nonstandard initialization parameters • These might have been implemented based on poor advice or incorrect assumptions. • Most databases provide acceptable performance using only the set of basic parameters. • In particular, parameters associated with SPIN_COUNT on latches and undocumented optimizer features can cause a great deal of problems that can require considerable investigation. • Likewise, optimizer parameters set in the initialization parameter file can override proven optimal execution plans • For these reasons, schemas, schema statistics, and optimizer settings should be managed as a group to ensure consistency of performance. Common bottlenecks - Drill down
  39. 39. So here’s a little Oracle Fudge for you! 12c has parameters with the word fudge in them. _nested_loop_fudge _parallelism_cost_fudge_factor _px_broadcast_fudge_factor _query_rewrite_fudge These “fudge” parameters have been around with the same default values since at least 8.1.7. Common bottlenecks - Drill down
  40. 40. Use of nonstandard initialization parameters: Know your stuff!!! • Oracle takes this issue seriously; in fact, an OEM policy checks for it • Use of Non-Standard Initialization Parameters • This policy checks for use of non-standard initialization parameters. • I've encountered a few during my work on prod systems; here are three examples: • db_file_multiblock_read_count • optimizer_index_caching • optimizer_index_cost_adj non standard parameters.sql Common bottlenecks - Drill down
  41. 41. Getting database I/O wrong • Many sites lay out their databases poorly over the available disks. • Other sites specify the number of disks incorrectly, because they configure disks by disk space and not I/O bandwidth. Common bottlenecks - Drill down
  42. 42. Getting database I/O wrong : general solutions Stripe Everything Across Every Disk The simplest approach to I/O configuration is to build one giant volume, striped across all available disks. To account for recoverability, the volume is mirrored (RAID 1). The striping unit for each disk should be larger than the maximum I/O size for the frequent I/O operations. This provides adequate performance for most cases. Common bottlenecks - Drill down
  43. 43. Getting database I/O wrong : general solutions Common bottlenecks - Drill down
      Block size: advantages and disadvantages
      Smaller block size
      • Advantages: Good for small rows with lots of random access. Reduces block contention.
      • Disadvantages: Has relatively large space overhead due to metadata (that is, block header). Not recommended for large rows. There might only be a few rows stored for each block, or worse, row chaining if a single row does not fit into a block.
      Larger block size
      • Advantages: Has lower overhead, so there is more room to store data. Permits reading a number of rows into the buffer cache with a single I/O (depending on row size and block size). Good for sequential access or very large rows (such as LOB data).
      • Disadvantages: Wastes space in the buffer cache, if you are doing random access to small rows and have a large block size. For example, with an 8 KB block size and 50 byte row size, you waste 7,950 bytes in the buffer cache when doing random access. Not good for index blocks used in an OLTP environment, because they increase block contention on the index leaf blocks.
  44. 44. Online redo log setup problems • Many sites run with too few online redo log files and files that are too small. • Small redo log files cause system checkpoints to continuously put a high load on the buffer cache and I/O system. • If too few redo log files exist, then the archive cannot keep up, and the database must wait for the archiver to catch up. Common bottlenecks - Drill down All online redo log files should be the same size and configured to switch approximately once an hour during normal activity. They should switch no more frequently than every 20 minutes during peak activity. There should be a minimum of four online log groups to prevent LGWR from waiting for a group to be available following a log switch. A group may be unavailable because a checkpoint has not yet completed or the group has not yet been archived.
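The sizing guidance above is simple arithmetic, sketched below. The function names and the redo generation rates are hypothetical figures chosen for illustration; real rates would have to be measured on the system in question.

```python
import math


def min_redo_log_size_mb(peak_redo_mb_per_hour, min_switch_minutes=20):
    """Smallest log file size (MB) so that, at the peak redo rate, a log
    switch happens no more often than every min_switch_minutes."""
    return math.ceil(peak_redo_mb_per_hour * min_switch_minutes / 60)


def switch_interval_minutes(log_size_mb, redo_mb_per_hour):
    """Minutes between log switches for a given log size and redo rate."""
    return log_size_mb * 60 / redo_mb_per_hour


# Hypothetical workload: 3000 MB/hour of redo at peak, 900 MB/hour normally.
size_mb = min_redo_log_size_mb(peak_redo_mb_per_hour=3000)
normal_interval = switch_interval_minutes(size_mb, redo_mb_per_hour=900)

print(size_mb, "MB per log file")
print(round(normal_interval, 1), "minutes between switches in normal activity")
```

With these assumed rates, a log sized for the 20 minute peak rule switches roughly once an hour under normal load, which lines up with the recommendation on the slide.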
  45. 45. Serialization • Serialization of data blocks in the buffer cache due to lack of free lists, free list groups, transaction slots (INITRANS), or shortage of rollback segments. • This is particularly common on INSERT-heavy applications, in applications that have raised the block size above 8K, or in applications with large numbers of active users and few rollback segments. • Use automatic segment-space management (ASSM) and automatic undo management to solve this problem. Common bottlenecks - Drill down
  46. 46. Serialization : Easy solution Oracle records a 'latch miss' when a process must wait for a latch to become available, and we also see 'buffer busy waits' when a process must wait for a freelist. You can reduce buffer busy waits by adding additional FREELISTS or FREELIST GROUPS. For low-update databases you can also implement bitmap freelists (ASSM, Automatic Segment Space Management) with the create tablespace clause 'segment space management auto'. Common bottlenecks - Drill down
  47. 47. Serialization of another kind…. A look into Transaction Serialization “To describe consistent transaction behavior when transactions run concurrently, database researchers have defined a transaction isolation model called serializability. A serializable transaction operates in an environment that makes it appear as if no other users were modifying data in the database.” Common bottlenecks - Drill down
  48. 48. Long full table scans • Long full table scans for high-volume or interactive online operations could indicate poor transaction design, missing indexes, or poor SQL optimization. • Long table scans, by nature, are I/O intensive and unscalable. Common bottlenecks - Drill down
  49. 49. High amounts of recursive (SYS) SQL • Large amounts of recursive SQL executed by SYS could indicate space management activities, such as extent allocations taking place. • This is unscalable and impacts user response time. • Use locally managed tablespaces to reduce recursive SQL due to extent allocation. • Recursive SQL executed under another user ID is probably SQL and PL/SQL, and this is not a problem. Common bottlenecks - Drill down
  50. 50. High amounts of recursive (SYS) SQL: check yourself You might wonder why I wrote “Use locally managed tablespaces ...". It’s not that dictionary-managed tablespaces are not supported in a database where SYSTEM is locally managed, it’s that they simply can’t be created. If they can’t be created, why would we need to support them? The answer lies in the transportable tablespace feature. You can transport a dictionary-managed tablespace into a database with a SYSTEM tablespace that is locally managed. You can plug that tablespace in and have a dictionary-managed tablespace in your database, but you can’t create one from scratch in that database. Common bottlenecks - Drill down
  51. 51. Deployment and migration errors • In many cases, an application uses too many resources because the schema owning the tables has not been successfully migrated from the development environment or from an older implementation. • Examples of this are missing indexes or incorrect statistics. • These errors can lead to sub-optimal execution plans and poor interactive user performance. Common bottlenecks - Drill down
  52. 52. Deployment and migration errors : good practice • When migrating applications of known performance, use the DBMS_STATS package to export the schema statistics and so maintain plan stability. • Although these errors are not directly detected by ADDM, ADDM highlights the resulting high load SQL. Common bottlenecks - Drill down
  53. 53. Deployment and migration errors : good practice First identify which databases you want to migrate and applications that access that database. You also evaluate the business requirements and define testing criteria. To determine the requirements of the migration project: • Define the scope of the project. • There are several choices you must make about the applications that access your database in order to define the scope of the migration project. • To obtain a list of migration issues and dependencies, you should consider the following Common bottlenecks - Drill down
  54. 54. Deployment and migration errors : good practice • What are you migrating? • What is the version of the database? • What is the character set of the database? • What source applications are affected by migrating (the third-party database) to an Oracle database? • What is the (third-party) application language? • What version of the application language are you using? In the scope of the project, you should have identified the applications you must migrate. Ensure that you have included all the necessary applications that are affected by migrating the database Common bottlenecks - Drill down
  55. 55. Deployment and migration errors : good practice • What types of connectivity issues are involved in migrating to an Oracle database? • Do you use connectivity software to connect the applications to the (third-party) database? Do you need to modify the connectivity software to connect the applications to the Oracle database? • What version of the connectivity software do you use? Can you use this same version to connect to the Oracle database? • Are you planning to rewrite the applications or modify the applications to work with an Oracle database? Common bottlenecks - Drill down
  56. 56. The Oracle Optimizer:
  57. 57. • Rule Based Optimization (overview) • Cost Based Optimization • The Different Modes of the Cost Based Optimizer • Execution Plans • Data Access Methods • Indexes – Types, Classifications, Advantages & Disadvantages • Sort Usage Guidelines The Oracle Optimizer:
  58. 58. • The optimizer determines the most efficient way to execute a SQL statement after considering many factors related to the objects referenced and the conditions specified in the query. • This determination is an important step in the processing of any SQL statement and can greatly affect execution time. The Oracle Optimizer:
  59. 59. The Oracle Optimizer:SQL Statement Parsing, Overview Syntactic and semantic check Privileges check Allocate private SQL Area Existing shared SQL area? Allocate shared SQL area Execute statement No (Hard parse) Yes (Soft parse) Parse call Parse operation (Optimization) Private SQL area Shared SQL area Parsed representation
  60. 60. The Oracle Optimizer: Why Do You Need an Optimizer? • Query to optimize: SELECT * FROM emp WHERE job = 'MANAGER'; • How can I retrieve these rows? Possible access paths: use the index, or read each row and check. • Which one is faster? Schema information and statistics: only 1% of employees are managers. • I have a plan! Use the index.
  61. 61. The Oracle Optimizer: Why Do You Need an Optimizer? • Query to optimize: SELECT * FROM emp WHERE job = 'MANAGER'; • How can I retrieve these rows? Possible access paths: use the index, or read each row and check. • Which one is faster? Schema information and statistics: 80% of employees are managers. • I have a plan! Use a full table scan.
  62. 62. • Using the RBO, the optimizer chooses an execution plan based on the access paths available and the ranks of these access paths. Oracle's ranking of the access paths is heuristic. If there is more than one way to execute a SQL statement, then the RBO always uses the operation with the lower rank. Usually, operations of lower rank execute faster than those associated with constructs of higher rank. The list shows access paths and their ranking: • RBO Path 1: Single Row by Rowid • RBO Path 2: Single Row by Cluster Join • RBO Path 3: Single Row by Hash Cluster Key with Unique or Primary Key • RBO Path 4: Single Row by Unique or Primary Key • RBO Path 5: Clustered Join • RBO Path 6: Hash Cluster Key • RBO Path 7: Indexed Cluster Key • RBO Path 8: Composite Index • RBO Path 9: Single-Column Indexes • RBO Path 10: Bounded Range Search on Indexed Columns • RBO Path 11: Unbounded Range Search on Indexed Columns • RBO Path 12: Sort Merge Join • RBO Path 13: MAX or MIN of Indexed Column • RBO Path 14: ORDER BY on Indexed Column • RBO Path 15: Full Table Scan rule based optimizer.sql The Oracle Optimizer: Rule Based Optimization (overview)
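The ranking rule above can be sketched in a few lines. This is an illustration only: the ranks are copied from the list on the slide, while the function `rbo_choose` and the way available paths are passed in are assumptions made for the example.

```python
# Access path ranks as listed on the slide (lower rank wins).
RBO_RANK = {
    "Single Row by Rowid": 1,
    "Single Row by Cluster Join": 2,
    "Single Row by Hash Cluster Key with Unique or Primary Key": 3,
    "Single Row by Unique or Primary Key": 4,
    "Clustered Join": 5,
    "Hash Cluster Key": 6,
    "Indexed Cluster Key": 7,
    "Composite Index": 8,
    "Single-Column Indexes": 9,
    "Bounded Range Search on Indexed Columns": 10,
    "Unbounded Range Search on Indexed Columns": 11,
    "Sort Merge Join": 12,
    "MAX or MIN of Indexed Column": 13,
    "ORDER BY on Indexed Column": 14,
    "Full Table Scan": 15,
}


def rbo_choose(available_paths):
    """Pick the available access path with the lowest rank.

    No statistics are consulted: the choice depends only on which
    paths exist, which is the defining trait of rule based optimization.
    """
    return min(available_paths, key=RBO_RANK.__getitem__)


chosen = rbo_choose(["Full Table Scan", "Single-Column Indexes"])
print(chosen)
```

Because the data distribution never enters the decision, this rule picks the index for the job = 'MANAGER' query even in the case where 80% of employees are managers, which is the weakness the cost based optimizer addresses.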
  63. 63. The CBO performs the following steps: • The optimizer generates a set of potential plans for the SQL statement based on available access paths and hints. • The optimizer estimates the cost of each plan based on statistics in the data dictionary for the data distribution and storage characteristics of the tables, indexes, and partitions accessed by the statement. • The cost is an estimated value proportional to the expected resource use needed to execute the statement with a particular plan. The optimizer calculates the cost of access paths and join orders based on the estimated computer resources, which includes I/O, CPU, and memory. • Serial plans with higher costs take more time to execute than those with smaller costs. When using a parallel plan, however, resource use is not directly related to elapsed time. • The optimizer compares the costs of the plans and chooses the one with the lowest cost. cost based optimizer thru the vers.sql The Oracle Optimizer: Cost Based Optimization
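The generate, cost, and compare loop described above can be sketched with a deliberately simplified cost model. Everything numeric here is an assumption: the two cost formulas are a common textbook approximation, not Oracle's actual costing, and the statistics in `EMP_STATS` are invented for the example.

```python
def full_scan_cost(table_blocks, multiblock_read_count):
    # Toy model: one multiblock read fetches several blocks at once.
    return table_blocks / multiblock_read_count


def index_range_scan_cost(blevel, leaf_blocks, clustering_factor, selectivity):
    # Toy model: walk the index, read the matching leaf blocks,
    # then visit the table blocks those entries point to.
    return blevel + selectivity * leaf_blocks + selectivity * clustering_factor


def choose_plan(selectivity, stats):
    """Cost every candidate plan and return the cheapest with all costs."""
    candidates = {
        "FULL TABLE SCAN": full_scan_cost(
            stats["table_blocks"], stats["multiblock_read_count"]
        ),
        "INDEX RANGE SCAN": index_range_scan_cost(
            stats["blevel"],
            stats["leaf_blocks"],
            stats["clustering_factor"],
            selectivity,
        ),
    }
    best = min(candidates, key=candidates.get)
    return best, candidates


# Hypothetical statistics for an EMP table and its index on JOB.
EMP_STATS = {
    "table_blocks": 1000,
    "multiblock_read_count": 8,
    "blevel": 2,
    "leaf_blocks": 100,
    "clustering_factor": 5000,
}

plan_1pct, costs_1pct = choose_plan(0.01, EMP_STATS)
plan_80pct, costs_80pct = choose_plan(0.80, EMP_STATS)
print(plan_1pct, costs_1pct)
print(plan_80pct, costs_80pct)
```

Under these assumed numbers the index wins when 1% of rows match and the full table scan wins when 80% match, mirroring the two "Why Do You Need an Optimizer?" slides.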
  64. 64. The following features require use of the CBO: • Partitioned tables and indexes • Index-organized tables • Reverse key indexes • Function-based indexes • SAMPLE clauses in a SELECT statement • Parallel query and parallel DML • Star transformations and star joins • Extensible optimizer • Query rewrite with materialized views • Enterprise Manager progress meter • Hash joins • Bitmap indexes and bitmap join indexes • Index skip scans The Oracle Optimizer: Cost Based Optimization
  65. 65. • Piece of code: • Estimator • Plan generator • Estimator determines cost of optimization suggestions made by the plan generator: • Cost: Optimizer’s best estimate of the number of standardized I/Os made to execute a particular statement optimization • Plan generator: • Tries out different statement optimization techniques • Uses the estimator to cost each optimization suggestion • Chooses the best optimization suggestion based on cost • Generates an execution plan for best optimization The Oracle Optimizer: Cost Based Optimization
  66. 66. • Selectivity is the estimated proportion of a row set retrieved by a particular predicate or combination of predicates. • It is expressed as a value between 0.0 and 1.0: • High selectivity: Small proportion of rows • Low selectivity: Big proportion of rows • Selectivity computation: • If no statistics: Use dynamic sampling • If no histograms: Assume even distribution of rows • Statistic information: • DBA_TABLES and DBA_TAB_STATISTICS (NUM_ROWS) • DBA_TAB_COL_STATISTICS (NUM_DISTINCT, DENSITY, HIGH/LOW_VALUE,…) The Oracle Optimizer: Estimator: Selectivity Selectivity = (Number of rows satisfying a condition) / (Total number of rows)
  67. 67. • Expected number of rows retrieved by a particular operation in the execution plan • Vital figure to determine join, filter, and sort costs • Simple example: • The number of distinct values in DEV_NAME is 203. • The number of rows in COURSES (original cardinality) is 1018. • Selectivity = 1/203 = 4.926e-03 • Cardinality = (1/203)*1018 = 5.01 (rounded up to 6) cardinality and selectivity in joins.sql Simple cardinality.sql The Oracle Optimizer: Estimator: Cardinality SELECT days FROM courses WHERE dev_name = 'ANGEL'; Cardinality = Selectivity * Total number of rows
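The arithmetic in the example above can be reproduced directly. This sketch assumes an equality predicate with no histogram, so values are taken to be evenly distributed, and it assumes the estimate is rounded up to a whole row, which matches the figure of 6 on the slide. The function names are made up for the illustration.

```python
import math


def equality_selectivity(num_distinct):
    """Selectivity of column = value when rows are evenly distributed."""
    return 1.0 / num_distinct


def estimated_cardinality(selectivity, num_rows):
    """Cardinality = selectivity * total rows, rounded up to a whole row."""
    return math.ceil(selectivity * num_rows)


selectivity = equality_selectivity(203)             # distinct values in DEV_NAME
cardinality = estimated_cardinality(selectivity, 1018)  # rows in COURSES

print(f"selectivity = {selectivity:.3e}")
print(f"cardinality = {cardinality}")
```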
  68. 68. The Oracle Optimizer: Estimator: Cost • Cost is the optimizer's best estimate of the number of standardized I/Os it takes to execute a particular statement. • Cost unit is a standardized single block random read: • 1 cost unit = 1 SRds • The cost formula combines three different cost units into standard cost units: Cost = (#SRds * sreadtim + #MRds * mreadtim + #CPUCycles / cpuspeed) / sreadtim • The three terms of the numerator are the single block I/O cost, the multiblock I/O cost, and the CPU cost. • #SRds: Number of single block reads • #MRds: Number of multiblock reads • #CPUCycles: Number of CPU cycles • sreadtim: Single block read time • mreadtim: Multiblock read time • cpuspeed: Millions of instructions per second
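The cost formula can be evaluated with a small helper. This is a sketch of the formula as written on the slide; the sample values are invented, and for the units to cancel the example assumes read times in milliseconds and a CPU speed expressed in cycles per millisecond.

```python
def optimizer_cost(srds, mrds, cpu_cycles, sreadtim, mreadtim, cpuspeed):
    """Cost in units of single block reads.

    Assumption for this sketch: sreadtim and mreadtim are in milliseconds
    and cpuspeed is in CPU cycles per millisecond, so every term of the
    numerator is a time in milliseconds before dividing by sreadtim.
    """
    single_block_io = srds * sreadtim
    multiblock_io = mrds * mreadtim
    cpu = cpu_cycles / cpuspeed
    return (single_block_io + multiblock_io + cpu) / sreadtim


# Hypothetical workload and system statistics.
cost = optimizer_cost(
    srds=100,             # single block reads
    mrds=50,              # multiblock reads
    cpu_cycles=2_000_000,
    sreadtim=10,          # ms per single block read
    mreadtim=20,          # ms per multiblock read
    cpuspeed=1_000_000,   # cycles per ms (assumed unit)
)
print(cost)
```

Dividing by sreadtim is what turns the total estimated time back into "equivalent single block reads", so a statement that only performed single block reads would have a cost equal to #SRds.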
  69. 69. The Oracle Optimizer: The Different Modes of the Cost Based Optimizer Value: Description. CHOOSE: The optimizer chooses between a cost-based approach and a rule-based approach, depending on whether statistics are available. This is the default value. • If the data dictionary contains statistics for at least one of the accessed tables, then the optimizer uses a cost-based approach and optimizes with a goal of best throughput. • If the data dictionary contains only some statistics, then the cost-based approach is still used, but the optimizer must guess the statistics for the subjects without any statistics. This can result in suboptimal execution plans. • If the data dictionary contains no statistics for any of the accessed tables, then the optimizer uses a rule-based approach. ALL_ROWS: The optimizer uses a cost-based approach for all SQL statements in the session regardless of the presence of statistics and optimizes with a goal of best throughput (minimum resource use to complete the entire statement). FIRST_ROWS_n: The optimizer uses a cost-based approach, regardless of the presence of statistics, and optimizes with a goal of best response time to return the first n number of rows; n can equal 1, 10, 100, or 1000. FIRST_ROWS: The optimizer uses a mix of cost and heuristics to find a best plan for fast delivery of the first few rows. Note: Using heuristics sometimes leads the CBO to generate a plan with a cost that is significantly larger than the cost of a plan without applying the heuristic. FIRST_ROWS is available for backward compatibility and plan stability. RULE: The optimizer chooses a rule-based approach for all SQL statements regardless of the presence of statistics. first rows VS all rows.sql
  70. 70. The Oracle Optimizer: What Is an Execution Plan? • The execution plan of a SQL statement is composed of small building blocks called row sources for serial execution plans. • The combination of row sources for a statement is called the execution plan. • By using parent-child relationships, the execution plan can be displayed in a tree-like structure (text or graphical).
  71. 71. The Oracle Optimizer: Where to Find Execution Plans? • PLAN_TABLE (EXPLAIN PLAN or SQL*Plus autotrace) • V$SQL_PLAN (Library Cache) • V$SQL_PLAN_MONITOR (11g) • DBA_HIST_SQL_PLAN (AWR) • STATS$SQL_PLAN (Statspack) • SQL Management Base (SQL Plan Management Baselines) • SQL tuning set • Trace files generated by DBMS_MONITOR • Event 10053 trace file • Process state dump trace file since 10gR2
  72. 72. The Oracle Optimizer: How To Read?
SQL> explain plan for
  2  select e.empno, e.ename, d.dname
  3  from emp e, dept d
  4  where e.deptno = d.deptno
  5  and e.deptno = 10;
Explained.
SQL> SELECT * FROM table(dbms_xplan.display(null,null,'basic'));
PLAN_TABLE_OUTPUT
--------------------------------------------------
Plan hash value: 568005898
--------------------------------------------------
| Id | Operation                       | Name    |
--------------------------------------------------
|  0 | SELECT STATEMENT                |         |
|  1 |  NESTED LOOPS                   |         |
|  2 |   TABLE ACCESS BY INDEX ROWID   | DEPT    |
|  3 |    INDEX UNIQUE SCAN            | PK_DEPT |
|  4 |   TABLE ACCESS FULL             | EMP     |
--------------------------------------------------
1. Operation 0 is the root of the tree; it has one child, Operation 1. 2. Operation 1 has two children, Operations 2 and 4. 3. Operation 2 has one child, Operation 3.
  73. 73. The Oracle Optimizer: How To Read?
Operation 0 (SELECT STATEMENT)
        |
Operation 1 (NESTED LOOPS)
      /              \
Operation 2          Operation 4
(TABLE ACCESS        (TABLE ACCESS FULL)
 BY INDEX ROWID)
      |
Operation 3
(INDEX UNIQUE SCAN)
This is the graphical representation of the execution plan. Reading the tree: to perform Operation 1, you need to perform Operations 2 and 4. Operation 2 comes first; to perform it, you need to perform its child, Operation 3. Operation 4 depends on Operation 2: in a nested loops join, it runs for each row that Operation 2 returns.
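The reading rule for this plan can be sketched as a post-order walk over the parent-child relationships. This is only a reading aid under a simplifying assumption: real execution interleaves the operations (the nested loops join probes its second child repeatedly), so the output is the order in which operations start producing rows, not a strict sequence.

```python
# Plan from the slide: operation id -> (operation, parent id)
plan = {
    0: ("SELECT STATEMENT", None),
    1: ("NESTED LOOPS", 0),
    2: ("TABLE ACCESS BY INDEX ROWID DEPT", 1),
    3: ("INDEX UNIQUE SCAN PK_DEPT", 2),
    4: ("TABLE ACCESS FULL EMP", 1),
}


def children(op_id):
    """Child operations of op_id, in the order they appear in the plan."""
    return sorted(i for i, (_, parent) in plan.items() if parent == op_id)


def start_order(op_id=0):
    """Post-order walk: every child is visited before its parent."""
    order = []
    for child in children(op_id):
        order.extend(start_order(child))
    order.append(op_id)
    return order


print(start_order())  # [3, 2, 4, 1, 0]
```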
  74. 74. Oracle supports the following access methods: • Full Table Scan (FTS) • Table Access by ROWID • Index Unique Scan • Index Range Scan • Index Skip Scan • Full Index Scan • Fast Full Index Scan • Index Joins • Hash Access • Cluster Access • Bitmap Index optimizer access paths.sql The Oracle Optimizer: Data Access Methods
  75. 75. Guidelines for Managing Indexes • Create indexes after inserting table data • Index the correct tables and columns • Order index columns for performance • Limit the number of indexes for each table • Drop indexes that are no longer needed • Understand deferred segment creation • Estimate index size and set storage parameters • Specify the tablespace for each index • Consider parallelizing index creation • Consider creating indexes with NOLOGGING • Understand when to use unusable or invisible indexes • Consider costs and benefits of coalescing or rebuilding indexes • Consider cost before disabling or dropping constraints The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages
  76. 76. The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages Index Type: Usage • B-tree: Default, balanced tree index, good for high-cardinality (high degree of distinct values) columns • B-tree cluster: Used with clustered tables • Hash cluster: Used with hash clusters • Function-based: Good for columns that have SQL functions applied to them • Indexed virtual column: Good for columns that have SQL functions applied to them; viable alternative to using a function-based index • Reverse-key: Useful to balance I/O in an index that has many sequential inserts • Key-compressed: Useful for concatenated indexes where the leading column is often repeated; compresses leaf block entries • Bitmap: Useful in data warehouse environments with low-cardinality columns; these indexes aren’t appropriate for online transaction processing (OLTP) databases where rows are heavily updated • Bitmap join: Useful in data warehouse environments for queries that join fact and dimension tables • Global partitioned: Global index across all partitions in a partitioned table • Local partitioned: Local index based on individual partitions in a partitioned table • Domain: Specific for an application or cartridge
  77. 77. Physical layout of a table and B-tree index The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages
  78. 78. The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages When you put indexes on a partitioned table, you have the choice between GLOBAL and LOCAL. The LOCAL index partitions follow the table partitions: they have the same partition key and type, are created automatically when new table partitions are added, and are dropped automatically when table partitions are dropped. Beware: LOCAL indexes are usually not appropriate for OLTP access on the table, because one server process may then have to scan through many index partitions. This is the cause of most of the scary performance horror stories you may have heard about partitioning! A GLOBAL index spans all partitions. It usually has good SELECT performance, but it is more sensitive to partition maintenance than LOCAL indexes: the GLOBAL index needs to be rebuilt more often.
  79. 79. The Oracle Optimizer: Optimizer Statistics • Describe the database and the objects in the database • Information used by the query optimizer to estimate: • Selectivity of predicates • Cost of each execution plan • Access method, join order, and join method • CPU and input/output (I/O) costs • Refreshing optimizer statistics whenever they are stale is as important as gathering them: • Automatically gathered by the system • Manually gathered by the user with DBMS_STATS
  80. 80. The Oracle Optimizer: Optimizer Statistics A common misperception is that if no new statistics are gathered (and assuming nothing else is altered in the database), execution plans must always remain the same; that by not collecting statistics, one can somehow guarantee that the database will perform in the same manner and generate the same execution plans. This is fundamentally not true. In fact, quite the opposite can be true. One might need to collect fresh statistics to make sure vital execution plans don’t change. It’s the act of not refreshing statistics that can cause execution plans to suddenly change. explain plan changes with no stat change.sql
  81. 81. The Oracle Optimizer: Types of Optimizer Statistics • Table statistics: • Number of rows • Number of blocks • Average row length • Index Statistics: • B*-tree level • Distinct keys • Number of leaf blocks • Clustering factor • System statistics • I/O performance and utilization • CPU performance and utilization • Column statistics • Basic: Number of distinct values, number of nulls, average length, min, max • Histograms (data distribution when the column data is skewed) • Extended statistics
  82. 82. The Oracle Optimizer: Histograms • The optimizer assumes uniform distributions; this may lead to suboptimal access plans in the case of data skew. • Histograms: • Store additional column distribution information • Give better selectivity estimates in the case of nonuniform distributions • With unlimited resources you could store each different value and the number of rows for that value. • This becomes unmanageable for a large number of distinct values and a different approach is used: • Frequency histogram (#distinct values ≤ #buckets) • Height-balanced histogram (#buckets < #distinct values) • They are stored in DBA_TAB_HISTOGRAMS.
  83. 83. The Oracle Optimizer: Frequency Histograms 10 buckets, 10 distinct values. Distinct values: 1, 3, 5, 7, 10, 16, 27, 32, 39, 49. Number of rows: 40001. [Figure: frequency histogram. Horizontal axis: ENDPOINT VALUE (column value). Vertical axis: ENDPOINT NUMBER (cumulative cardinality, 0 to 40000); the increase at each endpoint is the number of rows for that column value.]
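A small Python sketch of the idea behind a frequency histogram: one bucket per distinct value, with the endpoint number holding the cumulative row count. The column data is hypothetical and much smaller than the slide's 40001 rows, and the helper names are made up for illustration.

```python
from collections import Counter
from itertools import accumulate


def frequency_histogram(values):
    """Return (endpoint_value, endpoint_number) pairs, one per distinct value."""
    counts = Counter(values)
    keys = sorted(counts)
    cumulative = accumulate(counts[k] for k in keys)
    return list(zip(keys, cumulative))


def rows_for_value(histogram, value):
    """Rows for a value = its endpoint number minus the previous one."""
    previous = 0
    for endpoint_value, endpoint_number in histogram:
        if endpoint_value == value:
            return endpoint_number - previous
        previous = endpoint_number
    return 0


column = [1] * 2 + [3] * 1 + [5] * 4 + [10] * 13  # hypothetical, skewed
hist = frequency_histogram(column)
print(hist)                      # [(1, 2), (3, 3), (5, 7), (10, 20)]
print(rows_for_value(hist, 10))  # 13
```

Because the exact row count per value can be recovered, the selectivity of an equality predicate on such a column no longer depends on the even-distribution assumption.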
  84. 84. The Oracle Optimizer: Height-Balanced Histograms 5 buckets, 10 distinct values (8000 rows per bucket). Distinct values: 1, 3, 5, 7, 10, 16, 27, 32, 39, 49. Number of rows: 40001. [Figure: height-balanced histogram with the same number of rows per bucket. ENDPOINT NUMBER (bucket number): 0 to 5. ENDPOINT VALUE: 1, 7, 10, 10, 32, 49. The repeated endpoint value 10 is marked as the popular value.]
  85. 85. The Oracle Optimizer: Height-Balanced Histograms In a height-balanced histogram, the ordered column values are divided into bands so that each band contains approximately the same number of rows. The histogram tells you values of the endpoints of each band. In the example in the slide, assume that you have a column that is populated with 40,001 numbers. There will be 8,000 values in each band. You only have ten distinct values: 1, 3, 5, 7, 10, 16, 27, 32, 39, and 49. Value 10 is the most popular value with 16,293 occurrences. When the number of buckets is less than the number of distinct values, ENDPOINT_NUMBER records the bucket number and ENDPOINT_VALUE records the column value that corresponds to this endpoint. HISTOGRAMS .sql
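The banding described above can be sketched in Python. This is a simplified construction, not Oracle's actual algorithm: it sorts the values, cuts them into equal-sized buckets, and records the value at each bucket boundary. The 20-row data set is hypothetical, chosen so that the endpoints match the ones in the slide's example.

```python
from collections import Counter


def height_balanced(values, buckets):
    """Return endpoint values for endpoint numbers 0..buckets."""
    ordered = sorted(values)
    n = len(ordered)
    endpoints = [ordered[0]]  # endpoint 0 holds the minimum value
    for b in range(1, buckets + 1):
        last_row = (b * n) // buckets - 1  # last row of bucket b
        endpoints.append(ordered[last_row])
    return endpoints


def popular_values(endpoints):
    """A value that ends more than one bucket is 'popular'."""
    counts = Counter(endpoints)
    return sorted(v for v, c in counts.items() if c > 1)


# Hypothetical data: 20 rows, 10 distinct values, value 10 heavily repeated.
column = [1, 3, 5, 7] + [10] * 8 + [16, 27, 32, 32, 39, 39, 49, 49]
endpoints = height_balanced(column, buckets=5)
print(endpoints)                  # [1, 7, 10, 10, 32, 49]
print(popular_values(endpoints))  # [10]
```

A value that spans several buckets shows up as a repeated endpoint, which is how skew remains visible even though individual row counts are no longer stored.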
  86. 86. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
  87. 87. Buffer cache For many types of operations, Oracle Database uses the buffer cache to store data blocks read from disk. Oracle Database bypasses the buffer cache for particular operations, such as sorting and parallel reads. To use the database buffer cache effectively, tune SQL statements for the application to avoid unnecessary resource consumption. To meet this goal, verify that frequently executed SQL statements and SQL statements that perform many buffer gets are well-tuned. When configuring a new database instance, it is impossible to know the correct size for the buffer cache. Typically, a database administrator makes a first estimate for the cache size, then runs a representative workload on the instance and examines the relevant statistics to see whether the cache is under-configured or over-configured. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
  88. 88. What is a Physical I/O? Whenever you execute a query, Oracle has to fetch data to produce the result. Here, data means the actual data in data blocks. When a data block that is not yet in memory is requested, it has to be fetched from the physical datafiles residing on the physical disks. This fetching of data blocks from disk involves an I/O operation known as physical I/O. As a result of the physical I/O, the block is read into the memory area called the buffer cache. This is the default action. A data block might be requested multiple times by multiple queries. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
  89. 89. What is a Logical I/O? Once a physical I/O has taken place and the block has been read into memory, the next request for the same data block won't require the block to be fetched from disk, which avoids a physical I/O. To return the results for a query requesting the same data block, the block is fetched from memory; this is called a Logical I/O. When the amount of Logical I/O is calculated, two kinds of reads are considered: consistent reads and current reads. Together, these two statistics make up the Logical I/O performed by Oracle. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
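The relationship between the two kinds of I/O can be illustrated with a toy cache in Python. The class and its plain LRU eviction are assumptions made for the sketch (Oracle's buffer cache management is more elaborate); every block request is counted as a logical read, and a request that misses the cache is additionally counted as a physical read.

```python
from collections import OrderedDict


class BufferCache:
    """Toy block cache with LRU eviction (an assumption, not Oracle's policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.logical_reads = 0
        self.physical_reads = 0

    def get(self, block_id):
        self.logical_reads += 1
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)    # cache hit: memory only
        else:
            self.physical_reads += 1             # cache miss: read from disk
            self.blocks[block_id] = True
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used


cache = BufferCache(capacity=2)
for block in [1, 2, 1, 1, 3, 2]:
    cache.get(block)

print(cache.logical_reads, cache.physical_reads)  # 6 4
```

Repeated requests for block 1 cost no physical I/O, while block 2 has to be read from disk a second time because it was evicted in between.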
  90. 90. Consistent reads It is a well known fact that whenever a change is induced in a data block, the old data/entry is written to the UNDO/ROLLBACK segments. From the fundamentals of UNDO, we also know that this is to provide a read consistent view of the data block to other users trying to read the same data block. Consistent reads mean reading the block in a consistent mode “point in time”. Here the phrase “point in time” means the time when the query/statement began. A consistent read might or might not involve any UNDO data. UNDO data will be applied when it is necessary to roll back a data block to the required “point in time” when the SQL statement was fired. If on reading the buffer cache, it is found that the data block is already in the required state, no UNDO data is required because the block is already consistent. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
  91. 91. Consistent reads and array size Consistent reads can also depend on and vary with the array size setting of SQL*Plus. The default value is 15. Array size is the number of rows fetched in a single fetch call, so it determines the number of network round trips made to fetch the required data from Oracle. A careful adjustment of the array size can improve performance by reducing the network round trips. A higher array size can help query performance (by reducing both the network round trips and the consistent reads), but too high a value also uses more memory. Array size is not restricted to SQL*Plus; it can be set in many other applications requesting data from Oracle Database. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
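A back-of-the-envelope Python sketch of the round-trip effect. The function name and the 10,000-row result set are hypothetical, and the model ignores any extra calls a real client may issue; it only shows how the number of fetch calls shrinks as the array size grows.

```python
import math


def fetch_round_trips(total_rows, array_size):
    """Fetch calls needed to retrieve total_rows, array_size rows per call."""
    return math.ceil(total_rows / array_size)


for size in (15, 100, 500):
    print(size, fetch_round_trips(10_000, size))
# 15 667
# 100 100
# 500 20
```

The gain flattens out quickly: going from 15 to 100 removes 567 round trips, while going from 100 to 500 removes only 80 more, at the price of larger fetch buffers.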
  92. 92. How do you identify the source of the problem?
  93. 93. Solving database performance issues sometimes requires the use of operating system (OS) utilities. These tools often provide information that can help isolate database performance problems. Consider the following situations: • You’re running multiple databases and multiple applications on one server and want to use OS utilities to identify which database (and corresponding process) is consuming the most operating system resources. This approach is invaluable when one database application is consuming resources to the point of causing other databases on the box to perform poorly. • You need to verify if the database server is adequately sized for current application workload in terms of CPU, memory, disk I/O, and network bandwidth. An analysis is needed to determine at what point the server will not be able to handle larger (future) workloads. • You’ve used database tools to identify system bottlenecks and want to double-check the analysis via operating system tools. How do you identify the source of the problem?
  94. 94. In these scenarios, to effectively analyze, tune, and troubleshoot, you’ll need to employ OS tools to identify resource-intensive processes. Furthermore, if you have multiple databases and applications running on one server, when troubleshooting performance issues, it’s often more efficient to first determine which database and process is consuming the most resources. Operating system utilities help pinpoint whether the bottleneck is CPU, memory, disk I/O, or a network issue. In Linux/Unix environments, once you have the operating system identifier, you can then query the database to show any corresponding database processes and SQL statements. How do you identify the source of the problem?
  95. 95. Solutions: where do you start and what order to work?
  96. 96. Solutions: where do you start and what order to work?
  97. 97. Mapping a Resource-Intensive Process to a Database Process Problem It’s a dark and stormy night, and the system is performing poorly. You identify an operating system–intensive process on the box. You want to map an operating system process back to a database process. If the database process is a SQL process, you want to display the user of the SQL statement and also the SQL. Solution In Linux/Unix environments, if you can identify the resource-intensive operating system process, then you can easily check to see if that process is associated with a database process. The process consists of the following: 1. Run an OS command to identify resource-intensive processes and associated IDs. 2. Identify the database associated with the process. 3. Extract details about the process from the database data dictionary views. 4. If it’s a SQL statement, get those details. 5. Generate an execution plan for the SQL statement. Solutions: where do you start and what order to work?
  98. 98. Introduction to SQL and Application Tuning
  99. 99. Proactive Tuning Methodology • Simple design • Data modeling • Tables and indexes • Using views • Writing efficient SQL • Cursor sharing • Using bind variables Introduction to SQL and Application Tuning
  100. 100. Simplicity in Application Design • Simple tables • Well-written SQL • Indexing only as required • Retrieving only required information Introduction to SQL and Application Tuning
  101. 101. Data Modeling • Accurately represent business practices • Focus on the most frequent and important business transactions • Use modeling tools • Appropriately normalize data (OLTP versus DW) Introduction to SQL and Application Tuning
  102. 102. Table Design • Compromise between flexibility and performance: • Principally normalize • Selectively denormalize • Use Oracle performance and management features: • Default values • Constraints • Materialized views • Clusters • Partitioning • Focus on business-critical tables Introduction to SQL and Application Tuning
  103. 103. Index Design • Create indexes on the following: • Primary key (automatically created) • Unique key (automatically created) • Foreign keys (good candidates) • Index data that is frequently queried (select list). • Use SQL as a guide to index design. Introduction to SQL and Application Tuning
  104. 104. Using Views • Simplifies application design • Is transparent to the developer • Can cause suboptimal execution plans Introduction to SQL and Application Tuning
  105. 105. SQL Execution Efficiency • Good database connectivity • Minimizing parsing • Sharing cursors • Using bind variables Introduction to SQL and Application Tuning
  106. 106. Writing SQL to Share Cursors • Create generic code using the following: • Stored procedures and packages • Database triggers • Any other library routines and procedures • Write to format standards (improves readability): • Case • White space • Comments • Object references • Bind variables Introduction to SQL and Application Tuning
  107. 107. Performance Checklist • Set initialization parameters and storage options. • Verify resource usage of SQL statements. • Validate connections by middleware. • Verify cursor sharing. • Validate migration of all required objects. • Verify validity and availability of optimizer statistics. Introduction to SQL and Application Tuning
  108. 108. When and What to Tune?
  109. 109. • Clustering factor • Integrity Constraints are Important • Reasons for Inefficient SQL Performance • Using Bind Variables • Restructuring SQL Statements • Shared SQL and Cursors When and What to Tune?
  110. 110. When and What to Tune? The Clustering Factor The clustering factor is a number that represents the degree to which data is randomly distributed in a table. In simple terms, it is the number of “block switches” while reading a table using an index.
  111. 111. When and What to Tune? The above diagram shows how scattered the rows of the table are. The first index entry (from the left of the index) points to the first data block and the second index entry points to the second data block. So during an index range scan or full index scan, Oracle has to switch between blocks and revisit the same block more than once because the rows are scattered. The number of times Oracle makes these switches is what is termed the “clustering factor”.
  112. 112. When and What to Tune? The above image represents a “good CF”. During an index range scan, Oracle does not have to jump to the next data block as often, because most of the index entries point to the same data block. This significantly reduces the cost of your SELECT statements. The clustering factor is stored in the data dictionary and can be viewed in dba_indexes (or user_indexes). Clustering factor.sql index fragmentation impact on performance.sql
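The "block switches" idea can be sketched in Python by walking the index entries in key order and counting how often the table block changes. The block numbers are hypothetical, and the function is an illustration of the counting rule described here rather than Oracle's internal code.

```python
def clustering_factor(block_ids_in_index_order):
    """Count block switches while following index entries in key order."""
    switches = 0
    previous = None
    for block in block_ids_in_index_order:
        if block != previous:
            switches += 1
            previous = block
    return switches


# Table block that each index entry points to, listed in index key order.
well_clustered = [1, 1, 1, 2, 2, 2, 3, 3, 3]  # rows stored in key order
scattered = [1, 2, 3, 1, 2, 3, 1, 2, 3]       # rows spread across blocks

print(clustering_factor(well_clustered))  # 3
print(clustering_factor(scattered))       # 9
```

A value close to the number of table blocks (3 here) indicates well-ordered data; a value close to the number of rows (9 here) indicates scattered data and a more expensive index range scan.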
  113. 113. Integrity Constraints are Important Many people think of constraints as a data integrity thing, and it’s true: they are. But constraints are used by the optimizer as well when determining the optimal execution plan. The optimizer takes as inputs • The query to optimize • All available database object statistics • System statistics, if available (CPU speed, single-block I/O speed, and so on; metrics about the physical hardware) • Initialization parameters • Constraints null columns differ from not nul.sql fk adds to query performance When and What to Tune?
  114. 114. • Reasons for inefficient SQL performance • Stale or missing optimizer statistics • Missing access structures • Suboptimal execution plan selection • Poorly constructed SQL When and What to Tune?
  115. 115. When and What to Tune? Richard Morris: Are there issues that crop up again and again? Tom Kyte: Perhaps the biggest issue is the black box approach of development. A developer will learn everything they can about the procedural language they're using. However, they don't learn about the database that they're using or other packages that might be involved…… Richard Morris: Do you think then that poor education is to blame? That somehow it’s got worse over the years rather than getting better? Tom Kyte: No, it hasn’t changed. When I get up on stage at a seminar and I talk about bind variables I start by saying that for 16 years I’ve been talking about the same thing but each year the problem is the same. Why? Because universities are trying to teach students theory and algorithms and things like that, they’re not teaching them how to write production quality code. They don’t teach them how to debug or how to instrument, they don’t teach them how to defensively program. They just teach them how to write a compiler in Lisp which frankly doesn’t translate very well into IT.
  116. 116. Using Bind Variables Oracle automatically notices when applications send similar SQL statements to the database. The SQL area used to process the first occurrence of the statement is shared, that is, used for processing subsequent occurrences of that same statement. Therefore, only one shared SQL area exists for a unique statement. Because shared SQL areas are shared memory areas, any Oracle process can use a shared SQL area. The sharing of SQL areas reduces memory use on the database server, thereby increasing system throughput. In evaluating whether statements are similar or identical, Oracle considers SQL statements issued directly by users and applications as well as recursive SQL statements issued internally by a DDL statement. One of the first stages of parsing is to compare the text of the statement with existing statements in the shared pool to see if the statement can be shared. If the statement differs textually in any way, then Oracle does not share the statement. Exceptions to this are possible when the parameter CURSOR_SHARING has been set to SIMILAR or FORCE. cursor sharing.sql When and What to Tune?
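The textual-match rule can be modeled with a few lines of Python. This is a deliberately crude model, not how the shared pool is implemented: a statement counts as shareable only if its text is exactly identical to one seen before, and the `emp` queries are hypothetical.

```python
def count_hard_parses(statements):
    """Number of statements whose exact text was not seen before."""
    shared_pool = set()
    hard_parses = 0
    for text in statements:
        if text not in shared_pool:
            shared_pool.add(text)
            hard_parses += 1
    return hard_parses


# Literals make every statement textually unique.
literals = [f"SELECT * FROM emp WHERE empno = {n}" for n in range(1000)]
# A bind variable keeps the text identical across executions.
binds = ["SELECT * FROM emp WHERE empno = :empno"] * 1000
# Differences in case or white space also prevent sharing.
mixed_case = ["SELECT * FROM emp", "select * from emp", "SELECT  *  FROM emp"]

print(count_hard_parses(literals))    # 1000
print(count_hard_parses(binds))       # 1
print(count_hard_parses(mixed_case))  # 3
```

The third case is the reason for writing SQL to consistent formatting standards: statements that mean the same thing but differ in case or spacing each get their own shared SQL area.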
  117. 117. ADAPTIVE BINDING DBAs are always encouraging developers to use bind variables, but when bind variables are used against columns containing skewed data they sometimes lead to less than optimum execution plans. This is because the optimizer peeks at the bind variable value during the hard parse of the statement, so the value of a bind variable when the statement is first presented to the server can affect every execution of the statement, regardless of the bind variable values. Oracle uses Adaptive Cursor Sharing to solve this problem by allowing the server to compare the effectiveness of execution plans between executions with different bind variable values. If it notices suboptimal plans, it allows certain bind variable values, or ranges of values, to use alternate execution plans for the same statement. This functionality requires no additional configuration. When and What to Tune?
  118. 118. • Restructuring SQL Statements reconstruct sql queries.sql When and What to Tune?
1. SELECT COUNT(*) FROM products p WHERE prod_list_price < 1.15 * (SELECT avg(unit_cost) FROM costs c WHERE c.prod_id = p.prod_id)
2. SELECT * FROM job_history jh, employees e WHERE substr(to_char(e.employee_id),2) = substr(to_char(jh.employee_id),2)
3. SELECT * FROM orders WHERE order_id_char = 1205
4. SELECT * FROM employees WHERE to_char(salary) = :sal
5. SELECT * FROM parts_old UNION SELECT * FROM parts_new
  119. 119. Various SQL and PL/SQL techniques to improve performance Advanced SQL and Application Topics