2. About Me
• 38 years old
• Married + 3
• 15 years as a DBA, consultant, instructor, and architect
• CEO @ DBcs Ltd.
• Former CTO @ John Bryce Israel
• Oracle Certified Professional
• Microsoft SQL Server certified professional
• Aaron@dbcs.co.il
• www.dbcs.co.il
3. Agenda
• Oracle Database Architecture Overview
• The connection between SQL tuning & Instance tuning
• The connection between database & operating system
• Common bottlenecks - Drill down
• How do you identify the source of the problem?
• Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
• Solutions: where do you start and what order to work?
• Introduction to SQL and Application Tuning
• The Oracle Optimizer:
• Rule Based Optimization (overview)
• Cost Based Optimization
• The Different Modes of the Cost Based Optimizer
• Execution Plans
• Data Access Methods
• Indexes – Types, Classifications, Advantages & Disadvantages
• Sort Usage Guidelines
• When and What to Tune?
• Clustering factor
• Data Types are Important
• Integrity Constraints are Important
• Reasons for Inefficient SQL Performance
• Using Bind Variables
• Restructuring SQL Statements
• Shared SQL and Cursors
• Advanced SQL and Application Topics
4. "You have to be constantly
evolving and in some cases
DBAs/Programmers don’t do that because
they know how they did it
years ago and they want to
keep doing it that way..."
5. Quote from Thomas Kyte's book
If you want a 10 step guide to tuning a query, buy a piece of software. You are not needed in this process,
anyone can put a query in, get a query out and run it to see if it is faster. There are tons of these tools on
the market. They work using rules (heuristics) and can tune maybe 1% of the problem queries out
there. They APPEAR to be able to tune a much larger percent but that is only because the people using
these tools never look at the outcome -- hence they continue to make the same basic mistakes over and
over and over.
If you want to really be able to tune the other 99% of the queries out there, knowledge of lots of
stuff -- physical storage mechanisms, access paths, how the optimizer works -- that's the only way.
..
..
Think about it for a moment. If there were a 10 step or even 1,000,000 step process by which any query
can be tuned (or even X% of queries for that matter), we would write a program to do it. Oh don't get me
wrong, there are many programs that actually try to do this - Oracle Enterprise Manager with its tuning
pack, SQL Navigator and others. What they do is primarily recommend indexing schemes to tune a query,
suggest materialized views, offer to add hints to the query to try other access plans. They show you
different query plans for the same statement and allow you to pick one. They offer "rules of thumb" (what I
generally call ROT since the acronym and the word it maps to are so appropriate for each other) SQL
optimizations - which if they were universally applicable - the optimizer would do it as a matter of
fact. In fact, the cost based optimizer does that already - it rewrites our queries all of the time. These
tuning tools use a very limited set of rules that sometimes can suggest that index or set of indexes
you really should have thought of during your design.
8. Oracle Database Memory Structures: Overview
[Diagram: server processes and background processes, each with its own PGA (together the aggregated PGA), attach to the SGA. The SGA contains the shared pool, large pool, database buffer cache, redo log buffer, Java pool, and Streams pool.]
9. Database Buffer Cache
• Is a part of the SGA
• Holds copies of data blocks that are read from data files
• Is shared by all concurrent processes
[Diagram: a server process reads blocks from the data files into the database buffer cache in the SGA; the database writer process (DBWn) writes buffers back to the data files.]
10. Redo Log Buffer
• Is a circular buffer in the SGA (its default size is based on the number of CPUs)
• Contains redo entries that have the information to redo changes
made by operations, such as DML and DDL
[Diagram: a server process writes redo entries into the redo log buffer in the SGA; the log writer process (LGWR) writes them to the redo log files.]
11. SGA: Shared Pool
• Is part of the SGA
• Contains:
  • Library cache: shared parts of SQL and PL/SQL statements
  • Data dictionary cache (row cache)
  • Result cache: SQL queries and PL/SQL functions
  • Control structures: locks
[Diagram: a server process accessing the shared pool, which holds the library cache, the data dictionary cache (row cache), the result cache, and control structures.]
12. Processing a DML Statement: Example
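[Diagram: numbered steps (1 to 5) of a DML statement. A user process talks to a server process, which uses the library cache in the shared pool, the database buffer cache, and the redo log buffer in the SGA. DBWn and the database files (data files, control files, redo log files) are shown below the SGA.]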
14. Program Global Area (PGA)
PGA is a memory area that contains:
• Session information
• Cursor information
• SQL execution work areas:
• Sort area
• Hash join area
• Bitmap merge area
• Bitmap create area
• Work area size influences SQL performance.
• Work areas can be automatically or manually managed.
[Diagram: the PGA of a server process, made up of stack space and the User Global Area (UGA); the UGA holds user session data, cursor status, and the SQL area.]
17. Automated SQL Execution Memory Management
[Diagram: server processes and background processes sharing the aggregated PGA, which is sized by PGA_AGGREGATE_TARGET.]
Which size to choose?
18. Automatic Memory Management
• Sizing of each memory component is vital for SQL execution
performance.
• It is difficult to manually size each component.
• Automatic memory management automates memory allocation of
each SGA component and aggregated PGA.
[Diagram: MMAN distributes the memory set by MEMORY_TARGET (with STATISTICS_LEVEL) among the buffer cache, large pool, shared pool, Java pool, Streams pool, other SGA, private SQL areas, untunable PGA, and free memory.]
20. Database tuning is the process of tuning the actual database, which
encompasses the allocated memory, disk usage, CPU, I/O, and underlying
database processes.
Tuning a database also involves the management and manipulation of the
database structure itself, such as the design and layout of tables and indexes.
Additionally, database tuning often involves the modification of the database
architecture in order to optimize the use of the hardware resources available.
There are many other considerations when tuning a database, but these tasks
are normally accomplished by the database administrator.
The objective of database tuning is to ensure that the database has been
designed in a way that best accommodates expected activity within the
database.
The connection between SQL tuning & Instance
tuning
21. SQL tuning is the process of tuning the SQL statements that access
the database.
These SQL statements include database queries and transactional
operations such as inserts, updates, and deletes.
The objective of SQL statement tuning is to formulate statements
that most effectively access the database in its current state, taking
advantage of database and system resources and indexes.
22. Both database tuning and SQL statement tuning must be performed to achieve
optimal results when accessing the database.
A poorly tuned database can waste the effort spent on SQL tuning, and
vice versa.
Ideally, it is best to first tune the database, and then ensure that indexes exist
where needed, and then tune the SQL code.
24. Question: We are in the process of adopting Oracle and we have many choices
of operating system platforms. Which OS is best for Oracle and how do I compare
operating system environments for Oracle databases?
Answer: That's a very common question. Oracle dominates the database world in
part because it runs on over 60 platforms, everything from a Mainframe to a Mac.
Oracle chose Solaris as their preferred OS in 2005, and later decided to work on
their own Linux distro, making an Oracle Linux OS that is custom-tailored to the
needs of a typical database. Oracle leverages the advantages of all OS
platforms with an independent OSI, customized to each platform.
As to which UNIX dialect is "best", it's often related to the server environment. For
example, svmon is only available on IBM AIX. . . .
Some operating systems are better at managing large volumes of data, such as
SUSE, who developed a special kernel just for Oracle.
The connection between database & operating
system
25. Data integrity features (T10 Protection Information)
Protection Information enables applications or kernel subsystems to attach
metadata to I/O operations, allowing devices that support PI to verify the
integrity before passing them further down the stack and physically
committing them to disk.
Data Integrity Extensions (DIX) is a hardware feature that enables the
exchange of protection metadata between the host operating system and the HBA.
It helps prevent corrupt data from being written, allowing a full
end-to-end data integrity check.
26. Zero downtime updates
Make updates to the Linux Operating System (OS) kernel, while it is running,
without a reboot or any interruption.
Only Oracle Linux offers this unique capability, making it possible to keep up with
important Linux kernel updates without burdening you with the operational cost
and disruption of rebooting for every update to the
kernel.
Ksplice allows system administrators to deliver valuable patches for both the
Unbreakable Enterprise Kernel as well as the Red Hat compatible kernel with
lower costs, less downtime, increased security, and greater flexibility and control.
27. Btrfs File System
Btrfs (B-tree file system) is the “next generation file system” for Linux.
Pronounced as “Butter FS” or “B-tree FS”, it is a GPL licensed file system first developed by
Oracle’s Chris Mason in 2007.
Btrfs provides a number of features that make it a very attractive file system solution for
local disk storage.
Btrfs is designed for:
• Large files and file systems from the ground up
• Simplified administration
• Integrated RAID and volume management
• Snapshots
• Checksums for data and meta-data
29. “Not every suggestion is a good suggestion,
even if it’s from the software provider itself.”
Aaron Shilo
Common bottlenecks - Drill down
Once upon a time, Oracle Support had a note called Script: Lists All Indexes that Benefit from a
Rebuild (Doc ID 122008.1), which, let’s just say, I didn’t view in a particularly positive light :-) Mainly
because it gave dubious advice, which included that indexes should be rebuilt if:
• Deleted entries represent 20% or more of current entries
• The index depth is more than 4 levels
It then detailed a script that ran a Validate Structure across all indexes in the database that didn’t belong
to either the SYS or SYSTEM schema.
This script basically read through and sequentially locked all tables (maybe multiple times) in the
database in order to list indexes that might not actually need a rebuild, while potentially missing out on
some that do. I could write a script that achieved the same result with far less overhead. For example,
SELECT index_name FROM DBA_INDEXES WHERE index_name LIKE 'A%' AND owner NOT IN ('SYS',
'SYSTEM') would achieve a very similar result.
Posted by Richard Foote in Doc 122008.1, Doc 989093.1, Index Rebuild, Oracle Indexes
30. Bad connection management
• The application connects and disconnects for each database
interaction.
• This problem is common with stateless middleware in application
servers.
• It has over two orders of magnitude impact on performance, and is
totally unscalable.
31. Bad connection management solution.
• Turn on Connection Pooling
• Database Resident Connection Pool (DRCP) in Oracle Database 11g
  • Fallback when there is no application tier connection pooling
  • Also useful for sharing connections across middle tier hosts
  • Supported only for OCI and PHP applications
  • Scales to tens of thousands of database connections even on a commodity box
  • Enable with dbms_connection_pool.start_pool
• Connect String
  • Easy Connect: //localhost:1521/oowlab:POOLED
  • TNS Connect String: (SERVER=POOLED)
32. Bad use of cursors and the shared pool
• Not using cursors results in repeated parses.
• If bind variables are not used, then there is hard parsing of all SQL
statements.
• This has an order of magnitude impact on performance, and it is totally
unscalable.
• Use cursors with bind variables that open the cursor and execute it
many times.
• Be suspicious of applications generating dynamic SQL.
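Why literals hurt can be shown with a small Python sketch. It models the library cache as a set keyed by the exact SQL text, which is a simplification and not Oracle's actual matching algorithm; the function name and the employee ID range are made up for illustration.

```python
def count_hard_parses(statements):
    """Count statements whose exact text has not been seen before.

    Simplified model (an assumption, not Oracle's real algorithm):
    a statement is hard parsed the first time its text appears and
    soft parsed on every later execution.
    """
    seen = set()
    hard_parses = 0
    for text in statements:
        if text not in seen:
            seen.add(text)
            hard_parses += 1
    return hard_parses


employee_ids = range(100, 200)  # hypothetical IDs, 100 executions

# Literal values: every execution produces a different SQL text.
with_literals = [
    f"SELECT last_name FROM employees WHERE employee_id = {emp_id}"
    for emp_id in employee_ids
]

# Bind variable: the text is identical, only the bound value changes.
with_binds = [
    "SELECT last_name FROM employees WHERE employee_id = :emp_id"
    for _ in employee_ids
]

print(count_hard_parses(with_literals))  # 100
print(count_hard_parses(with_binds))     # 1
```

In this model the literal version is parsed from scratch on every execution, while the bind version is parsed once and reused.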
33. Bad use of cursors and the shared pool solution
Bind Variables in Java
• Instead of (a new literal in the text forces a hard parse for every value):
String query = "SELECT EMPLOYEE_ID, LAST_NAME, SALARY FROM "
    + "EMPLOYEES WHERE EMPLOYEE_ID = "
    + generateNumber(MIN_EMPLOYEE_ID, MAX_EMPLOYEE_ID);
pstmt = connection.prepareStatement(query);
rs = pstmt.executeQuery();
• Change to (the text stays the same; the value is supplied as a bind variable):
String query = "SELECT EMPLOYEE_ID, LAST_NAME, SALARY FROM "
    + "EMPLOYEES WHERE EMPLOYEE_ID = ?";
int n = generateNumber(MIN_EMPLOYEE_ID, MAX_EMPLOYEE_ID);
pstmt = connection.prepareStatement(query);
pstmt.setInt(1, n);
rs = pstmt.executeQuery();
34. Bad use of cursors and the shared pool solution
Bind Variables in OCI
static char *MY_SELECT = "select employee_id, last_name, salary from "
"employees where employee_id = :EMPNO";
OCIBind *bndp1;
OCIStmt *stmthp;
ub4 emp_id;
OCIStmtPrepare2 (svchp, &stmthp, /* returned stmt handle */
errhp, /* error handle */
(const OraText *) MY_SELECT,
strlen((char *) MY_SELECT),
NULL, 0, /* tagging parameters:optional */
OCI_NTV_SYNTAX, OCI_DEFAULT);
/* bind input parameters */
OCIBindByName(stmthp, &bndp1, errhp, (text *) ":EMPNO",
-1, &(emp_id), sizeof(emp_id), SQLT_INT,
NULL, NULL, NULL, 0, NULL, OCI_DEFAULT);
35. Bad SQL
• Bad SQL is SQL that uses more resources than appropriate for the
application requirement.
• This can be a decision support systems (DSS) query that runs for
more than 24 hours, or a query from an online application that
takes more than a minute.
• You should investigate SQL that consumes significant system
resources for potential improvement.
• ADDM identifies high load SQL.
• SQL Tuning Advisor can provide recommendations for
improvement.
36. Bad SQL partial solution
SQL Tuning Advisor is SQL diagnostic software
in the Oracle Database Tuning Pack.
You can submit one or more SQL statements as
input to the advisor and receive advice or
recommendations for how to tune the statements,
along with a rationale and expected benefit.
sql tuning advisor.sql
37. Bad SQL partial solution
The SQL Access Advisor can automatically analyze the schema design for a given workload and
recommend indexes, function-based indexes, partitions, and materialized views to create, retain,
or drop as appropriate for the workload.
For single-statement scenarios, the advisor only recommends adjustments that affect the current
statement.
For complete business workloads, the advisor makes recommendations after considering the impact
on the entire workload.
sql access advisor.sql
38. Use of nonstandard initialization parameters
• These might have been implemented based on poor advice or
incorrect assumptions.
• Most databases provide acceptable performance using only the set
of basic parameters.
• In particular, parameters associated with SPIN_COUNT on latches
and undocumented optimizer features can cause many
problems that require considerable investigation.
• Likewise, optimizer parameters set in the initialization parameter file
can override proven optimal execution plans.
• For these reasons, schemas, schema statistics, and optimizer
settings should be managed as a group to ensure consistency of
performance.
39. So here’s a little Oracle Fudge for you!
12c has parameters with the word fudge in them.
_nested_loop_fudge
_parallelism_cost_fudge_factor
_px_broadcast_fudge_factor
_query_rewrite_fudge
These “fudge” parameters have been around with the same default values since at least
8.1.7.
40. Use of nonstandard initialization parameters: Know your stuff!!!
• Oracle takes this issue seriously: in fact, an OEM policy checks for it
• Use of Non-Standard Initialization Parameters
• This policy checks for use of non-standard initialization parameters.
• I've encountered a few during my work on production systems;
here are three examples:
• db_file_multiblock_read_count
• optimizer_index_caching
• optimizer_index_cost_adj
non standard parameters.sql
41. Getting database I/O wrong
• Many sites lay out their databases poorly over the available disks.
• Other sites specify the number of disks incorrectly, because they
configure disks by disk space and not I/O bandwidth.
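The difference between sizing by space and sizing by I/O bandwidth can be seen in a short Python sketch. All figures below (data volume, peak I/O rate, per-disk capacity and throughput) are hypothetical, and mirroring is ignored for simplicity.

```python
import math


def disks_needed(data_gb, required_iops, disk_gb, disk_iops):
    """Return (disks by capacity, disks by I/O rate, disks to configure)."""
    by_space = math.ceil(data_gb / disk_gb)
    by_io = math.ceil(required_iops / disk_iops)
    # The configuration must satisfy both constraints.
    return by_space, by_io, max(by_space, by_io)


# Hypothetical workload: 2,000 GB of data and 6,000 I/Os per second at
# peak, on disks that each hold 1,000 GB and sustain 150 I/Os per second.
by_space, by_io, total = disks_needed(2000, 6000, 1000, 150)
print(by_space, by_io, total)  # 2 40 40
```

With these assumed numbers, two disks would hold the data, but forty are needed to serve the I/O load.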
42. Getting database I/O wrong : general solutions
Stripe Everything Across Every Disk
The simplest approach to I/O configuration is to build one giant volume, striped
across all available disks.
To account for recoverability, the volume is mirrored (RAID 1).
The striping unit for each disk should be larger than the maximum I/O size for the
frequent I/O operations.
This provides adequate performance for most cases.
https://docs.oracle.com/cd/B19306_01/server.102/b14211/iodesign.htm#i35235
43. Getting database I/O wrong : general solutions
https://docs.oracle.com/cd/B19306_01/server.102/b14211/iodesign.htm#i35235
Block size: Smaller
• Advantages:
  • Good for small rows with lots of random access.
  • Reduces block contention.
• Disadvantages:
  • Has relatively large space overhead due to metadata (that is, block header).
  • Not recommended for large rows. There might only be a few rows stored for each block, or worse, row chaining if a single row does not fit into a block.
Block size: Larger
• Advantages:
  • Has lower overhead, so there is more room to store data.
  • Permits reading a number of rows into the buffer cache with a single I/O (depending on row size and block size).
  • Good for sequential access or very large rows (such as LOB data).
• Disadvantages:
  • Wastes space in the buffer cache, if you are doing random access to small rows and have a large block size. For example, with an 8 KB block size and 50 byte row size, you waste 7,950 bytes in the buffer cache when doing random access.
  • Not good for index blocks used in an OLTP environment, because they increase block contention on the index leaf blocks.
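The buffer cache waste in the example above is simple arithmetic, sketched here in Python. The function name is made up, and the block header overhead is ignored.

```python
def wasted_bytes(block_size, row_size):
    """Bytes read into the buffer cache but not needed for one row."""
    return block_size - row_size


# The quoted figure treats 8 KB as 8,000 bytes.
print(wasted_bytes(8000, 50))  # 7950
# With a binary 8 KB block (8,192 bytes) the waste is slightly larger.
print(wasted_bytes(8192, 50))  # 8142
# A 2 KB block (2,048 bytes) wastes far less for the same row.
print(wasted_bytes(2048, 50))  # 1998
```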
44. Online redo log setup problems
• Many sites run with too few online redo log files and files that
are too small.
• Small redo log files cause system checkpoints to continuously
put a high load on the buffer cache and I/O system.
• If too few redo log files exist, then the archiver cannot keep up,
and the database must wait for the archiver to catch up.
All online redo log files should be the same size and configured to switch approximately once an hour during normal activity. They
should switch no more frequently than every 20 minutes during peak activity.
There should be a minimum of four online log groups to prevent LGWR from waiting for a group to be available following a log
switch. A group may be unavailable because a checkpoint has not yet completed or the group has not yet been archived.
http://docs.oracle.com/cd/B12037_01/server.101/b10726/configbp.htm#1006950
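The two switch-interval guidelines above translate into a minimum log file size once the redo generation rate is known. A minimal Python sketch, assuming hypothetical redo rates in MB per minute:

```python
def redo_log_size_mb(normal_mb_per_min, peak_mb_per_min):
    """Smallest log file size (MB) that meets both switch-interval targets."""
    size_for_normal = normal_mb_per_min * 60  # about one switch per hour
    size_for_peak = peak_mb_per_min * 20      # at most one switch per 20 minutes
    return max(size_for_normal, size_for_peak)


# Hypothetical redo rates: 10 MB/min in normal activity, 50 MB/min at peak.
print(redo_log_size_mb(10, 50))  # 1000
```

Here the peak rate is the binding constraint: 600 MB would be enough for normal activity, but 1,000 MB is needed to avoid switching more often than every 20 minutes at peak.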
45. Serialization
• Serialization of data blocks in the buffer cache due to lack of free
lists, free list groups, transaction slots (INITRANS), or shortage of
rollback segments.
• This is particularly common on INSERT-heavy applications, in
applications that have raised the block size above 8K, or in
applications with large numbers of active users and few rollback
segments.
• Use automatic segment-space management (ASSM) and
automatic undo management to solve this problem.
46. Serialization : Easy solution
Oracle records a 'latch miss' when a process must wait for a latch to become
available, and we also see 'buffer busy waits' when a process must wait for a
freelist.
You can reduce buffer busy waits by adding additional FREELISTS or
FREELIST GROUPS.
For low-update databases you can also implement bitmap freelists (ASSM,
Automatic Segment Space Management) with the create tablespace clause
'segment space management auto'.
47. Serialization of another kind….
A look into Transaction Serialization
“To describe consistent transaction behavior when transactions run
concurrently, database researchers have defined a transaction isolation
model called serializability.
A serializable transaction operates in an environment that makes it
appear as if no other users were modifying data in the database.”
48. Long full table scans
• Long full table scans for high-volume or interactive online operations
could indicate poor transaction design, missing indexes, or poor
SQL optimization.
• Long table scans, by nature, are I/O intensive and unscalable.
49. High amounts of recursive (SYS) SQL
• Large amounts of recursive SQL executed by SYS could indicate space
management activities, such as extent allocations taking place.
• This is unscalable and impacts user response time.
• Use locally managed tablespaces to reduce recursive SQL due to
extent allocation.
• Recursive SQL executed under another user ID is probably SQL and
PL/SQL, and this is not a problem.
50. High amounts of recursive (SYS) SQL: check yourself
You might wonder why I wrote “Use locally managed tablespaces ...".
It’s not that dictionary-managed tablespaces are not supported in a
database where SYSTEM is locally managed, it’s that they simply can’t be
created.
If they can’t be created, why would we need to support them?
The answer lies in the transportable tablespace feature.
You can transport a dictionary-managed tablespace into a database with a
SYSTEM tablespace that is locally managed.
You can plug that tablespace in and have a dictionary-managed tablespace in
your database, but you can’t create one from
scratch in that database.
51. Deployment and migration errors
• In many cases, an application uses too many resources because the
schema owning the tables has not been successfully migrated from
the development environment or from an older implementation.
• Examples of this are missing indexes or incorrect statistics.
• These errors can lead to sub-optimal execution plans and poor
interactive user performance.
52. Deployment and migration errors : good practice
• When migrating applications of known performance, use the DBMS_STATS
package to export the schema statistics and maintain plan stability.
• Although these errors are not directly detected by ADDM, ADDM
highlights the resulting high load SQL.
53. Deployment and migration errors : good practice
First, identify the databases you want to migrate and the applications that access
them. Also evaluate the business requirements and define testing
criteria.
To determine the requirements of the migration project:
• Define the scope of the project.
• There are several choices you must make about the applications that
access your database in order to define the scope of the migration
project.
• To obtain a list of migration issues and dependencies, you should
consider the following:
54. Deployment and migration errors : good practice
• What are you migrating?
• What is the version of the database?
• What is the character set of the database?
• What source applications are affected by migrating (the third-party database) to
an Oracle database?
• What is the (third-party) application language?
• What version of the application language are you using?
In the scope of the project, you should have identified the applications you must
migrate. Ensure that you have included all the necessary applications that are
affected by migrating the database.
55. Deployment and migration errors : good practice
• What types of connectivity issues are involved in migrating to an Oracle
database?
• Do you use connectivity software to connect the applications to the
(third-party) database? Do you need to modify the connectivity software
to connect the applications to the Oracle database?
• What version of the connectivity software do you use? Can you use this
same version to connect to the Oracle database?
• Are you planning to rewrite the applications or modify the applications to work
with an Oracle database?
57. • Rule Based Optimization (overview)
• Cost Based Optimization
• The Different Modes of the Cost Based
Optimizer
• Execution Plans
• Data Access Methods
• Indexes – Types, Classifications, Advantages &
Disadvantages
• Sort Usage Guidelines
The Oracle Optimizer:
58. • The optimizer determines the most efficient way to execute a SQL
statement after considering many factors related to the objects
referenced and the conditions specified in the query.
• This determination is an important step in the processing of any
SQL statement and can greatly affect execution time.
59. The Oracle Optimizer: SQL Statement Parsing, Overview
[Flowchart: steps triggered by a parse call]
• Syntactic and semantic check
• Privileges check
• Allocate private SQL area
• Existing shared SQL area?
  • Yes (soft parse): reuse the parsed representation in the shared SQL area
  • No (hard parse): allocate shared SQL area, then run the parse operation (optimization)
• Execute statement
60. The Oracle Optimizer: Why Do You Need an Optimizer?
Query to optimize:
SELECT * FROM emp WHERE job = 'MANAGER';
How can I retrieve these rows? Possible access paths:
• Use the index.
• Read each row and check.
Which one is faster?
Statistics and schema information: only 1% of employees are managers.
I have a plan! Use the index.
61. The Oracle Optimizer: Why Do You Need an Optimizer?
Query to optimize:
SELECT * FROM emp WHERE job = 'MANAGER';
How can I retrieve these rows? Possible access paths:
• Use the index.
• Read each row and check.
Which one is faster?
Statistics and schema information: 80% of employees are managers.
I have a plan! Use a full table scan.
62. • Using the RBO, the optimizer chooses an execution plan based on the access paths
available and the ranks of these access paths. Oracle's ranking of the access paths is
heuristic. If there is more than one way to execute a SQL statement, then the RBO
always uses the operation with the lower rank. Usually, operations of lower rank execute
faster than those associated with constructs of higher rank.
The list shows access paths and their ranking:
• RBO Path 1: Single Row by Rowid
• RBO Path 2: Single Row by Cluster Join
• RBO Path 3: Single Row by Hash Cluster Key with Unique or Primary Key
• RBO Path 4: Single Row by Unique or Primary Key
• RBO Path 5: Clustered Join
• RBO Path 6: Hash Cluster Key
• RBO Path 7: Indexed Cluster Key
• RBO Path 8: Composite Index
• RBO Path 9: Single-Column Indexes
• RBO Path 10: Bounded Range Search on Indexed Columns
• RBO Path 11: Unbounded Range Search on Indexed Columns
• RBO Path 12: Sort Merge Join
• RBO Path 13: MAX or MIN of Indexed Column
• RBO Path 14: ORDER BY on Indexed Column
• RBO Path 15: Full Table Scan
rule based optimizer.sql
The Oracle Optimizer: Rule Based Optimization (overview)
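The ranking rule can be sketched in a few lines of Python. The rank table below copies a subset of the list above; the function name is made up, and the sketch only illustrates "lowest rank wins", not how Oracle decides which paths are available.

```python
# Subset of the RBO access path ranks listed above.
RBO_RANK = {
    "Single Row by Rowid": 1,
    "Single Row by Unique or Primary Key": 4,
    "Composite Index": 8,
    "Single-Column Indexes": 9,
    "Bounded Range Search on Indexed Columns": 10,
    "Full Table Scan": 15,
}


def rbo_choose(available_paths):
    """Pick the available access path with the lowest rank."""
    return min(available_paths, key=RBO_RANK.__getitem__)


# An index on the predicate column always beats a full table scan,
# regardless of how many rows actually match.
print(rbo_choose(["Full Table Scan", "Single-Column Indexes"]))
# Single-Column Indexes
```

Because the rule never looks at the data, the index would be chosen even in the case where 80% of the rows match, which is the weakness the cost-based approach addresses.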
63. The CBO performs the following steps:
• The optimizer generates a set of potential plans for the SQL statement based on
available access paths and hints.
• The optimizer estimates the cost of each plan based on statistics in the data
dictionary for the data distribution and storage characteristics of the tables,
indexes, and partitions accessed by the statement.
• The cost is an estimated value proportional to the expected resource use needed
to execute the statement with a particular plan. The optimizer calculates the cost of
access paths and join orders based on the estimated computer resources, which
includes I/O, CPU, and memory.
• Serial plans with higher costs take more time to execute than those with smaller
costs. When using a parallel plan, however, resource use is not directly related to
elapsed time.
• The optimizer compares the costs of the plans and chooses the one with the lowest
cost.
cost based optimizer thru the vers.sql
The Oracle Optimizer: Cost Based Optimization
64. The following features require use of the CBO:
• Partitioned tables and indexes
• Index-organized tables
• Reverse key indexes
• Function-based indexes
• SAMPLE clauses in a SELECT statement
• Parallel query and parallel DML
• Star transformations and star joins
• Extensible optimizer
• Query rewrite with materialized views
• Enterprise Manager progress meter
• Hash joins
• Bitmap indexes and bitmap join indexes
• Index skip scans
The Oracle Optimizer: Cost Based Optimization
65. • Piece of code:
• Estimator
• Plan generator
• Estimator determines cost of optimization suggestions made by the plan
generator:
• Cost: Optimizer’s best estimate of the number of standardized I/Os
made to execute a particular statement optimization
• Plan generator:
• Tries out different statement optimization techniques
• Uses the estimator to cost each optimization suggestion
• Chooses the best optimization suggestion based on cost
• Generates an execution plan for best optimization
The Oracle Optimizer: Cost Based Optimization
66. • Selectivity is the estimated proportion of a row set retrieved by a
particular predicate or combination of predicates.
• It is expressed as a value between 0.0 and 1.0:
• High selectivity: Small proportion of rows
• Low selectivity: Big proportion of rows
• Selectivity computation:
• If no statistics: Use dynamic sampling
• If no histograms: Assume even distribution of rows
• Statistic information:
• DBA_TABLES and DBA_TAB_STATISTICS (NUM_ROWS)
• DBA_TAB_COL_STATISTICS (NUM_DISTINCT, DENSITY,
HIGH/LOW_VALUE,…)
The Oracle Optimizer: Estimator: Selectivity
Selectivity = Number of rows satisfying a condition / Total number of rows
67. • Expected number of rows retrieved by a particular operation in the
execution plan
• Vital figure to determine join, filters, and sort costs
• Simple example:
• The number of distinct values in DEV_NAME is 203.
• The number of rows in COURSES (original cardinality) is 1018.
• Selectivity = 1/203 = 4.926e-03
• Cardinality = (1/203)*1018 = 5.01 (rounded up to 6)
cardinality and selectivity in joins.sql
Simple cardinality.sql
The Oracle Optimizer: Estimator: Cardinality
SELECT days FROM courses WHERE dev_name = 'ANGEL';
Cardinality = Selectivity * Total number of rows
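The worked example can be reproduced with a short Python sketch. It assumes an equality predicate with no histogram (even distribution), so selectivity is 1 divided by the number of distinct values; rounding up is an assumption chosen to match the figure on the slide, and the function names are made up.

```python
import math


def selectivity_equality(num_distinct):
    """Selectivity of `column = value`, assuming an even distribution."""
    return 1 / num_distinct


def cardinality(selectivity, num_rows):
    """Estimated rows returned, rounded up to a whole row."""
    return math.ceil(selectivity * num_rows)


sel = selectivity_equality(203)   # 203 distinct values in DEV_NAME
print(round(sel, 6))              # 0.004926
print(cardinality(sel, 1018))     # 6 (5.01 rounded up)
```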
68. The Oracle Optimizer: Estimator: Cost
• Cost is the optimizer’s best estimate of the number of
standardized I/Os it takes to execute a particular statement.
• Cost unit is a standardized single block random read:
• 1 cost unit = 1 SRds
• The cost formula combines three different cost units into
standard cost units:
Cost = (#SRds*sreadtim + #MRds*mreadtim + #CPUCycles/cpuspeed) / sreadtim
• #SRds*sreadtim: single block I/O cost
• #MRds*mreadtim: multiblock I/O cost
• #CPUCycles/cpuspeed: CPU cost
#SRds: Number of single block reads
#MRds: Number of multiblock reads
#CPUCycles: Number of CPU cycles
sreadtim: Single block read time
mreadtim: Multiblock read time
cpuspeed: Millions of instructions per second
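A minimal Python sketch of the cost formula follows. The input values are hypothetical, and the sketch assumes the CPU term has already been converted to the same time unit as the read times; it is an illustration of the formula, not of Oracle's internal implementation.

```python
def optimizer_cost(srds, mrds, cpu_cycles, sreadtim, mreadtim, cpuspeed):
    """Express total estimated time in units of one single block read."""
    single_block_io = srds * sreadtim
    multiblock_io = mrds * mreadtim
    cpu = cpu_cycles / cpuspeed  # assumed to be in the same time unit
    return (single_block_io + multiblock_io + cpu) / sreadtim


# Only single block reads: the cost equals the number of reads,
# because 1 cost unit = 1 single block read.
print(optimizer_cost(100, 0, 0, sreadtim=10, mreadtim=20, cpuspeed=1000))   # 100.0

# Adding 50 multiblock reads that each take twice as long as a single
# block read adds another 100 cost units.
print(optimizer_cost(100, 50, 0, sreadtim=10, mreadtim=20, cpuspeed=1000))  # 200.0
```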
69. The Oracle Optimizer: The Different Modes of the Cost Based Optimizer
Value: Description
• CHOOSE: The optimizer chooses between a cost-based approach and a rule-based approach, depending on whether statistics
are available. This is the default value through Oracle9i; from Oracle Database 10g the default is ALL_ROWS.
  • If the data dictionary contains statistics for at least one of the accessed tables, then the optimizer uses a cost-based
  approach and optimizes with a goal of best throughput.
  • If the data dictionary contains only some statistics, then the cost-based approach is still used, but the optimizer must
  guess the statistics for the objects without any statistics. This can result in suboptimal execution plans.
  • If the data dictionary contains no statistics for any of the accessed tables, then the optimizer uses a rule-based
  approach.
• ALL_ROWS: The optimizer uses a cost-based approach for all SQL statements in the session regardless of the presence of statistics
and optimizes with a goal of best throughput (minimum resource use to complete the entire statement).
• FIRST_ROWS_n: The optimizer uses a cost-based approach, regardless of the presence of statistics, and optimizes with a goal of best
response time to return the first n rows; n can equal 1, 10, 100, or 1000.
• FIRST_ROWS: The optimizer uses a mix of cost and heuristics to find a best plan for fast delivery of the first few rows.
Note: Using heuristics sometimes leads the CBO to generate a plan with a cost that is significantly larger than the cost
of a plan without applying the heuristic. FIRST_ROWS is available for backward compatibility and plan stability.
• RULE: The optimizer chooses a rule-based approach for all SQL statements regardless of the presence of statistics.
first rows VS all rows.sql
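The optimizer goal can be set for the session or overridden per statement with a hint. A short sketch, assuming the demo EMP table; whether the two plans actually differ depends on the data and the available indexes.

```sql
-- Session-level goal: optimize for returning the first 10 rows quickly.
ALTER SESSION SET optimizer_mode = FIRST_ROWS_10;

EXPLAIN PLAN FOR
  SELECT empno, ename FROM emp ORDER BY empno;
SELECT * FROM TABLE(dbms_xplan.display(NULL, NULL, 'basic'));

-- Statement-level override: the hint takes precedence over the session setting.
EXPLAIN PLAN FOR
  SELECT /*+ ALL_ROWS */ empno, ename FROM emp ORDER BY empno;
SELECT * FROM TABLE(dbms_xplan.display(NULL, NULL, 'basic'));

-- Back to the goal that is the default from Oracle 10g onward.
ALTER SESSION SET optimizer_mode = ALL_ROWS;
```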
70. The Oracle Optimizer: What Is an Execution Plan?
• The execution plan of a SQL statement is composed of small
building blocks called row sources for serial execution plans.
• The combination of row sources for a statement is called the
execution plan.
• By using parent-child relationships, the execution plan can be
displayed in a tree-like structure (text or graphical).
71. The Oracle Optimizer: Where to Find Execution Plans?
• PLAN_TABLE (EXPLAIN PLAN or SQL*Plus autotrace)
• V$SQL_PLAN (Library Cache)
• V$SQL_PLAN_MONITOR (11g)
• DBA_HIST_SQL_PLAN (AWR)
• STATS$SQL_PLAN (Statspack)
• SQL Management Base (SQL Plan Management Baselines)
• SQL tuning set
• Trace files generated by DBMS_MONITOR
• Event 10053 trace file
• Process state dump trace file since 10gR2
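For the library cache and AWR sources in the list above, DBMS_XPLAN can format the plan directly. A sketch; `&sql_id` is a SQL*Plus substitution variable you supply, and the AWR call assumes the Diagnostics Pack is licensed.

```sql
-- Plan of the last statement executed in this session (from V$SQL_PLAN).
SELECT * FROM TABLE(dbms_xplan.display_cursor(NULL, NULL, 'TYPICAL'));

-- Plan of a specific cursor in the library cache.
SELECT * FROM TABLE(dbms_xplan.display_cursor('&sql_id', NULL, 'TYPICAL'));

-- Historical plans kept in AWR (DBA_HIST_SQL_PLAN).
SELECT * FROM TABLE(dbms_xplan.display_awr('&sql_id'));
```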
72. The Oracle Optimizer: How To Read?
SQL> explain plan for
2 select e.empno, e.ename, d.dname
3 from emp e, dept d
4 where e.deptno = d.deptno
5 and e.deptno = 10;
Explained.
SQL> SELECT * FROM table(dbms_xplan.display(null,null,'basic'));
PLAN_TABLE_OUTPUT
------------------------------------------------
Plan hash value: 568005898
------------------------------------------------
| Id  | Operation                    | Name    |
------------------------------------------------
|   0 | SELECT STATEMENT             |         |
|   1 |  NESTED LOOPS                |         |
|   2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |
|   3 |    INDEX UNIQUE SCAN         | PK_DEPT |
|   4 |   TABLE ACCESS FULL          | EMP     |
------------------------------------------------
1. Operation 0 is the root of the tree; it has one child, Operation 1.
2. Operation 1 has two children, Operations 2 and 4.
3. Operation 2 has one child, Operation 3.
73. The Oracle Optimizer: How To Read?
Operation 0 (SELECT STATEMENT)
    |
Operation 1 (NESTED LOOPS)
    |-- Operation 2 (TABLE ACCESS BY INDEX ROWID)
    |       |-- Operation 3 (INDEX UNIQUE SCAN)
    |-- Operation 4 (TABLE ACCESS FULL)
This is the graphical representation of the execution plan.
Reading the tree:
To perform Operation 1, you need to perform Operations 2 and 4.
Operation 2 comes first; to perform it, you need to perform its child, Operation 3.
Operation 4 depends on Operation 2: the nested loop runs Operation 4 for each row that Operation 2 returns.
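The parent-child relationships can also be read straight from PLAN_TABLE with a hierarchical query. A sketch, assuming the demo EMP and DEPT tables; the statement ID 'tree_demo' and the alias plan_step are illustrative.

```sql
DELETE FROM plan_table WHERE statement_id = 'tree_demo';

EXPLAIN PLAN SET STATEMENT_ID = 'tree_demo' FOR
  SELECT e.empno, e.ename, d.dname
  FROM   emp e, dept d
  WHERE  e.deptno = d.deptno
  AND    e.deptno = 10;

-- Walk the plan from the root (id 0) down; indentation shows the depth.
SELECT id,
       parent_id,
       LPAD(' ', 2 * (LEVEL - 1)) || operation || ' ' || options AS plan_step,
       object_name
FROM   plan_table
START  WITH id = 0 AND statement_id = 'tree_demo'
CONNECT BY PRIOR id = parent_id AND statement_id = 'tree_demo'
ORDER  SIBLINGS BY id;
```

Each row's PARENT_ID points at the operation that consumes its rows, which is the relationship the tree depicts.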
74. Oracle supports the following access methods:
• Full Table Scan (FTS)
• Table Access by ROWID
• Index Unique Scan
• Index Range Scan
• Index Skip Scan
• Full Index Scan
• Fast Full Index Scan
• Index Joins
• Hash Access
• Cluster Access
• Bitmap Index
optimizer access paths.sql
The Oracle Optimizer: Data Access Methods
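Hints are a convenient way to see several of these access paths side by side for the same table. A sketch, assuming the demo EMP table with a primary key index named PK_EMP (the index name is an assumption); hints are used here only to demonstrate the paths, not as a tuning recommendation.

```sql
-- Full table scan.
EXPLAIN PLAN FOR
  SELECT /*+ FULL(e) */ * FROM emp e WHERE empno = 7369;
SELECT * FROM TABLE(dbms_xplan.display(NULL, NULL, 'basic'));

-- Index unique scan followed by table access by ROWID.
EXPLAIN PLAN FOR
  SELECT /*+ INDEX(e pk_emp) */ * FROM emp e WHERE empno = 7369;
SELECT * FROM TABLE(dbms_xplan.display(NULL, NULL, 'basic'));

-- Fast full index scan: the index is read like a small table, without visiting EMP.
EXPLAIN PLAN FOR
  SELECT /*+ INDEX_FFS(e pk_emp) */ COUNT(*) FROM emp e;
SELECT * FROM TABLE(dbms_xplan.display(NULL, NULL, 'basic'));
```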
75. Guidelines for Managing Indexes
• Create indexes after inserting table data
• Index the correct tables and columns
• Order index columns for performance
• Limit the number of indexes for each table
• Drop indexes that are no longer needed
• Understand deferred segment creation
• Estimate index size and set storage parameters
• Specify the tablespace for each index
• Consider parallelizing index creation
• Consider creating indexes with NOLOGGING
• Understand when to use unusable or invisible indexes
• Consider costs and benefits of coalescing or rebuilding indexes
• Consider cost before disabling or dropping constraints
The Oracle Optimizer: Indexes – Types, Classifications,
Advantages & Disadvantages
76. Index Type Usage
• B-tree: Default, balanced tree index; good for high-cardinality (high degree of distinct values) columns
• B-tree cluster: Used with clustered tables
• Hash cluster: Used with hash clusters
• Function-based: Good for columns that have SQL functions applied to them
• Indexed virtual column: Good for columns that have SQL functions applied to them; viable alternative to using a function-based index
• Reverse-key: Useful to balance I/O in an index that has many sequential inserts
• Key-compressed: Useful for concatenated indexes where the leading column is often repeated; compresses leaf block entries
• Bitmap: Useful in data warehouse environments with low-cardinality columns; these indexes aren’t appropriate for online transaction processing (OLTP) databases where rows are heavily updated
• Bitmap join: Useful in data warehouse environments for queries that join fact and dimension tables
• Global partitioned: Global index across all partitions in a partitioned table
• Local partitioned: Local index based on individual partitions in a partitioned table
• Domain: Specific for an application or cartridge
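A few of these index types can be created as follows. A sketch on the demo EMP table; all index names are made up, the column choices are only illustrative, and bitmap indexes assume Enterprise Edition.

```sql
-- Function-based: supports predicates such as UPPER(ename) = 'SMITH'.
CREATE INDEX emp_ename_upper_ix ON emp (UPPER(ename));

-- Reverse-key: spreads sequentially increasing keys across leaf blocks.
CREATE INDEX emp_hiredate_rev_ix ON emp (hiredate) REVERSE;

-- Key-compressed: the repeated leading column (deptno) is stored once per leaf block.
CREATE INDEX emp_dept_job_ix ON emp (deptno, job) COMPRESS 1;

-- Bitmap: low-cardinality column, suited to read-mostly data.
CREATE BITMAP INDEX emp_job_bix ON emp (job);

-- Confirm what was created.
SELECT index_name, index_type, compression
FROM   user_indexes
WHERE  table_name = 'EMP';
```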
77. Physical layout of a table and B-tree index
78. The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages
When you put indexes on a partitioned table, you have the choice between GLOBAL and LOCAL.
The LOCAL index partitions follow the table partitions:
they have the same partition key and type, are created automatically when new table partitions are added, and are dropped automatically when table partitions are dropped.
Beware: LOCAL indexes are usually not appropriate for OLTP access on the table, because one server process may then have to scan through many index partitions.
This is the cause of most of the scary performance horror stories you may have heard about partitioning!
A GLOBAL index spans all partitions. It usually has good SELECT performance, but is more sensitive to partition maintenance than LOCAL indexes. The GLOBAL index needs to be rebuilt more often.
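The difference shows up as soon as a partition is dropped. A sketch using a hypothetical SALES_DEMO table (all object names are invented) and assuming the Partitioning option is available.

```sql
CREATE TABLE sales_demo (
  sale_id   NUMBER,
  sale_date DATE,
  cust_id   NUMBER
)
PARTITION BY RANGE (sale_date) (
  PARTITION p2023 VALUES LESS THAN (DATE '2024-01-01'),
  PARTITION p2024 VALUES LESS THAN (DATE '2025-01-01')
);

-- LOCAL: one index partition per table partition, maintained with it.
CREATE INDEX sales_demo_date_ix ON sales_demo (sale_date) LOCAL;

-- A nonpartitioned index on a partitioned table is a global index.
CREATE INDEX sales_demo_cust_ix ON sales_demo (cust_id);

-- Without UPDATE GLOBAL INDEXES the global index would be left UNUSABLE
-- and would need a rebuild; the local index partition is simply dropped.
ALTER TABLE sales_demo DROP PARTITION p2023 UPDATE GLOBAL INDEXES;

SELECT index_name, partitioned, status
FROM   user_indexes
WHERE  table_name = 'SALES_DEMO';
```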
79. The Oracle Optimizer: Optimizer Statistics
• Describe the database and the objects in the database
• Information used by the query optimizer to estimate:
• Selectivity of predicates
• Cost of each execution plan
• Access method, join order, and join method
• CPU and input/output (I/O) costs
• Refreshing optimizer statistics whenever they are stale is as
important as gathering them:
• Automatically gathered by the system
• Manually gathered by the user with DBMS_STATS
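Checking for stale statistics and gathering them manually might look like this. A sketch on the demo EMP table in the current schema.

```sql
-- When were statistics last gathered, and does Oracle consider them stale?
SELECT table_name, num_rows, last_analyzed, stale_stats
FROM   user_tab_statistics
WHERE  table_name = 'EMP';

-- Manual gathering with DBMS_STATS.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'EMP',
    cascade => TRUE);   -- also gather statistics on the table's indexes
END;
/
```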
80. The Oracle Optimizer: Optimizer Statistics
A common misperception is that if no new statistics are gathered (and
assuming nothing else is altered in the database), execution
plans must always remain the same;
that by not collecting statistics, one can somehow
guarantee that the database will perform in the same manner and
generate the same execution plans.
This is fundamentally not true.
In fact, quite the opposite can be true: as data keeps changing, predicate
values can move outside the range the old statistics describe, and the
optimizer's estimates shift with them.
One might need to collect fresh statistics to make sure vital execution
plans don’t change.
It’s the act of not refreshing statistics that can cause execution plans
to suddenly change.
explain plan changes with no stat change.sql
81. The Oracle Optimizer: Types of Optimizer Statistics
• Table statistics:
• Number of rows
• Number of blocks
• Average row length
• Index Statistics:
• B*-tree level
• Distinct keys
• Number of leaf blocks
• Clustering factor
• System statistics
• I/O performance and utilization
• CPU performance and utilization
• Column statistics
• Basic: Number of distinct
values, number of nulls,
average length, min, max
• Histograms (data distribution
when the column data is
skewed)
• Extended statistics
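Most of the statistics listed above are visible in the data dictionary. A sketch on the demo EMP table in the current schema.

```sql
-- Table statistics.
SELECT num_rows, blocks, avg_row_len
FROM   user_tables
WHERE  table_name = 'EMP';

-- Index statistics.
SELECT index_name, blevel, distinct_keys, leaf_blocks, clustering_factor
FROM   user_indexes
WHERE  table_name = 'EMP';

-- Column statistics, including whether a histogram exists.
SELECT column_name, num_distinct, num_nulls, avg_col_len, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'EMP';
```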
82. The Oracle Optimizer: Histograms
• The optimizer assumes uniform distributions; this may lead to
suboptimal access plans in the case of data skew.
• Histograms:
• Store additional column distribution information
• Give better selectivity estimates in the case of nonuniform
distributions
• With unlimited resources you could store each different value and
the number of rows for that value.
• This becomes unmanageable for a large number of distinct values
and a different approach is used:
• Frequency histogram (#distinct values ≤ #buckets)
• Height-balanced histogram (#buckets < #distinct values)
• They are stored in DBA_TAB_HISTOGRAMS.
83. The Oracle Optimizer: Frequency Histograms
[Figure: frequency histogram, 10 buckets, 10 distinct values]
X-axis: ENDPOINT VALUE (column value): 1, 3, 5, 7, 10, 16, 27, 32, 39, 49
Y-axis: ENDPOINT NUMBER (cumulative cardinality), from 0 to 40000; each step is the number of rows for that column value
Distinct values: 1, 3, 5, 7, 10, 16, 27, 32, 39, 49
Number of rows: 40001
84. The Oracle Optimizer: Height-Balanced Histograms
[Figure: height-balanced histogram, 5 buckets, 10 distinct values (8000 rows per bucket, same number of rows per bucket)]
ENDPOINT NUMBER (bucket number): 0, 1, 2, 3, 4, 5
ENDPOINT VALUE: 1, 7, 10, 10, 32, 49
Popular value: 10
Distinct values: 1, 3, 5, 7, 10, 16, 27, 32, 39, 49
Number of rows: 40001
85. The Oracle Optimizer: Height-Balanced Histograms
In a height-balanced histogram, the ordered column values are divided
into bands so that each band contains approximately the same
number of rows.
The histogram tells you the values of the endpoints of each band.
In the example in the slide, assume that you have a column that is
populated with 40,001 numbers.
There will be 8,000 values in each band.
You only have ten distinct values: 1, 3, 5, 7, 10, 16, 27, 32, 39, and 49.
Value 10 is the most popular value with 16,293 occurrences.
When the number of buckets is less than the number of distinct
values, ENDPOINT_NUMBER records the bucket number and
ENDPOINT_VALUE records the column value that corresponds to
this endpoint.
HISTOGRAMS .sql
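A histogram can be requested through the METHOD_OPT parameter of DBMS_STATS and then inspected in the dictionary. A sketch on the demo EMP table; DEPTNO is only an example column, and because it has fewer distinct values than buckets the result would be expected to be a frequency histogram.

```sql
-- Gather column statistics for DEPTNO with up to 254 buckets.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'EMP',
    method_opt => 'FOR COLUMNS deptno SIZE 254');
END;
/

-- Which kind of histogram was built?
SELECT column_name, num_distinct, num_buckets, histogram
FROM   user_tab_col_statistics
WHERE  table_name  = 'EMP'
AND    column_name = 'DEPTNO';

-- The endpoints themselves.
SELECT endpoint_number, endpoint_value
FROM   user_tab_histograms
WHERE  table_name  = 'EMP'
AND    column_name = 'DEPTNO'
ORDER  BY endpoint_number;
```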
87. Buffer cache
For many types of operations, Oracle Database uses the buffer cache to store
data blocks read from disk.
Oracle Database bypasses the buffer cache for particular operations, such as
sorting and parallel reads.
To use the database buffer cache effectively, tune SQL statements for the
application to avoid unnecessary resource consumption.
To meet this goal, verify that frequently executed SQL statements and SQL
statements that perform many buffer gets are well-tuned.
When configuring a new database instance, it is impossible to know the correct
size for the buffer cache.
Typically, a database administrator makes a first estimate for the cache size, then
runs a representative workload on the instance and examines the relevant
statistics to see whether the cache is under-configured or over-configured.
Focusing on benchmark issues: physical IO, logical
reads, shared pool, buffer cache
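One of the "relevant statistics" for sizing the cache is the buffer cache advisory. A sketch, assuming SELECT access to the V$ views and that the advisory is enabled (DB_CACHE_ADVICE = ON).

```sql
-- Estimated physical reads for a range of candidate cache sizes.
-- A read factor below 1 means fewer physical reads than at the current size.
SELECT size_for_estimate AS cache_mb,
       buffers_for_estimate,
       estd_physical_read_factor,
       estd_physical_reads
FROM   v$db_cache_advice
WHERE  name = 'DEFAULT'
AND    advice_status = 'ON'
AND    block_size = (SELECT TO_NUMBER(value)
                     FROM   v$parameter
                     WHERE  name = 'db_block_size')
ORDER  BY size_for_estimate;
```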
88. What is a Physical I/O?
Whenever you execute a query, Oracle has to go and fetch data to give you the
result of the query execution.
Here, data means the actual data in data blocks.
Whenever a new data block is requested, it has to be fetched from the physical
datafiles residing on the physical disks.
This fetching of data blocks from the physical disk involves an I/O operation known
as physical I/O.
By virtue of this physical I/O, now the block has been fetched and read into the
memory area called buffer cache.
This is a default action.
We know that a data block might be requested multiple times by multiple queries.
89. What is a Logical I/O?
Once a physical I/O has taken place and the block has been read into memory,
the next request for the same data block won't require the block to be fetched from
the disk, which avoids a physical I/O.
To return the results for a select query requesting the same data block,
the block is fetched from memory; this is called a logical I/O.
Whenever the amount of logical I/O is calculated, two kinds of reads are
considered: consistent reads and current reads.
Together, these two statistics make up the logical I/O performed by Oracle.
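These counters can be read for the current session. A sketch, assuming SELECT access to V$MYSTAT and V$STATNAME; run it before and after a query and compare the values.

```sql
-- 'session logical reads' is the sum of 'db block gets' (current reads)
-- and 'consistent gets' (consistent reads); 'physical reads' are disk reads.
SELECT sn.name, ms.value
FROM   v$mystat ms
JOIN   v$statname sn ON sn.statistic# = ms.statistic#
WHERE  sn.name IN ('session logical reads',
                   'db block gets',
                   'consistent gets',
                   'physical reads')
ORDER  BY sn.name;
```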
90. Consistent reads
It is a well known fact that whenever a change is induced in a data block, the old
data/entry is written to the UNDO/ROLLBACK segments.
From the fundamentals of UNDO, we also know that this is to provide a read
consistent view of the data block to other users trying to read the same data
block.
Consistent reads mean reading the block in a consistent mode, as of a “point in time”.
Here the phrase “point in time” means the time when the query/statement began.
A consistent read might or might not involve any UNDO data.
UNDO data will be applied when it is necessary to roll back a data block to the
required “point in time” when the SQL statement was fired.
If on reading the buffer cache, it is found that the data block is already in the
required state, no UNDO data is required because the block is already
consistent.
91. Consistent reads and array size
Consistent reads can also depend on and vary with the array size setting of SQL*Plus.
The default value is 15.
Array size is the number of rows fetched in a single fetch call.
The value of array size determines the number of network round trips made to fetch
the required data from Oracle.
A careful adjustment of the array size can improve performance by reducing the
network round trips.
A higher array size might be good for query performance (by reducing both the network
round trips and the consistent reads), but too high a value also uses more memory.
Array size is not a setting restricted to SQL*Plus; it can be set in many
other applications requesting data from an Oracle database.
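The effect is easy to observe in SQL*Plus. A sketch, assuming the PLUSTRACE role (or equivalent access) for AUTOTRACE; ALL_OBJECTS is used only because it exists in every database and returns enough rows.

```sql
-- Show only the execution statistics, not the rows.
SET AUTOTRACE TRACEONLY STATISTICS

-- Default array size: many round trips.
SET ARRAYSIZE 15
SELECT * FROM all_objects;

-- Larger array size: compare 'consistent gets' and
-- 'SQL*Net roundtrips to/from client' with the first run.
SET ARRAYSIZE 500
SELECT * FROM all_objects;

SET AUTOTRACE OFF
```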
92. How do you identify the source of the problem?
93. Solving database performance issues sometimes requires the use of operating system
(OS) utilities.
These tools often provide information that can help isolate database performance
problems.
Consider the following situations:
• You’re running multiple databases and multiple applications on one server and
want to use OS utilities to identify which database (and corresponding process) is
consuming the most operating system resources. This approach is invaluable
when one database application is consuming resources to the point of causing
other databases on the box to perform poorly.
• You need to verify if the database server is adequately sized for current application
workload in terms of CPU, memory, disk I/O, and network bandwidth.
An analysis is needed to determine at what point the server will not be able to
handle larger (future) workloads.
• You’ve used database tools to identify system bottlenecks and want to double-check
the analysis via operating system tools.
94. In these scenarios, to effectively analyze, tune, and troubleshoot, you’ll
need to employ OS tools to identify resource-intensive processes.
Furthermore, if you have multiple databases and applications
running on one server, when troubleshooting performance issues, it’s
often more efficient to first determine which database and process is
consuming the most resources.
Operating system utilities help pinpoint whether the bottleneck is CPU,
memory, disk I/O, or a network issue.
In Linux/Unix environments, once you have the operating system
identifier, you can then query the database to show
any corresponding database processes and SQL statements.
97. Mapping a Resource-Intensive Process to a Database Process
Problem
It’s a dark and stormy night, and the system is performing poorly.
You identify an operating system–intensive process on the box.
You want to map an operating system process back to a database process.
If the database process is a SQL process, you want to display the user of the SQL
statement and also the SQL.
Solution
In Linux/Unix environments, if you can identify the resource-intensive operating system
process, then you can easily check to see if that process is associated with a database
process. The procedure consists of the following steps:
1. Run an OS command to identify resource-intensive processes and associated IDs.
2. Identify the database associated with the process.
3. Extract details about the process from the database data dictionary views.
4. If it’s a SQL statement, get those details.
5. Generate an execution plan for the SQL statement.
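Steps 3 to 5 can be sketched as follows, assuming the OS process ID has already been found with a tool such as top or ps and that the session can read the V$ views. `&os_pid` and `&sql_id` are SQL*Plus substitution variables that you supply.

```sql
-- Step 3: map the OS process ID to a database session.
SELECT s.sid, s.serial#, s.username, s.program, s.sql_id
FROM   v$process p
JOIN   v$session s ON s.paddr = p.addr
WHERE  p.spid = '&os_pid';

-- Step 4: if the session is running SQL, show the statement text.
SELECT sql_id, child_number, sql_text
FROM   v$sql
WHERE  sql_id = '&sql_id';

-- Step 5: show the execution plan of that statement from the library cache.
SELECT * FROM TABLE(dbms_xplan.display_cursor('&sql_id', NULL, 'TYPICAL'));
```

For step 2, the process name shown by the OS tools normally includes the instance name, which tells you which database to connect to before running these queries.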
Solutions: where do you start and what order to
work?
99. Proactive Tuning Methodology
• Simple design
• Data modeling
• Tables and indexes
• Using views
• Writing efficient SQL
• Cursor sharing
• Using bind variables
Introduction to SQL and Application Tuning
100. Simplicity in Application Design
• Simple tables
• Well-written SQL
• Indexing only as required
• Retrieving only required information
101. Data Modeling
• Accurately represent business practices
• Focus on the most frequent and important
business transactions
• Use modeling tools
• Appropriately normalize data (OLTP versus DW)
102. Table Design
• Compromise between flexibility and performance:
• Principally normalize
• Selectively denormalize
• Use Oracle performance and management features:
• Default values
• Constraints
• Materialized views
• Clusters
• Partitioning
• Focus on business-critical tables
103. Index Design
• Create indexes on the following:
• Primary key (automatically created)
• Unique key (automatically created)
• Foreign keys (good candidates)
• Index data that is frequently queried (select list).
• Use SQL as a guide to index design.
104. Using Views
• Simplifies application design
• Is transparent to the developer
• Can cause suboptimal execution plans
105. SQL Execution Efficiency
• Good database connectivity
• Minimizing parsing
• Share cursors
• Using bind variables
106. Writing SQL to Share Cursors
• Create generic code using the following:
• Stored procedures and packages
• Database triggers
• Any other library routines and procedures
• Write to format standards (improves readability):
• Case
• White space
• Comments
• Object references
• Bind variables
107. Performance Checklist
• Set initialization parameters and storage options.
• Verify resource usage of SQL statements.
• Validate connections by middleware.
• Verify cursor sharing.
• Validate migration of all required objects.
• Verify validity and availability of optimizer statistics.
109. • Clustering factor
• Integrity Constraints are Important
• Reasons for Inefficient SQL Performance
• Using Bind Variables
• Restructuring SQL Statements
• Shared SQL and Cursors
When and What to Tune?
110. When and What to Tune?
The Clustering Factor
The clustering factor is a number that represents the degree to
which data is randomly distributed in a table, relative to the order of an index.
In simple terms, it is the number of “block switches” while reading a
table using an index.
111. When and What to Tune?
The above diagram shows how scattered the rows of the table are.
The first index entry (from the left of the index) points to the first data block and
the second index entry points to the second data block.
So during an index range scan or full index scan, Oracle has to switch
between blocks and revisit the same block more than once, because
the rows are scattered.
The number of times Oracle makes these switches is termed
the “clustering factor”.
112. When and What to Tune?
The above image represents a “good CF”.
During an index range scan, Oracle does not have to jump to the next
data block, as most of the index entries point to the same data block.
This helps significantly in reducing the cost of your SELECT statements.
The clustering factor is stored in the data dictionary and can be viewed in
dba_indexes (or user_indexes).
Clustering factor.sql
index fragmentation impact on performance.sql
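To judge whether a clustering factor is good or bad, compare it with the table's block and row counts. A sketch on the demo EMP table, assuming statistics have been gathered.

```sql
-- A clustering factor close to BLOCKS means the rows are stored in index order (good);
-- a value close to NUM_ROWS means the rows are scattered (poor).
SELECT i.index_name,
       i.clustering_factor,
       t.blocks,
       t.num_rows
FROM   user_indexes i
JOIN   user_tables  t ON t.table_name = i.table_name
WHERE  i.table_name = 'EMP';
```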
113. Integrity Constraints are Important
Many people think of constraints as a data integrity thing, and it’s true—
they are.
But constraints are used by the optimizer as well when determining the
optimal execution plan.
The optimizer takes as inputs
• The query to optimize
• All available database object statistics
• System statistics, if available (CPU speed, single-block I/O
speed, and so on—metrics about the physical hardware)
• Initialization parameters
• Constraints
null columns differ from not nul.sql
fk adds to query performance
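A NOT NULL constraint is a simple case where a constraint changes what the optimizer is allowed to do. A sketch using a hypothetical T_DEMO table (all names are invented); the plans you actually get depend on version and statistics.

```sql
CREATE TABLE t_demo (id NUMBER, name VARCHAR2(30));

INSERT INTO t_demo
  SELECT LEVEL, 'row ' || LEVEL FROM dual CONNECT BY LEVEL <= 10000;
COMMIT;

CREATE INDEX t_demo_id_ix ON t_demo (id);

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(USER, 'T_DEMO', cascade => TRUE);
END;
/

-- ID is nullable: a B-tree index does not store entirely NULL keys,
-- so the index alone cannot answer COUNT(*).
EXPLAIN PLAN FOR SELECT COUNT(*) FROM t_demo;
SELECT * FROM TABLE(dbms_xplan.display(NULL, NULL, 'basic'));

-- With NOT NULL, every row is guaranteed to be in the index,
-- so the optimizer may choose to scan the index instead of the table.
ALTER TABLE t_demo MODIFY (id NOT NULL);

EXPLAIN PLAN FOR SELECT COUNT(*) FROM t_demo;
SELECT * FROM TABLE(dbms_xplan.display(NULL, NULL, 'basic'));
```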
114. • Reasons for inefficient SQL performance
• Stale or missing optimizer statistics
• Missing access structures
• Suboptimal execution plan selection
• Poorly constructed SQL
115. When and What to Tune?
Richard Morris:
Are there issues that crop up again and again?
Tom Kyte:
Perhaps the biggest issue is the black box approach of development. A developer will learn
everything they can about the procedural language they're using. However, they don't learn
about the database that they're using or other packages that might be involved……
Richard Morris:
Do you think then that poor education is to blame? That somehow it’s got worse over the years
rather than getting better?
Tom Kyte:
No, it hasn’t changed. When I get up on stage at a seminar and I talk about bind variables I
start by saying that for 16 years I’ve been talking about the same thing but each year the
problem is the same. Why? Because universities are trying to teach students theory and
algorithms and things like that, they’re not teaching them how to write production quality code.
They don’t teach them how to debug or how to instrument, they don’t teach them how to
defensively program. They just teach them how to write a compiler in Lisp which frankly doesn’t
translate very well into IT.
116. Using Bind Variables
Oracle automatically notices when applications send similar SQL statements to the database.
The SQL area used to process the first occurrence of the statement is shared- that is, used for
processing subsequent occurrences of that same statement.
Therefore, only one shared SQL area exists for a unique statement.
Because shared SQL areas are shared memory areas, any Oracle process can use a shared SQL
area.
The sharing of SQL areas reduces memory use on the database server, thereby increasing system
throughput.
In evaluating whether statements are similar or identical, Oracle considers SQL statements issued
directly by users and applications as well as recursive SQL statements issued internally by a DDL
statement.
One of the first stages of parsing is to compare the text of the statement with existing statements in
the shared pool to see if the statement can be shared.
If the statement differs textually in any way, then Oracle does not share the statement.
Exceptions to this are possible when the parameter CURSOR_SHARING has been set to SIMILAR
or FORCE.
cursor sharing.sql
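The difference between literals and bind variables can be seen in the shared pool. A SQL*Plus sketch on the demo EMP table, assuming SELECT access to V$SQL; the employee numbers are the usual demo values.

```sql
-- Literals: each statement text is different, so each gets its own cursor.
SELECT ename FROM emp WHERE empno = 7369;
SELECT ename FROM emp WHERE empno = 7499;

-- Bind variable: one statement text, shared by every execution.
VARIABLE empno NUMBER
EXEC :empno := 7369
SELECT ename FROM emp WHERE empno = :empno;
EXEC :empno := 7499
SELECT ename FROM emp WHERE empno = :empno;

-- Two cursors for the literal versions, one for the bind version
-- with EXECUTIONS counting both runs.
SELECT sql_text, executions, parse_calls
FROM   v$sql
WHERE  sql_text LIKE 'SELECT ename FROM emp WHERE empno%';
```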
117. ADAPTIVE BINDING
DBAs are always encouraging developers to use bind variables, but when bind variables are used
against columns containing skewed data they sometimes lead to less than optimum execution plans.
This is because the optimizer peeks at the bind variable value during the hard parse of the statement,
so the value of a bind variable when the statement is first presented to the server can affect every
execution of the statement, regardless of the bind variable values.
Oracle uses Adaptive Cursor Sharing to solve this problem by allowing the server to compare the
effectiveness of execution plans between executions with different bind variable values.
If it notices suboptimal plans, it allows certain bind variable values, or ranges of values, to use alternate
execution plans for the same statement.
This functionality requires no additional configuration.
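Whether Adaptive Cursor Sharing has become active for a statement is visible in V$SQL. A sketch for Oracle 11g or later on the demo EMP table; the `/* acs_demo */` comment is just a made-up tag to find the cursor again, and with a small, evenly distributed table the flags may well stay 'N'.

```sql
VARIABLE dept NUMBER

EXEC :dept := 10
SELECT /* acs_demo */ COUNT(*) FROM emp WHERE deptno = :dept;

EXEC :dept := 30
SELECT /* acs_demo */ COUNT(*) FROM emp WHERE deptno = :dept;

-- IS_BIND_SENSITIVE: the optimizer peeked at the bind and thinks the plan may depend on it.
-- IS_BIND_AWARE: the cursor now selects plans according to the bind values.
-- Additional child cursors appear when different values get different plans.
SELECT sql_id, child_number, executions,
       is_bind_sensitive, is_bind_aware, is_shareable
FROM   v$sql
WHERE  sql_text LIKE 'SELECT /* acs_demo */%';
```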
118. • Restructuring SQL Statements
reconstruct sql queries.sql
1. SELECT COUNT(*) FROM products p
   WHERE prod_list_price <
   1.15 * (SELECT avg(unit_cost) FROM costs c
   WHERE c.prod_id = p.prod_id)
2. SELECT * FROM job_history jh, employees e
   WHERE substr(to_char(e.employee_id),2) =
   substr(to_char(jh.employee_id),2)
3. SELECT * FROM orders WHERE order_id_char = 1205
4. SELECT * FROM employees
   WHERE to_char(salary) = :sal
5. SELECT * FROM parts_old
   UNION
   SELECT * FROM parts_new
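Some of these statements have straightforward rewrites. The versions below are one possible restructuring, not necessarily the one in the accompanying script, and each is equivalent to the original only under the assumption stated in its comment.

```sql
-- Compare the character column with a character literal, so no implicit
-- TO_NUMBER is applied to the column and an index on it stays usable.
-- Assumes the IDs are stored without padding or leading zeros.
SELECT * FROM orders WHERE order_id_char = '1205';

-- Convert the bind variable instead of the column, so an index on SALARY
-- can be used. Assumes :sal always holds a valid number.
SELECT * FROM employees
WHERE  salary = TO_NUMBER(:sal);

-- UNION ALL skips the duplicate-eliminating sort. Equivalent to UNION only
-- if neither table contains duplicate rows and the two tables do not overlap.
SELECT * FROM parts_old
UNION ALL
SELECT * FROM parts_new;
```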
119. Various sql and pl/sql techniques to improve
performance
Advanced SQL and Application Topics