1. Oracle Database Capacity Analysis
Using Statistical Methods
Ajith Narayanan
Dell IT
Nuremberg, Germany, 19th Nov 2013
2. Disclaimer
The views/contents in this slides are those of the author and do not necessarily
reflect that of Oracle Corporation and/or its affiliates/subsidiaries.
The material in this document is for informational purposes only and is published
with no guarantee or warranty, express or implied..
3. Who Am I?
Ajith Narayanan
Dell IT
Bangalore, India
9 years of Oracle [APPS] DBA experience.
Blogger :- http://oracledbascriptsfromajith.blogspot.com
Website Chair:- http://www.oracleracsig.org – Oracle RAC SIG (Sep 2011 –
Sep 2013)
Member:- IOUG, OAUG & AIOUG
4. Agenda
Why system capacity? Are you being served properly?
What Are The Different Statistical Methods Of Analyzing The System Capacity Of Database Tier
Simple Math – CPU Analysis.
Linear Regression Model for CPU.
Queuing Theory – CPU or I/O Subsystems.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Q&A~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5. Why System Capacity? Are You Being Served Properly?
How long can you wait to have your favorite
dish at your favorite restaurant ? Waiter says
you will have to wait for another 1 hour to be
served.
Imagine a angry waiter on your table, refusing
to serve you. (No Service)
Obviously, you are not happy !!
Any computer system serves you in similar
fashion if not properly sized.
Capacity Planning plays a vital role in such
situations.
Proper capacity analysis done by restaurant
owner could have helped the waiter serve you
your favorite dish faster .
6. What are the different statistical methods of analyzing the system
capacity a database server?
Introduction of Capacity Analysis Methods
Simple Math - This model can take single component inputs either application or technical metrics. This method
is usually involved with short-duration projects, The precision is usually low, but sufficient when used appropriately.
Linear Regression Analysis – This method is typically used to determine how much of some business activity
can occur before the system runs out of gas.
Queuing Theory – This method uses Queuing theory used in telecom networks for high precision capacity base
lining & forecasting.
8. Simple Math - CPU
Scenario:- The customer wants to downsize the infrastructure and check
the CPU requirements for a 4-node 16 CPU Database tier.
On Host1 , during peak utilization timings (1/5/09 3:00 AM- 4:00 AM), CPU utilization was 39.27%.
CPU consumption = 0.3927 X 14,400 s (per hour available capacity) = 5654.88s
Similarly, on Host2, Host3, Host4 observed Peak CPU consumption were 43.87% (Avg) 6173.28 s,
41.22 %(Avg) 5935.68 s and 53.11% (Avg) 7503.84s respectively.
Total CPU requirement = 25267.68 s/Hour
Calculate Utilization with 16 CPU’s
~~~~~~~~~~~~~~~~~~~~~~~~
Considering 10% overhead for OS, estimated CPU utilization on 8-core (Dell 1950 QC)
Physical Server = (25267.68 + 25267.68 *10%) / (16*60*60) = 0.4825425 = 48.25%
Calculate Utilization with 12 CPU’s
~~~~~~~~~~~~~~~~~~~~~~~~
Considering 10% overhead for OS , estimated CPU utilization on 4-core (Dell 1950 QC)
Physical Server = (25267.68 + 25267.68 *10%)/ (12*60*60) = 0.64339 = 64.33%
Calculate Utilization with 8 CPU’s
~~~~~~~~~~~~~~~~~~~~~~~~
Considering 10% overhead for OS , estimated CPU utilization on 4-core (Dell 1950 QC)
Physical Server = (25267.68 + 25267.68 *10%)/ (8*60*60) = 0.87735 = 87.73%
11. How many of you think RAM access is 10,000 times faster than Physical disk access?
In real world, LIO is only 25-100 times cheaper than PIO
- …not 1,000s or 10,ooos
Reason – Internal locks & latch serialization mechanisms involved.
Targeting only PIO counts(or high cache hit ratios) during SQL optimization is an important pitfall to
avoid.
Even with no PIOs, a query can still be outrageously inefficient
-LIO are a critical component of query cost
LIO is Expensive?
12. Oracle Logical I/O (LIO)
- Oracle requests a block from the database buffer cache
-LIO statistics are captured in db block gets, consistent gets, session logical reads & buffer is
pinned count statistics
- In raw trace data, It’s cr + cu
Oracle physical I/O (PIO)
Oracle requests one or more blocks from the operating system
Might be “physical,” might not be
PIO statistics are captured in the physical reads statistics
In raw trace data, it’s p or pr
Note:- There are occasions were Oracle manipulates database blocks without using the database buffer
cache (e.g. blocks read by PQO access paths, and blocks read from temporary segments)
Definitions: LIO and PIO
13. Linear Regression Analysis
WHAT IS LINEAR REGRESSION
========================
Linear regression analysis is a method for investigating relationships among variables.
eg. Logical Rds Vs CPU utilization
Relation is y=mx +c (equation of straight line), Here c represents the y-intercept of the line and m
represents the slope.
Here the variable “LIO" is used to predict the value of CPU Utilization .So, “LIO" becomes the explanatory
variable. On the other hand, the variable whose value is to be predicted is known as response or
dependent variable
Generally response variable is denoted as Y & predicted variable denoted as X
14. Linear Regression Analysis
Scenario:- Database capacity forecasting
Observations:
•CPU utilization is highly correlated with Logical Reads. Correlation coefficient 0.98 in (ideal 1.0) . Both the CPU usage & Logical
Reads trend line are similar. (Spikes are matching 98%)
15. Linear Regression Analysis
Scenario:- Database capacity forecasting
Observations:
• Current Capacity can handle additional 2000%
workload by maintaining CPU utilization at 38.62%
(Considering additional OS overhead etc.)
• Moreover, there is a downsizing possibility seen for
reducing the box capacity from 8 CPU to 4 CPU (As
indicated by rightmost column)
19. Queuing Theory
Queuing Theory is basically an upgrade to essential forecasting
mathematics. This method can do a baseline of the current capacity along
with its forecasting(Scalability/Downgrading possibility)
Erlang C Forecasting
===============
Agner Krarup Erlang(1878-1929) is the man behind this math. He studied the performance of
telephone networks.
When Erlang C function is used, we do not apply the essential forecasting response time formulas,
Instead, we have a single new queue time formula that can be applied to both CPU & I/O
subsystem.
For CPU subsystems, there is only one queue, So the entire system arrival rate is (λsys), But for IO
subsystems, the arrival rate at each queue (λq) is the system arrival rate(λsys) divided by the number of
I/O devices.
U= St λq Q= λq Qt Ec= Erlang(m, St, λq) Qt= Ec St
--------- -----------
m m(1-U)
22. Create An Impact With Capacity Analysis
So let’s see an example of 4-cpu Intel boxes and put them together in a cluster with Oracle11G and RAC
on top:
Price for the hardware: About US$15,000 or so.
Price for the OS (Linux): About US$ 0.5- or thereabout (it depends!)
Price for Oracle w/ RAC: US$480,000,-
So that’s half a million to Oracle. Put another way: It’s 1 dollar to the box movers for every 32 dollars paid
for Oracle RAC.
Psychologically it’s hard for the customers to understand that they have to buy something that expensive
to run on such cheap hardware. The gap is too big, and Oracle will need to address it soon.
There’s nothing like RAC on the market, but that doesn’t mean you have to buy RAC. A usual joke exists
that it’s like buying a car for US$10,000,- that has all the facilities you need from a good and stable car.
Airbags and ABS brakes are US$500,000,- extra, by the way. Well, airbags and ABS are wonderful to have
and they increase your security. But it’s a lot of money compared to the basic car price.