Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1

Troubleshoo4ng
the
Most
Complex
Performance

Issue
I’ve
ever
seen

Tanel
Poder

hAp://blog.tanelpoder.com

hAp://tech.e2sn.com

www.enkitec.com

1

Intro:
About
me

•  Tanel
Põder

Oracle
Database
Performance
geek

Exadata
Performance
geek

Hadoop
Performance
geek

•  Enkitec

•  Consultant

•  Researcher

•  Technology
Evangelist

•  Just
moved
to
Dallas

•  ANer
Tallinn
-‐>
Stockholm
-‐>
London
-‐>

Cancun
-‐>
Singapore

www.enkitec.com

Expert
Oracle
Exadata

book

(with
Kerry
Osborne
and

Randy
Johnson
of
Enkitec)

2

Two
issues
-‐
actually

•  For
warm-‐up:

•  cursor
pin:
S
wait
events
and
sporadic
CPU
spikes

•  Read
more
from
my
blog
entry:

•  hAp://blog.tanelpoder.com/2010/04/21/cursor-‐pin-‐s-‐waits-‐sporadic-‐cpu-‐
spikes-‐and-‐systema4c-‐troubleshoo4ng/

•  Or
just
google
for
“cursor
pin
s”

www.enkitec.com

3

Environment

• 
• 
• 
• 

High-‐concurrency,
high-‐visibility
OLTP
database

Oracle
11.1.0.7
single-‐instance,
dedicated
server
processes

HP-‐UX
on
Itanium

32
CPUs,
128
GB
RAM

•  Thousands
of
end
users

•  Mul4ple
WebLogic
applica4on
servers
talking
to
database
via

connec4on
pools

www.enkitec.com

4

The
problem

• 
• 
• 
• 

Sporadic
extreme
slowness
of
Oracle
DB
and
the
server

Slowness
lasts
for
1
..
20
minutes
at
a
4me…

Queries
don’t
answer
or
extremely
slow

Can’t
even
log
on
to
OS
during
that
4me

•  New
SSH
connec4ons
succeded
once
the
spike
was
over

•  It
takes
minutes
to
run
simple
OS
commands
during
the
problem
4me

•  This
is
a
global
server-‐wide
problem
–
everyone
complains!

So,
the
scope
of
this
problem

is
global,
server-‐wide.

Therefore
we
can
use
global,

server-‐wide
metrics
to

diagnose
the
problem.

www.enkitec.com

5

Let’s
pick
and
diagnose
one
occurrence
of
this
problem

•  The
database
response
4mes
extremely
bad
again
around

18:10
and
this
lasted
for
about
5
minutes…

•  If
it’s
the
users
who
report
the
problem
(as
opposed
to
applica4on

side
measurements),
then
there
may
be
some
discrepancies
in
the

user
reported
4mes
vs
actual
problem
4me

www.enkitec.com

6

Ini4al
AWR
Report

Number
of
sessions

has
grown
by
~500!

Snap Id
Snap Time
Sessions Curs/Sess
--------- ------------------- -------- --------Begin Snap:
61921 30-Oct-10 18:00:10
2,383
28.9
Using
a
20
minute

End Snap:
61922 30-Oct-10 18:20:20
2,863
24.7
report
for
diagnosing

Elapsed:
20.17 (mins)
a
5
minute
problem?!

DB Time:
559.31 (mins)
Avg
% DB
Event
Waits
Time(s)
(ms)
time Wait Class
--------------------------- --------- ----------- ------ ------ ---------db file sequential read
2,135,668
21,468
10
64.0 User I/O
DB CPU
5,860
17.5
log file sync
92,720
1,498
16
4.5 Commit
read by other session
91,676
1,307
14
3.9 User I/O
SQL*Net message from dblink
525
1,132
2155
3.4 Network

This
66%
idle
is
an

average
over
20

minutes!

Host CPU (CPUs:
~~~~~~~~

32 Cores:
32 Sockets:
32)
Load Average
Begin
End
%User
%System
%WIO
%Idle
--------- --------- --------- --------- --------- --------0.37
0.31
16.3
17.6
12.1
66.1
www.enkitec.com

7

ASH
data
(shown
in
OEM)

•  Average
ac4ve
sessions
showed
something
diﬀerent

•  Note
that
this
data
is
from
another
period
of
4me
when
a
similar
spike
happened

•  In
worst
4mes
there
were
up
to
220
ac4ve
sessions
trying
to
be
on
CPU!

•  Thanks
to
beAer
granularity
we
see
the
spikes
instead
of
some
20-‐minute
or
hourly

averages…

•  The
problem
with
ASH
samples
is
that
it
looks
into
session
state
from
inside
Oracle

•  Perhaps
the
starva5on
is
due
to
some
other
applica5on
/
instance
in
the
server?

www.enkitec.com

8

How
many
logons
were
done?

Number
of
sessions

has
grown
by
~500!

Snap Id
Snap Time
Sessions Curs/Sess
--------- ------------------- -------- --------Begin Snap:
61921 30-Oct-10 18:00:10
2,383
28.9
End Snap:
61922 30-Oct-10 18:20:20
2,863
24.7
Elapsed:
20.17 (mins)
DB Time:
559.31 (mins)
Statistic
Total
per Second
-------------------------------- ------------------ -------------index fetch by key
24,174,148
19,971.0
index scans kdiixs1
24,565,055
20,293.9
leaf node 90-10 splits
5,865
4.9
leaf node splits
14,529
12.0
lob reads
34,480
28.5
lob writes
1,623,273
1,341.0
lob writes unaligned
1,623,266
1,341.0
logons cumulative
2,550
2.1
messages received
133,740
110.5
messages sent
133,740
110.5
min active SCN optimization appl
538,358
444.8
no buffer to keep pinned count
6,331
5.2
no work - consistent read gets
146,703,542
121,196.1
opened cursors cumulative
4,168,700
3,443.9

www.enkitec.com

2.1
logons
per

second,
but
we

don’t
know
how

these
logons
are

per Trans
------------distributed
over

43.6
the
20
minute

44.3
period!

0.0
0.0
0.1
2.9
2.9
0.0
0.2
0.2
1.0
0.0
264.7
7.5

9

OS
level
metrics
don’t
lie
(well,
they
do,
but
less
;-‐)

|
|
|
Date
| Time |CPU % |
10/30/2010|17:45:00| 25.06|
10/30/2010|17:50:00| 24.77|
10/30/2010|17:55:00| 24.60|
10/30/2010|18:00:00| 25.95|
10/30/2010|18:05:00| 22.88|
10/30/2010|18:10:00| 66.89|
10/30/2010|18:15:00| 24.51|
10/30/2010|18:20:00| 25.47|
10/30/2010|18:25:00| 28.38|

Phys
|
IO Rt
|
7149.5|
5334.8|
7176.4|
7556.2|
5379.5|
4544.6|
7544.9|
5144.0|
10139.5|

Phys
|Memory|Pg Out | VM Pg
|
KB Rt | %
| Rate | Scan Rt |
145817.6| 74.07|
0.0|
0.0|
60928.0| 73.98|
0.0|
0.0|
186368.0| 73.98|
0.0|
0.0|
192307.2| 74.11|
0.0|
0.0|
67584.0| 74.15|
0.0|
0.0|
58060.8| 76.97|
0.0|
0.0|
159334.4| 76.40|
0.0|
0.0|
59187.2| 75.04|
0.0|
0.0|
151552.0| 74.37|
0.0|
0.0|

1)  What
does
the
Time
18:10:00
mean,
beginning
of
the

monitoring
interval
or
end?

2)  66.89%
busy
during
5
minutes
may
actually
mean
100%
busy

during
~3
minutes
out
of
5,
but
we
don’t
know
that
for
sure

without
measuring
in
more
detail
(beAer
granularity)…

www.enkitec.com

10

Measuring
CPU
u4liza4on
in
more
detail

The
spike
lasted
from

18:11
to
18:14
(3
min)

Around
90%
in

Kernel
mode!!!

www.enkitec.com

11

Checkpoint
–
measured
evidence
so
far

•  Fact:
We
have
a
100%
CPU
u4liza4on
spike,
las4ng
3
minutes

•  Fact:
90%
of
it
is
spent
in
KERNEL
mode

•  Fact:
We
have
over
2500
logons
done
during
20
minute

period

•  2.1
logons
/
second
on
average
(which
doesn’t
sound
bad)

•  Kernel
mode
CPU
usage
is
usually
caused
by
system
calls

•  …or
some
internal
kernel
thread
ac4vity

100%
CPU
usage
doesn’t
always
automa4cally
mean
you
have
a
serious
CPU

starva4on
problem.
The
CPU
runqueue
length
would
indicate
you
how
much

starva4on
(wai4ng
for
CPU
service)
there
is.
However
seeing
90%
of
CPU
used
in

KERNEL
mode
is
deﬁnitely
not
normal
for
an
Oracle
database
server.

www.enkitec.com

12

Diagnosing
90%
kernel-‐mode
CPU
usage
spikes…

1.  Systema4c

• 
• 
• 
• 

Break
down
this
90%
of
Kernel
mode
CPU
usage

Proﬁling!

Oh,
this
is
a
produc4on
system
and
the
problem
is
acute
&
ongoing

On
Solaris,
I’d
have
used
Dtrace
stack()
probe
to
record
OS
kernel

stack
traces
most
common
on
CPU
(google
for
dstackprof)

• 

• 

Or
lockstat
as
it
reports
spins
on
spinlocks
(which
consume
kernel
CPU)

But
this
was
HP-‐UX
and
I
didn’t
know
the
tools
needed

• 
• 

But
I
knew
what
numbers
I
wanted
to
see!

We
sent
a
request
to
HP-‐UX
support:

“How
do
we
measure
&
break
down
where
is
kernel
mode
CPU
used?”

2.  Check
for
usual
suspects

• 

Fast,
cheap
checks
to
rule
out
or
ﬁnd
known
troublemakers

www.enkitec.com

13

Kernel
mode
CPU
usage
spikes
–
the
usual
suspects

•  Before
star4ng
the
systema4c
troubleshoo4ng
&
drilldown,

do
quick
checks
for
usual
suspects

•  Remember,
the
client
has
a
business
problem,
4me
is
of
essence…

1.  Logon
(or
logoﬀ)
storms

• 

Spawning,
ini4alizing
new
processes,
opening
ﬁles
and
aAaching
to

SGA
means
system
calls,
kernel
CPU
usage

2.  Oracle
code
gevng
into
some
crazy
loop
(due
to
a
bug)

• 

Semop(),
yield(),
read
/proc/…,
getrusage(),
etc
loop

3.  OS
kernel
spinlock
conten4on

• 
• 

Not
so
usual
suspects
really…

Variety
of
reasons…

ONen
due
to
bugs
in
OS
or
some
kernel
module

www.enkitec.com

14

Measuring
logon
storms

•  Use
the
AUD$
records
or
“logons
cumula@ve”
number
from

V$SYSSTAT
or
AWR,
right?

•  Wrong!

•  logons
cumula5ve
number
is
incremented
by
the
session
itself

–
aNer
it
has
logged
on,
the
same
applies
to
audit
records!

1. 
2. 
3. 
4. 
5. 

ANer
the
listener
connec4on
has
been
established…

The
process
has
been
started…

It
has
aAached
to
SGA
SHM
segments…

Audit
ﬁle
has
been
wriAen
(if
needed)
…

Process,
session
SGA
structures
have
been
created

• 

Memory
from
OS
and
shared
pool
allocated
(shared
pool
latches!)

6.  Session
has
been
authen4cated

7.  Then
the
logons
cumula4ve
is
incremented!

www.enkitec.com

The
logon
storm
may

have
started
way

before
these
logons

ﬁnally
succeeded!

15

Measuring
logon
storms

•  Logon
storms
should
be
measured
at
the
listener
level

•  Process
listener.log
using
a
script:

$ tail listener.log
30-OCT-2010 23:22:03
30-OCT-2010 23:22:08
30-OCT-2010 23:22:08
30-OCT-2010 23:22:09

*
*
*
*

(CONNECT_DATA=…)
(CONNECT_DATA=…)
(CONNECT_DATA=…)
service_update *

* establish * E2SNDB * 0
E2SNDB * 0

$ fgrep "30-OCT-2010" listener.log | fgrep "establish" |
awk '{ print $1 " " $2 }' | awk -F: '{ print $1 }'
|
sort | uniq –c
…
5564 30-OCT-2010 20
5571 30-OCT-2010 21
5703 30-OCT-2010 22
5703
logons
during

4828 30-OCT-2010 23
22:00
–
22:59

…
www.enkitec.com

16

Measuring
logon
storms

•  Logon
storms
should
be
measured
at
the
listener
level

•  Process
listener.log
using
a
script:

$ fgrep "30-OCT-2010 22:" listener.log | fgrep "establish"
|
awk '{ print $1 " " $2 }' | awk -F: '{ print $1 ":" $2 }' |
sort | uniq –c
…
88 30-OCT-2010 22:00
120 30-OCT-2010 22:01
94 30-OCT-2010 22:02
How
many
connec4ons

94 30-OCT-2010 22:03
listener
established
every

95 30-OCT-2010 22:04
minute
(this
data
from
non-‐
120 30-OCT-2010 22:05
problem
4me)

79 30-OCT-2010 22:06
101 30-OCT-2010 22:07
85 30-OCT-2010 22:08
100 30-OCT-2010 22:09
85 30-OCT-2010 22:10
89 30-OCT-2010 22:11
83 30-OCT-2010 22:12
93 30-OCT-2010 22:13
www.enkitec.com

17

Con4nuing
the
OS
kernel
mode
CPU
usage
diagnosis

Let’s
trace
system

calls
by
one
CPU

heavy
Oracle
process

# tusc -cp 8021
( Attached to process 8021 ("oracleXYZ (LOCAL=NO)") [64-bit] )
( Detaching from process 8021 ("oracleXYZ (LOCAL=NO)") )
Syscall
open
----Total

Interes4ngly
3
open()
syscalls

take
over
a
second
in
total.

Could
this
be
caused
by
the

general
CPU
starva4on
in
the

server?

Seconds
1.05
----1.05

Calls
3
----3

The
next
step
should
have

been
to
check
which
ﬁle
did

we
try
to
open
(but
the
spike

ended
before
we
could
do

that)

www.enkitec.com

Errors
3
----3

Seeing
errors
isn’t
a

problem
itself,
as
this

is
how
Oracle
and
libs

some4mes
check
for

existence
of
a
ﬁle…

18

Con4nuing
the
OS
kernel
mode
CPU
usage
diagnosis

•  HP-‐UX
support
got
back
to
us
and
recommended
the
use
of

runki
tool
for
measuring
kernel
CPU
u4liza4on
in
detail

•  It
had
to
be
installed
as
root

•  It
measured
a
lot
of
things
happening
in
kernel,
wri4ng
a
big
output

ﬁle

•  on
Solaris
there’s
a
tool
called
TNF
trace,
AIX
has
trace
command
for
such

full
kernel
tracing

•  Now
we
had
to
wait
for
the
problem
to
happen
again

•  Didn’t
have
to
wait
for
too
long…

•  We
sent
the
raw
trace
dump
to
HP
Support,
so
they
could
run

something
like
“tkprof”
on
that
traceﬁle

•  It
basically
just
summed
up
the
kernel
spinlock
wait,
spin
events
by
lock,

object
locked
etc

www.enkitec.com

19

kiprof
–
proﬁled
runki
output

Total Hardclock traces: 30239
================================
State
Count
Percent
USER
7130
23.58
SYS
22965
75.94
IDLE
24
0.08
SSYS
120
0.40

That’s
basically

spinning
for
locks

(adap4ve
decisions
to

spin
or
wait)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Kernel Functions executed during profile
Count
Pct State Function
Self-‐explanatory.
We

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15615 51.64% SYS
timed_preArbitration
are
spinning
for
a
lock

7130 23.58% USER
OTHER
1275
4.22% SYS
spinlock
1008
3.33% SYS
wait_for_lock_spinner
VxFS
directory
block

488
1.61% SYS
vx_dirbread
read?

464
1.53% SYS
vx_bc_getblk
417
1.38% SYS
preArbitration
338
1.12% SYS
vx_dirscan
VxFS
directory

291
0.96% SYS
vx_bc_brelse

contents
scan
!!!

www.enkitec.com

20

Drilling
down
to
kernel
spinlock
spinning

The
main
spinlock
experiencing
spinning
was
related
to
VxFS:

spn%kern
7.24
0.11
0.04
0.04
0.01

cumpct
7.24
7.36
7.40
7.44
7.45

seconds
71.96
1.14
0.44
0.42
0.12

spn%cpu lock name
3.77 FS:vxfs:bc_freelist_lock spin
0.06 FS:vxfs:inode spin for sleep lock
0.02 FS:vxfs:i_spinspin lock
0.02 Sleep Queue lock
0.01 v_count_lock

Oracle
processes
were
the
main
ones
spinning:

spn%spn
99.17
0.60
0.08
0.05
0.03
0.03

cumpct
99.17
99.77
99.85
99.91
99.93
99.96

spnsec
73.82
0.45
0.06
0.04
0.02
0.02

usrsec
880.32
0.00
1.42
0.00
0.00
0.00

kernsec spn%kern process name
543.70
13.58 oracle
7.37
6.11 vxfsd
11.52
0.52 tnslsnr
402.60
0.01 [IDLE]
0.50
4.00 xyz
0.43
4.65 sadc

www.enkitec.com

21

Which
system
calls?

$ /opt/ktools/bin/kiinfo -kitrace pid=28679,syscalls -ts
1103_1307 | grep open | sort -r -n -k 13,13 | more
2.550292 cpu=6 seqcnt=43808107 pid=28679 ktid=526366 utid=1
open [5] err=17 ret1=12 ret2=1 syscallbeg=
3.295138
ktcsys=
0.000000 A0=0x9fffffff A1=
0xffffceb0 A2=0 A3=0xd02 A4=0 A5=432 fname='/opt/apps/oracle/
admin/xyzdb/adump/xyzdb_ora_28679_21.aud'
3.172851
ktcsys=
0.000000 A0=0x9fffffff A1=
3.066686
ktcsys=
0.000000 A0=0x9fffffff A1=
2.576237
ktcsys=
0.000000 A0=0x9fffffff A1=
www.enkitec.com

22

Audit
file
des4ona4on

•  New
audit
file
name
format
in
11g…

•  A
new
file
is
created
for
each
audit
file
where
the
SPID
collides
with
a

previous
file…

•  Every
@me
when
crea@ng
a
new
audit
file,
Oracle
has
to
check
whether

such
file
already
exists
with
suffix
_1,
then
_2,
_3,
etc..

$ cd /u01/app/oracle/admin/E2SNDB/adump
$ ls -l | head
total 4788
-rw-r----- 1 oracle dba 735 Feb 28 16:06
-rw-r----- 1 oracle dba 710 Oct 16 17:58
-rw-r----- 1 oracle dba 735 Dec 22 21:28
www.enkitec.com

e2sndb_ora_10028_1.aud

23

Shouldn’t
the
audit
files
be
created
only
for
SYSDBA
and

SYSOPER
access?

•  In
theory,
yes.

•  In
prac4ce
in
our
case,
no.

•  Bug
9744092:
EXCESSIVE
AMOUNT
OF
AUD
FILES
BEING

GENERATED
IN
11.1

•  Oracle
generated
a
new
.aud
file
for
every
new
database
connec4on!

•  Not
just
SYSOPER/SYSDBA
like
normally

•  This
is
all
despite
having
audit_trail
=
DB

•  Normally
the
.aud
files
in
audit_dump_dest
are
not
related
to
regular

audit
trail,
but
for
SYSOPER/SYSDBA
logon,
startup/shutdown
audi4ng

•  …and
if
AUDIT_SYS_OPERATIONS
=
TRUE
then
all
commands
issued
as
SYS

www.enkitec.com

24

Bug
9744092:
EXCESSIVE
AMOUNT
OF
AUD
FILES
BEING

GENERATED
IN
11.1

PROBLEM:
-------After upgrade to 11.1 the system creates 10 - 16
*aud files per minute in audit_file_dest.
Out of 9528 *.aud files that the customer uploaded,
9124 files recorded ACTION:[3] "102".
DIAGNOSTIC ANALYSIS:
-------------------The change of behavior (move audit action 102 from
aud$ to audit_file_dest when audit_trail=DB.) is due
to a fix for an unpublished bug 5476184 in 11.1.
It is not an intended feature for 11G.
WORKAROUND:
----------Manual delete of audit files
www.enkitec.com

25

Diagnosis

1.  Thanks
to
bug
9744092
and
a
behavior
change
in
Oracle
11.1

a
new
audit
file
was
created
for
each
new
connec4on
to
DB

• 

If
a
file
already
existed,
Oracle
checked
if
a
similar
file
name
with

larger
suffix
value
(_2,
_3,
_4
etc)
existed

2.  The
audit_file_dest
eventually
had
over
1.5
M
files
in
it!

• 

For
each
logon,
mul4ple
file
existence
checks
(open()
syscalls)
had
to

be
done

3.  Checking
whether
a
file
existed
(open
syscall
-‐>
directory

entry
scan)
became
very
slow
–
and
it’s
done
in
kernel
mode

• 
• 

A
spinlock
was
held
during
the
directory
entry
scan

Other
new
Oracle
processes
also
wanted
to
do
the
directory
scan,

resul4ng
in
spinlock
conten4on
and
further
Kernel
mode
CPU
usage

4.  When
the
DB
got
slow
–
app
servers
fired
up
hundreds
of

new
connec4ons
to
“make
things
faster”

• 

This
all
fed
back
to
the
problem
–
even
more
conten4on
&
spinning

www.enkitec.com

26

Limi4ng
logon
storms

Use
Oracle
Listener
connec4on
rate
limiter
(11gR1+)

listener.ora:
LISTENER=
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521)(RATE_LIMIT=5))
(ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1522)(RATE_LIMIT=10))
(ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1523))
)

Oracle
Documenta5on:
Oracle
Net
Listener
Parameters
(listener.ora)

hQp://download.oracle.com/docs/cd/B28359_01/network.111/b28317/listener.htm

Also,
it
is
possible
to
limit
logoﬀ
storm
rate

_logout_storm_rate
parameter
(instance-‐wide)

www.enkitec.com

27

Troubleshoo4ng
sporadic
system
performance
issues

Right
Data
!!!

•  Right
scope
–
if
your
problem
lasts
for
seconds,
this
should
be

the
granularity
of
your
data
too

•  OS
level
data,
in
addi4on
to
the
database
metrics

•  Ideally
OS
level
metrics
sampled
mul4ple
4mes
per
minute

www.enkitec.com

28

Conclusions

•  Logon
storms
are
evil!

•  They
will
amplify
any
performance
hiccups
as
they
cause
extra
load

just
when
the
resources
are
scarcest

•  Connec4on
pools
ﬁring
up
hundreds
of
new
connec4ons
are

evil!

•  Know
your
limits
(both
max
connec4ons
and
max
connect
rate
/
sec)

•  Here’s
a
thought:

•  If
you
have
planned
the
servers’
capacity
to
support
N-‐thousand

connec4ons
anyway
(by
allowing
connec4on
pools
grow
that
high),
why

not
create
this
amount
of
connec4ons
right
away?

•  This
would
avoid
logon
storms
during
worst
4mes
as
all
connec4ons
have

already
been
created!

www.enkitec.com

29

Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1

Similar to Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1 (20)

More from Tanel Poder

More from Tanel Poder (11)

Recently uploaded

Recently uploaded (20)

Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1