2. Oracle Real Application Clusters (RAC) 12c Release 2 – For Continuous Availability
Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
Markus Michalewicz
Senior Director of Product Management, Oracle RAC Development
Markus.Michalewicz@oracle.com
@OracleRACpm
http://www.linkedin.com/in/markusmichalewicz
http://www.slideshare.net/MarkusMichalewicz
3. Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
4. Oracle Maximum Availability Architecture (MAA)
Production Site:
• RAC – Scalability, Server HA
• ASM – ASM mirroring
• Flashback – Human error correction
• Application Continuity – Application HA
Across sites:
• Edition-based Redefinition, Online Redefinition, Data Guard, GoldenGate – Minimal downtime maintenance, upgrades, migrations
• Active Data Guard – Data protection, DR, query offload
• GoldenGate – Active-active replication, heterogeneous active replica
• RMAN, Oracle Secure Backup – Backup to disk, tape or cloud
• Enterprise Manager Cloud Control – Coordinated site failover
• Global Data Services – Service failover / load balancing
6. Program Agenda
1. High Availability Improvements
2. Continuous Availability Features
8. RAC High Availability Improvements
• Reduced failure detection time for an increased number of monitored components
• Reduced time to recover from local failures due to reduced reconfiguration times
• Prevention of system or database failures using ML-based real-time analysis of diagnostic data
10. More Components Checked More Frequently
Oracle Clusterware checks…
• …more components: multiple public networks checked with Ping Targets
• …more frequently: VIPs checked every second; 30-second CSS misscount default, and zero brownout allows for less
• …more efficiently: agent changes allow for more checks using fewer resources; data from auxiliary systems is taken into account; Engineered System-optimized failure detection and fencing
• …and offline: offline monitoring of failed components for faster recovery
…to detect failures sooner and to recover faster.
12. Smart Fencing
13. Node Eviction Basics
• Pre-12.2, node eviction follows a rather "ignorant" pattern
– Example in a 2-node cluster: the node with the lowest node number survives.
• Customers must not base their application logic on which node survives the split brain.
– As this may(!) change in future releases.
http://www.slideshare.net/MarkusMichalewicz/oracle-clusterware-node-management-and-voting-disks
14. Node Weighting in Oracle RAC 12c Release 2
Idea: everything else being equal, let the majority of work survive.
• Node Weighting is a new feature that considers the workload hosted in the cluster during fencing
• The idea is to let the majority of work survive, if everything else is equal
– Example: in a 2-node cluster, the node hosting the majority of services (at fencing time) is meant to survive
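The tie-break the slide describes can be pictured with a small sketch. This is purely illustrative: the function, its input shape, and the lowest-node-number fallback are assumptions drawn from the surrounding slides, not Oracle's implementation.

```python
# Illustrative model of Node Weighting: when a split brain partitions the
# cluster, prefer the sub-cluster hosting the most services; if that is a
# tie, fall back to the pre-12.2 rule (lowest node number survives).

def choose_surviving_subcluster(subclusters):
    """subclusters: list of dicts like {"nodes": [node_numbers], "services": count}."""
    most = max(s["services"] for s in subclusters)
    candidates = [s for s in subclusters if s["services"] == most]
    if len(candidates) == 1:
        return candidates[0]
    # Fallback scheme: the sub-cluster containing the lowest node number survives.
    return min(candidates, key=lambda s: min(s["nodes"]))

split = [
    {"nodes": [1], "services": 2},
    {"nodes": [2], "services": 5},  # hosts the majority of services
]
print(choose_surviving_subcluster(split)["nodes"])  # [2]
```

With equal service counts the sketch degrades to the old behavior, mirroring the "fallback scheme" mentioned on the next slide.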
15. Let's Define "Equal"
• A three-node cluster will benefit from Node Weighting only if three equally sized sub-clusters are built as a result of the failure, since two differently sized sub-clusters are not equal.
• Secondary failure consideration (e.g. a public network card failure creating a "conflict") can influence which node survives. Secondary failure consideration will be enhanced successively.
• A fallback scheme is applied if considerations do not lead to an actionable outcome.
16. CSS_CRITICAL – Fencing with Manual Override
• CSS_CRITICAL can be set on various levels / components to mark them as "critical" so that the cluster will try to preserve them in case of a failure.
• CSS_CRITICAL will be honored if no other technical reason prohibits survival of the node which has at least one critical component at the time of failure.
• A fallback scheme is applied if CSS_CRITICAL settings do not lead to an actionable outcome.

crsctl set server css_critical {YES|NO}   (+ server restart)

srvctl modify database -help | grep critical
…
-css_critical {YES | NO}   Define whether the database or service is CSS critical

(Diagram: node eviction despite the workload; the workload will fail over – a "conflict" case.)
17. Recovery Buddies
18. Near Zero Reconfiguration Time with Recovery Buddies (a.k.a. Buddy Instances)
Recovery Buddies:
• Track block changes on the buddy instance
• Quickly identify blocks requiring recovery during reconfiguration
• Allow rapid processing of transactions after failures
19. Near Zero Reconfiguration Time with Recovery Buddies – How it works under the hood
• Buddy instance mapping is simple (random) – e.g. I1 → I2, I2 → I3, I3 → I4, I4 → I1
• Recovery buddies are assigned during startup
• RMS0 on each recovery buddy instance maintains an in-memory area for redo log changes
• The in-memory area is used during recovery – this eliminates the need to physically read the redo
(Diagram: cluster "MyCluster" with instances I1–I4; I1's recovery buddy runs on I2, I2's on I3, I3's on I4, and I4's on I1.)
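The round-robin mapping quoted on the slide (I1 → I2, I2 → I3, …) can be sketched in a few lines. The function name and structure are illustrative only; the slide notes the real assignment is effectively random.

```python
# Illustrative round-robin buddy mapping: each instance's redo changes are
# shadowed by the next instance in the list, wrapping at the end.

def assign_recovery_buddies(instances):
    n = len(instances)
    return {inst: instances[(i + 1) % n] for i, inst in enumerate(instances)}

buddies = assign_recovery_buddies(["I1", "I2", "I3", "I4"])
print(buddies)  # {'I1': 'I2', 'I2': 'I3', 'I3': 'I4', 'I4': 'I1'}
```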
20. How Recovery Buddies Help Reduce Recovery Time
Without Recovery Buddies: Detect → Evict → Elect Recovery → Read Redo → Apply Recovery
With Recovery Buddies: the same phases run, but reading redo and applying recovery are sharply shortened because the required redo changes are already tracked in memory on the buddy instance.
Up to 4x faster.
21. Database Hang Manager
22. Overlooked and Underestimated – Hang Manager
Why having a Hang Manager is useful:
• Customers experience database hangs for a variety of reasons
– High system load, workload contention, network congestion, general errors, etc.
• Before Hang Manager was introduced with Oracle RAC 11.2.0.2:
– Oracle required quite some information to troubleshoot a hang, e.g.:
• System state dumps
• For RAC: global system state dumps
– Customers usually had to reproduce "the" hang with additional events to analyze it
23. Introduction to Hang Manager – How it works
• Always on, as enabled by default
• Reliably detects database hangs
• Autonomically resolves hangs
• Considers QoS policies for hang resolution
• Logs all detected hangs and their resolutions
(Flow: Session → DIAG0: DETECT → ANALYZE (hung?) → EVALUATE → VERIFY → Victim, taking the QoS policy into account.)
24. Hang Manager Optimizations with Oracle RAC 12c – Tuning under the hood
• Hang Manager auto-tunes itself by periodically collecting instance- and cluster-wide hang statistics
• Metrics like cluster health and instance health are tracked over a moving average
• This moving average is considered during resolution
• Holders waiting on SQL*Net break/reset are fast-tracked
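Tracking a health metric "over a moving average", as described above, can be sketched minimally. The window size, metric, and class are illustrative assumptions, not Hang Manager internals.

```python
# Minimal moving-average tracker: only the most recent `window` samples
# contribute, so the value adapts as conditions change.
from collections import deque

class MovingAverage:
    def __init__(self, window=12):
        self.samples = deque(maxlen=window)  # old samples fall off automatically

    def add(self, value):
        self.samples.append(value)
        return self.value

    @property
    def value(self):
        return sum(self.samples) / len(self.samples)

health = MovingAverage(window=3)
for sample in (90, 80, 70, 60):
    health.add(sample)
print(health.value)  # 70.0 – only the last 3 samples (80, 70, 60) count
```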
25. DBMS_HANG_MANAGER.Sensitivity – A new SQL interface to set Hang Manager sensitivity
• Early warnings are exposed via a V$ view
• Sensitivity can be set higher if the default level is too conservative
• Hang Manager considers QoS policies and data during the validation process

Hang Sensitivity Level | Description | Note
NORMAL | Hang Manager uses its default internal operating parameters to try to meet typical requirements for any environment. | Default
HIGH   | Hang Manager is more alert to sessions waiting in a chain than at the NORMAL level. |
27. Oracle Autonomous Health Framework (AHF) – Working for You Continuously
• Integrates next-generation tools running as components, 24/7
• Discovers potential issues and notifies, or takes corrective actions
• Speeds up issue diagnosis and recovery
• Preserves database and server availability and performance
• Autonomously monitors and manages resources to maintain SLAs
28. AHF – Availability by Platform

Component                           | Linux x86-64       | zLinux         | Solaris (SPARC)  | HP-UX Itanium   | IBM AIX          | Windows x86-64
Cluster Verification Utility (CVU)  | ✔                  | ✔ (March 2015) | ✔                | ✔ (August 2015) | ✔ (August 2015)  | ✔ (August 2015)
ORAchk                              | ✔                  | ✔              | ✔                | ✔               | ✔                | ✔
Cluster Health Monitor (CHM)        | ✔                  | ✗ not planned  | ✔                | ✗ not planned   | ✔                | ✔
Cluster Health Advisor (CHA)        | ✔ (since 12.2.0.1) | ✗ not planned  | ✗ future release | ✗ not planned   | ✗ future release | ✗ not planned
Trace File Analyzer (TFA)           | ✔                  | ✔              | ✔                | ✔ (no TFA web)  | ✔                | ✔ (no TFA web)
Hang Manager                        | ✔                  | ✔              | ✔                | ✔               | ✔                | ✔
Memory Guard                        | ✔                  | ✗ not planned  | ✔                | ✗ not planned   | ✔                | ✔
Quality of Service Management (QoS) | ✔                  | ✗ not planned  | ✔                | ✗ not planned   | ✔                | ✔
29. Cluster Health Monitor (CHM) – Generates a Diagnostic Metrics View of Cluster and Databases
• Always on – enabled by default
• Provides detailed OS resource metrics
• Assists node eviction analysis
• Locally logs all process data
• Users can define pinned processes
• Listens to CSS and GIPC events
• Categorizes processes by type
• Supports plug-in collectors (e.g. traceroute, netstat, ping, etc.)
• New CSV output for ease of analysis
(Architecture: osysmond on each node sends OS data to ologgerd (master), which stores it in the 12c Grid Infrastructure Management Repository (GIMR).)
30. Introducing Oracle 12c Cluster Health Advisor (CHA) – Proactive Health Prognostics System
• Real-time monitoring of Oracle RAC database systems and their hosts
• Early detection of impending as well as ongoing system faults
• Diagnoses and identifies the most likely root causes
• Provides corrective actions for targeted triage
• Generates alerts and notifications for rapid recovery
Full presentation: http://www.oracle.com/technetwork/database/options/clustering/ahf/learnmore/oracle-12cr2-cha-3623186.pdf
Recorded web seminar: https://www.youtube.com/watch?v=TbdkGsmSgcQ
31. Cluster Health Advisor (CHA) Architecture Overview
• cha – cluster node resource
• Single Java ochad daemon per node
• Reads Cluster Health Monitor data directly from memory
• Reads DB ASH data from SMR without a DB connection
• Uses OS and DB models and data to perform prognostics
• Stores analysis and evidence in the GI Management Repository (GIMR)
• Sends alerts to EMCC Incident Manager per target
(Architecture: ochad feeds OS data (via CHM) and DB data into the Node Health and Database Health Prognostics Engines, which apply the OS and DB models; results go to the GIMR, alerts to EMCC.)
32. Cluster Health Advisor – Scope of Problem Detection
Best-effort immediate guided diagnosis:
• Over 30 node and database problems have been modeled
• Over 150 OS and DB metric predictors identified
• Problem detection in 12.2.0.1 includes:
– Interconnect, Global Cache and cluster problems
– Host CPU and memory, PGA memory stress
– IO and storage performance issues
– Reconfiguration and recovery issues
– Workload and session abnormal variations
33. Cluster Health Advisor – Data Sources and Data Points
• A CHA Data Point contains 150 signals (statistics and events) from multiple sources: OS, ASM, network, and DB (ASH, AWR session, system and PDB statistics).
• Statistics are collected at a 1-second internal sampling rate, synchronized, smoothed and aggregated to a Data Point every 5 seconds.

Example Data Point (excerpt):
Time 15:16:00 | CPU 0.90 | ASM IOPS 4100 | Network % util 13% | Network packets dropped 0 | Log file sync 2 ms | Log file parallel write 600 us | GC CR request 0 | GC current request 0 | GC current block 2-way 300 us | GC current block busy 1.5 ms | Enq: CF - contention 0 | …
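The sampling pipeline above (per-second samples aggregated into one Data Point every 5 seconds) can be sketched simply. Using a plain mean as the "smoothing" step is an illustrative assumption; CHA's actual synchronization and smoothing are not documented here.

```python
# Illustrative aggregation: collapse 1-second samples into per-5-second
# Data Point values by averaging each complete window.

def to_data_points(samples, period=5):
    """Aggregate a list of 1-second samples into per-period means;
    a trailing incomplete window is dropped."""
    points = []
    for i in range(0, len(samples) - period + 1, period):
        window = samples[i:i + period]
        points.append(sum(window) / period)
    return points

cpu_per_second = [0.88, 0.92, 0.90, 0.89, 0.91, 0.40, 0.42, 0.41, 0.40, 0.42]
print([round(p, 2) for p in to_data_points(cpu_per_second)])  # [0.9, 0.41]
```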
34. Models Capture the Dynamic Behavior of all Normal Operation
• The release ships with conservative models to minimize false warnings.
• A model captures the normal load phases and their statistics over time, and thus the characteristics for all load intensities and profiles. During monitoring, any data point similar to one of the vectors is NORMAL.
• One could say that the model REMEMBERS the normal operational dynamics over time.

In-Memory Reference Matrix (part of the "normality" model):
IOPS                    | #### | 2500  | 4900  | 800   | ####
User Commits            | #### | 10000 | 21000 | 4400  | ####
Log File Parallel Write | #### | 2350  | 4100  | 22050 | ####
Log File Sync           | #### | 5100  | 9025  | 4024  | ####
…

(Chart: IOPS, user commits (/sec), log file parallel write (usec) and log file sync (usec) plotted over a day, 10:00 through 6:00, showing the same vectors as distinct load phases.)
35. CHA Model: Find Similarity with Normal Values
CHA estimator/predictor: "Based on my normality model, the value of IOPS should be in the vicinity of ~4900, but it is reported as 10500; this is causing a residual of ~5600 in magnitude."
CHA fault detector: "Such a high magnitude of residuals should be tracked carefully! I'll keep an eye on the incoming sequence of this signal (IOPS) and if it remains deviant I'll generate a fault on it."

In-Memory Reference Matrix (part of the "normality" model):
IOPS                    | #### | 2500  | 4900  | 800   | ####
User Commits            | #### | 10000 | 21000 | 4400  | ####
Log File Parallel Write | #### | 2350  | 4100  | 22050 | ####
Log File Sync           | #### | 5100  | 9025  | 4024  | ####

Observed values (part of a Data Point): IOPS 10500 | User Commits 20000 | Log File Parallel Write 4050 | Log File Sync 10250
Residual values (Observed − Predicted): 5600 | −1000 | −50 | 325
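The estimator/fault-detector dialogue above can be made concrete with a toy sketch: predict each signal from the nearest reference vector in the normality model, compute residuals, and only flag a fault if a signal stays deviant. The nearest-vector rule, threshold, and persistence count are illustrative assumptions, not CHA's actual statistics.

```python
# Toy normality model: two signals per reference vector, taken from the
# reference matrix on the slide.
REFERENCE_VECTORS = [
    {"iops": 2500, "user_commits": 10000},
    {"iops": 4900, "user_commits": 21000},
    {"iops": 800,  "user_commits": 4400},
]

def residuals(observed):
    """Residual = observed - predicted, where the prediction is the
    nearest reference vector (squared-distance match)."""
    nearest = min(REFERENCE_VECTORS,
                  key=lambda v: sum((observed[k] - v[k]) ** 2 for k in v))
    return {k: observed[k] - nearest[k] for k in nearest}

def fault_detector(residual_series, threshold, persistence=3):
    """Generate a fault only if the residual stays above the threshold
    for `persistence` consecutive data points."""
    deviant = 0
    for r in residual_series:
        deviant = deviant + 1 if abs(r) > threshold else 0
        if deviant >= persistence:
            return True
    return False

r = residuals({"iops": 10500, "user_commits": 20000})
print(r)  # {'iops': 5600, 'user_commits': -1000} – matches the slide's example
print(fault_detector([5600, 5400, 5900], threshold=2000))  # True
```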
36. Cluster Health Advisor (CHA) Operation Overview
• SRVCTL lifecycle daemon management
• Enabled by default – activates when the 1st RAC instance starts
• New CHACTL command line tool for all local operations
• Java GUI tool available on OTN soon
• Integrated into EMCC Incident Manager and notifications
• Monitoring has no impact on DB performance or availability
(Architecture: the CHACTL client and CHA Java GUI client operate local to the cluster against CHADDriver; SRVCTL manages the daemon; OS data (via CHM) and DB data feed the Node Health and Database Health Prognostics Engines with their OS and DB models; results are stored in the GIMR and surfaced in EM Cloud Control.)
37. CHA Command Line Operations – Checking for Health Issues and Corrective Actions with CHACTL QUERY DIAGNOSIS
$ chactl query diagnosis -db oltpacdb -start 2016-10-28 01:52:50 -end 2016-10-28 03:19:15
2016-10-28 01:47:10.0 Database oltpacdb DB Control File IO Performance (oltpacdb_1) [detected]
2016-10-28 01:47:10.0 Database oltpacdb DB Control File IO Performance (oltpacdb_2) [detected]
2016-10-28 02:59:35.0 Database oltpacdb DB Log File Switch (oltpacdb_1) [detected]
2016-10-28 02:59:45.0 Database oltpacdb DB Log File Switch (oltpacdb_2) [detected]
Problem: DB Control File IO Performance
Description: CHA has detected that reads or writes to the control files are slower than expected.
Cause: The Cluster Health Advisor (CHA) detected that reads or writes to the control files were
slow because of an increase in disk IO.
The slow control file reads and writes may have an impact on checkpoint and Log Writer (LGWR) performance.
Action: Separate the control files from other database files and move them to faster disks or Solid
State Devices.
Problem: DB Log File Switch
Description: CHA detected that database sessions are waiting longer than expected
for log switch completions.
Cause: The Cluster Health Advisor (CHA) detected high contention during log switches
because the redo log files were small and the redo logs switched frequently.
Action: Increase the size of the redo logs.
38. Cluster Health Advisor – Command Line Operations
HTML diagnostic health output is available (-html file_name).
39. Using EMCC for Alerts and Corrective Actions
40. Using the CHA GUI to Perform Root-Cause Analysis – Overview
• Standalone Java GUI client
• Must be run on a local cluster node
• Can be run against the live GIMR or an MDB (dump) file:
chactl export repository -format mdb -start '2017-05-01 00:00:00' -end '2017-05-10 00:00:00'
• Used internally for development
• Will be available and maintained on Oracle Technology Network soon.
41. Calibrating CHA to your RAC Deployment – Overview
• Calibration goal: increase sensitivity and accuracy with sufficient warning
• The release ships with conservative models to minimize false warnings
– DEFAULT_CLUSTER for each cluster node
– DEFAULT_DB for each database instance
• Use your own data for periods of "normal operations" to increase sensitivity
– Recommended minimum: a 6-hour period
– Should include all normal workload phases for that model
• Models may be changed dynamically online using CHACTL
42. Calibrating CHA to your RAC deployment – Choosing a Data Set for Calibration: Defining "normal"
$ chactl query calibration -cluster -timeranges 'start=2016-10-28 07:00:00,end=2016-10-28 13:00:00'
Cluster name : mycluster
Start time : 2016-10-28 07:00:00
End time : 2016-10-28 13:00:00
Total Samples : 11524
Percentage of filtered data : 100%
1) Disk read (ASM) (Mbyte/sec)
MEAN MEDIAN STDDEV MIN MAX
0.11 0.00 2.62 0.00 114.66
25 50 75 100 =100
99.87% 0.08% 0.00% 0.02% 0.03%
2) Disk write (ASM) (Mbyte/sec)
MEAN MEDIAN STDDEV MIN MAX
0.01 0.00 0.15 0.00 6.77
50 100 150 200 =200
100.00% 0.00% 0.00% 0.00% 0.00%
3) Disk throughput (ASM) (IO/sec)
MEAN MEDIAN STDDEV MIN MAX
2.20 0.00 31.17 0.00 1100.00
5000 10000 15000 20000 =20000
100.00% 0.00% 0.00% 0.00% 0.00%
4) CPU utilization (total) (%)
MEAN MEDIAN STDDEV MIN MAX
9.62 9.30 7.95 1.80 77.90
20 40 60 80 =80
92.67% 6.17% 1.11% 0.05% 0.00%
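The per-signal summary that the calibration query prints (mean/median/stddev/min/max plus the share of samples per bucket) can be sketched as follows. The bucket edges follow the CPU row above; the implementation itself is illustrative, not what chactl runs.

```python
# Illustrative calibration summary: descriptive statistics plus the
# percentage of samples falling into each bucket defined by `edges`.
import statistics

def summarize(samples, edges=(20, 40, 60, 80)):
    buckets = [0] * (len(edges) + 1)
    for s in samples:
        i = sum(s >= e for e in edges)   # index of the bucket s falls in
        buckets[i] += 1
    share = [round(100 * b / len(samples), 2) for b in buckets]
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "stddev": statistics.pstdev(samples),
        "min": min(samples),
        "max": max(samples),
        "bucket_pct": share,   # <20, <40, <60, <80, >=80
    }

cpu = [5, 10, 15, 25, 45, 65, 85]
print(summarize(cpu)["bucket_pct"])  # [42.86, 14.29, 14.29, 14.29, 14.29]
```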
43. Calibrating CHA to your RAC deployment – Creating a new CHA Model with CHACTL
• Create and store the new model:
$ chactl calibrate cluster -model daytime -timeranges 'start=2016-10-28 07:00:00,end=2016-10-28 13:00:00'
• Begin using the new model:
$ chactl monitor cluster -model daytime
• Confirm the new model is being used:
$ chactl status -verbose
monitoring nodes svr01, svr02 using model daytime
monitoring database qoltpacdb, instances oltpacdb_1, oltpacdb_2 using model DEFAULT_DB
45. Availability for applications – Application Continuity
• Continuous Availability
• Availability during Planned Maintenance
47. Oracle Real Application Clusters 12c Release 2 – Continuous Service Availability
Real Application Service Levels – "Always Running":
• Scales PDBs and Services
• 2-second detection on EXA
• Recovery in low seconds
• Drains work gradually
• Recovers in-flight work with AC
48. Oracle Active Data Guard 12c Release 2 – Continuous Service Availability
• Recover in-flight work with Application Continuity
• ADG sessions survive a standby role change
• Drain, then switch over; AC recovers stragglers
(Diagram: Data Guard Observer drives failover between the RAC Primary (Site A) and RAC Standby (Site B); "Switchover to db_resource_name [wait]".)
49. Application Continuity – In-flight work continues
• Replays in-flight work on recoverable errors
• Masks hardware, software, network, storage errors and timeouts
• 12.1: JDBC-Thin, UCP, WebLogic Server, 3rd-party Java application servers
• 12.2: OCI, ODP.NET unmanaged, JDBC Thin on XA, Tuxedo, SQL*Plus
• RAC, RAC One, Active Data Guard
50. Under the Covers
1 – Normal Operation
• The client marks database requests
• The server decides which calls can and cannot be replayed
• As directed, the client holds the original calls, their inputs, and validation data
2 – Outage, Phase 1: Reconnect
• Checks that replay is enabled
• Verifies timeliness
• Creates a new connection
• Checks the target database is valid for replay
• Uses Transaction Guard to guarantee the last outcome
3 – Outage, Phase 2: Replay
• Replays the captured calls
• Ensures results returned to the application match the original
• On success, returns control to the application
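The three phases above can be sketched as a highly simplified capture/reconnect/replay loop. All names here are illustrative assumptions; the real protocol lives in the Oracle drivers and Transaction Guard, not in application code.

```python
# Toy model of the AC phases: capture calls during normal operation,
# reconnect on a recoverable error, replay and validate the results.

class ReplayableSession:
    def __init__(self, connect):
        self.connect = connect          # factory for new connections
        self.conn = connect()
        self.captured = []              # calls + validation data held for replay

    def execute(self, call):
        # Phase "normal operation": run the call and remember it.
        result = self.conn.run(call)
        self.captured.append((call, result))
        return result

    def recover(self):
        # Phase 1: reconnect to a surviving instance.
        self.conn = self.connect()
        # Phase 2: replay; results must match the originals, otherwise
        # the outage is surfaced to the application.
        for call, original in self.captured:
            if self.conn.run(call) != original:
                raise RuntimeError("replay diverged; surface the outage")

class _Conn:
    """Stand-in connection whose results are deterministic."""
    def run(self, call):
        return call.upper()

sess = ReplayableSession(lambda: _Conn())
print(sess.execute("select sysdate from dual"))  # SELECT SYSDATE FROM DUAL
sess.recover()  # reconnects and replays the captured call successfully
```

The divergence check mirrors the slide's "ensures results returned to the app match the original"; a non-deterministic call (the mutable-function problem addressed later in the deck) would make replay diverge.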
51. Steps to use Application Continuity

Check | What to do
Identify Requests | Return connections to the pool – UCP, WebLogic Active GridLink, 3rd-party containers using UCP, OCI Session Pool, ODP.NET Unmanaged, Tuxedo
JDBC Deprecated Classes | Replace non-standard classes (MOS 1364193.1); use the AC checks in orachk to find them
Side Effects | Use disable or another connection if a request should not be replayed
Callbacks | With UCP and WLS labels, do nothing; in 12.2, set FAILOVER_RESTORE=LEVEL1; otherwise register a callback for applications that change state outside requests
Mutable Functions | Grant keeping mutable values, e.g. sequence.nextval
52. Run the AC Assessments (available in ORAchk)
• How effective is Application Continuity for your application?
• Where Application Continuity is not in effect, what steps need to be taken?
No steps required:
1. Analyze and report coverage
2. Report usage of deprecated Java classes
(Assessment tool flow: application traces → orachk → output read by the user.)
https://blogs.oracle.com/WebLogicServer/entry/using_orachk_for_coverage_analysis
53. Grant Mutables – Keep original function results at replay
For owned sequences:
ALTER SEQUENCE.. [sequence] [KEEP|NOKEEP];
CREATE SEQUENCE.. [sequence] [KEEP|NOKEEP];
Grant and revoke for other users:
GRANT [KEEP DATE TIME | KEEP SYSGUID] [to USER];
REVOKE [KEEP DATE TIME | KEEP SYSGUID] [from USER];
GRANT KEEP SEQUENCE on [sequence] [to USER];
REVOKE KEEP SEQUENCE on [sequence] [from USER];
54. Don't Want to Replay – Disable replay for requests that should not be replayed
• Decide if any requests should not be replayed, e.g. autonomous transactions, UTL_HTTP, UTL_URL, UTL_FILE, UTL_FILE_TRANSFER, UTL_SMTP, UTL_TCP, UTL_MAIL, DBMS_JAVA callouts, EXTPROC
• Use another connection or the disable API
55. Configuration
Set service attributes:
• FAILOVER_TYPE = TRANSACTION for Application Continuity
• FAILOVER_RESTORE = LEVEL1 for common states restored at failover
• AQ_HA_NOTIFICATIONS = TRUE for FAN with the OCI driver, ODP.NET, Tuxedo, SQL*Plus
For Java: use a replay data source (local or XA)
replay datasource = oracle.jdbc.replay.OracleDataSourceImpl
For OCI, ODP.NET, Tuxedo, SQL*Plus: on when enabled on the service
56. Killing Sessions – Extended DBA Commands

Command | Replays
alter system kill session … noreplay | BEST METHOD
dbms_service.disconnect_session([service], dbms_service.noreplay) | BEST METHOD
srvctl stop service -db orcl -instance orcl2 -force | YES
srvctl stop service -db orcl -node rws3 -force | YES
srvctl stop service -db orcl -instance orcl2 -noreplay -force |
srvctl stop service -db orcl -node rws3 -noreplay -force |
alter system kill session … immediate | YES
58. What is the best way to apply maintenance?
1 – Update in Place:
• Complex build process repeated for each node
• Error prone
• Longest downtime and maintenance window
• Have to create a backup (no built-in fallback plan)
• How do you enforce standardization?
2 – Clone, Update and Switch:
• Complex build process repeated for each node
• Error prone
• Shorter downtime and maintenance window
• Built-in fallback
• How do you enforce standardization?
3 – Deploy Gold Image, Switch:
• Build the gold image once, use it everywhere
• Fewest steps, simplest process
• Shortest downtime and maintenance window
• Built-in fallback
• Built-in standardization
59. What is the best approach to handling software drift?
Scan:
• Drift not seen until a scan takes place
• Scanning unchanged targets is unnecessary work
• Does not prevent drift
Trigger Alert:
• No time lag between drift and alert
• No extra work
• Does not prevent drift
Prevent:
• Locked configs cannot drift
• Can trigger an alert if unauthorized changes are attempted
• Can trigger an alert if authorized changes are made
60. Streamline the Distribution Process
• Ship only once – to a customer, to a site, to a pool
• Ship to interested parties only – subscribers
• Ship only what is necessary – updated modules, updated files, updated blocks
• Deploy non-disruptively – ship any time, choose when to use it
61. Rapid Home Provisioning and Maintenance
• Simple
• Prevents errors, enables easy corrections
• Uses Gold Images for all scenarios
• Enables mass operations on 1000s of nodes
62. Build an Inventory of Gold Images – Create once on the RHP Server
• Uptake the current estate by promoting existing homes to gold images
• Create new homes and promote them to gold images after validation
• Assign states to images for lifecycle management
• Oracle internal users: import images from GIaaS
(Diagram: installed homes – DB 11.2.0.4.1, DB 12.1.0.2 Custom, Grid 11.2.0.4.3, WLS 12.2.1 – promoted into the RHP Server image inventory.)
63. Supported targets and environments – Manage existing and create new Pools, Homes, and Databases
• Patch and upgrade existing deployments
– No prerequisites (config, agent, daemon…) for targets
– Database and Grid Infrastructure 11.2.0.3, 11.2.0.4, 12.1.0.2, 12.2.0.1
• Provision, scale, patch and upgrade new clusters and databases
– 11.2.0.4, 12.1.0.2, 12.2.0.1
• Bare metal, VMs, CDBs, non-CDBs
• SI (standalone, Restart, Grid Infrastructure), RAC One, RAC
• Linux, Solaris, AIX
• Generic software homes
64. Easy to create the Server, start managing the current estate
• The RHP Server is fully self-contained
– Commodity hardware or engineered systems; can be clustered for HA
– Enabled with a single srvctl command
– Lightweight – can co-exist with other functions
• No new software needed on targets
• No run-time dependency between Server and targets