This document summarizes a presentation on configuring Oracle Enterprise Manager Cloud Control 12c for high availability using Oracle Clusterware. It provides an overview of the OEM 12c architecture and the different levels of high availability, then focuses on a Level 2 active/passive configuration in which the OMS binaries are installed on shared storage and failover between nodes is enabled using a virtual IP address. The steps shown include Oracle Clusterware setup, OEM installation, configuration of the management repository, and adding the OMS as a Clusterware resource for automated failover.
2. About Me
• Oracle DBA for 10+ years
• Oracle ACE
• Oracle Certified Expert: RAC and Grid
Infrastructure 11gR2
• Co-Author Expert Oracle Enterprise Manager
Cloud Control 12c – Apress 2013
3. See Me Speak at COLLABORATE 14 – IOUG Forum
■ 5 days of training with more than 5,500 expert Oracle users
■ User-driven training in:
▪ Big Data
▪ BI
▪ Cloud Computing
▪ Database Performance
▪ Database Development
▪ Engineered Systems
▪ High Availability
▪ OEM
▪ Security
■ Learn more at collaborate14.ioug.org
8. OEM High Availability Levels
Level 1: OMS and repository on separate hosts. No redundancy. Load balancer required: No. Cost: $
Level 2: OMS installed on shared storage with a VIP used in active/passive failover. Repository database using local Data Guard. Load balancer required: No. Cost: $$
Level 3: Multiple OMSs in active/active configuration. Repository using a RAC database with local Data Guard. Load balancer required: Yes, at primary site. Cost: $$$
Level 4: Primary OMSs in active/active configuration with a RAC repository database. Standby OMSs at the DR site in active/active configuration, with a standby RAC database at the DR site. Load balancer required: Yes, at primary and standby sites. Cost: $$$$
9. Level 2 – Active/Passive HA
• A minimum of two servers required
• OMS binaries installed on a shared filesystem
• NFS/OCFS2/DBFS/ACFS (for NFS, set the mount options correctly; see the example below)
• OMS can run on only one node in the cluster at any given time
• Data Guard for the Management Repository
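• Per the speaker notes, if NFS is used as the shared storage, set the correct mount options (rsize and wsize in particular) in /etc/fstab (/etc/filesystems on AIX) to prevent I/O issues. A sample Linux entry, assuming a filer named filer1 exporting /vol1/oms_share:
filer1:/vol1/oms_share /u01/app/oms_share nfs rw,bg,rsize=32768,wsize=32768,hard,nointr,tcp,noac,vers=3,timeo=600 0 0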
11. Level 2 – Active/Passive OMS
1. Set up the virtual hostname and IP address (VIP)
• Clusterware VIP
• Virtual hostname should resolve to a unique IP
2. Install the OMS on shared disk
3. Create a Clusterware resource for the OMS
4. Failover
12. Oracle Clusterware Setup
• Clusterware can be used to create/manage the VIP
• 11.2+ uses appvipcfg to create the VIP
<GRID_HOME>/bin/appvipcfg create -network=1
-ip=192.168.1.10
-vipname=omsvip
-user=root
• The VIP can be created on a non-default network (see the sketch below)
• In Oracle 12c Flex Clusters, application VIPs can be created on leaf nodes
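• Per the speaker notes, a non-default network should first be created with the srvctl add network command. A minimal sketch in the 12c syntax; the network number, subnet, netmask, and interface name are assumptions:
<GRID_HOME>/bin/srvctl add network -netnum 2 -subnet 192.168.2.0/255.255.255.0/eth1
The VIP would then be created against that network by passing -network=2 to appvipcfg.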
• Allow the Oracle Grid Infrastructure software owner (e.g. grid) to run the script that starts the VIP:
<GRID_HOME>/bin/crsctl setperm resource omsvip -u user:grid:r-x
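• Optionally, read the ACL back to confirm the grant; this verification step is an addition to the original slide:
<GRID_HOME>/bin/crsctl getperm resource omsvip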
13. Oracle Clusterware Setup
• Start the VIP as the GI owner, e.g. grid
<GRID_HOME>/bin/crsctl start resource omsvip
• Check the status of the VIP
<GRID_HOME>/bin/crsctl status resource omsvip
The output should be similar to the following:
NAME=omsvip
TYPE=app.appvip_net1.type
TARGET=ONLINE
STATE=ONLINE on oms1
• View the full configuration with -f
<GRID_HOME>/bin/crsctl status resource omsvip -f
14. Oracle Clusterware Setup
• Check that the virtual hostname and VIP are resolvable (dig works as an alternative; see the sketch below)
nslookup <virtual hostname>
• Also do a reverse lookup of the IP address
nslookup <virtual IP address>
• Verify that the IP address returned by nslookup is running on the OMS host
ifconfig -a | grep <virtual IP address>
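• As the speaker notes mention, dig can be used instead of nslookup. A quick sketch, assuming the virtual hostname omsvip.example.com used in the later failover example:
dig +short omsvip.example.com
dig +short -x <virtual IP address>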
15. OEM Installation
• Create the ORACLE_HOME for the OMS on the shared storage, visible to all nodes in the cluster
mkdir -p /u01/app/oms_share
• Create the Oracle Inventory directory under the ORACLE_HOME for the OMS on all nodes
mkdir /u01/app/oms_share/oraInventory
• Create the inventory pointer file in the oraInventory directory (referenced by the installer; see the sketch below)
vi oraInst.loc
inventory_loc=/u01/app/oms_share/oraInventory
inst_group=oinstall
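• The installer can then be pointed at this shared inventory. A minimal sketch, assuming the EM media has been staged locally; -invPtrLoc is the standard OUI flag for a non-default inventory pointer:
./runInstaller -invPtrLoc /u01/app/oms_share/oraInventory/oraInst.loc
Per the speaker notes, the installation only needs to be completed once; since the location is shared, the binaries are accessible from any host that mounts the filesystem.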
17. OEM Failover
• To manually relocate the VIP to another host in the cluster, issue the following command:
$ crsctl relocate resource omsvip
CRS-2673: Attempting to stop 'omsvip' on 'oms1'
CRS-2677: Stop of 'omsvip' on 'oms1' succeeded
CRS-2672: Attempting to start 'omsvip' on 'oms2'
CRS-2676: Start of 'omsvip' on 'oms2' succeeded
• Check that the IP address associated with the VIP is running on the relocated host
ifconfig -a | grep <virtual IP address>
18. OEM Failover
• Establish the IP on the failover server (done through Clusterware)
• Start the listener (if part of the same failover group)
• Start the database (if required)
• Set the ORACLE_HOSTNAME environment variable to the virtual hostname. Continuing with our example, we use the command below:
export ORACLE_HOSTNAME=omsvip.example.com
• Start the OMS on the new node
$OMS_HOME/bin/emctl start oms
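• To confirm the OMS came up on the new node, check its status; this verification step is an addition to the original slide:
$OMS_HOME/bin/emctl status oms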
19. Add OEM Clusterware Resource
• OEM can be added as a Clusterware resource
• Administrator managed (static, 2 nodes)
• Policy managed (dynamic, > 2 nodes)
• Handled through the Agent Framework
• C/C++
• Create an action script (see the sketch below)
• Store it on shared storage
• Specify START/STOP/CHECK/ABORT routines
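A minimal action-script sketch follows. It is an illustration only: the OMS home path, the virtual hostname, and the "is Up" string grepped from emctl status output are assumptions, and any real deployment would need environment-specific tuning.
#!/bin/bash
# oms_action.sh - skeleton Clusterware action script for the OMS (sketch)
# Assumed paths/hostname; adjust for your environment.
OMS_HOME=/u01/app/oms_share/oms
export ORACLE_HOSTNAME=omsvip.example.com

case "$1" in
  start)
    $OMS_HOME/bin/emctl start oms
    exit $?
    ;;
  stop|clean|abort)
    $OMS_HOME/bin/emctl stop oms -all
    exit $?
    ;;
  check)
    # Assumed check: emctl reports "... is Up" when the OMS is running
    if $OMS_HOME/bin/emctl status oms | grep -q "is Up"; then
      exit 0
    fi
    exit 1
    ;;
esac
exit 0
The script is then registered as a cluster resource; the script path and check interval here are assumptions:
<GRID_HOME>/bin/crsctl add resource oms -type cluster_resource -attr "ACTION_SCRIPT=/u01/app/oms_share/oms_action.sh, CHECK_INTERVAL=30"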
38. Management Repository
• Use OEM to create the standby database!
• Can be configured using Data Guard with Fast-Start Failover (see the DGMGRL sketch below)
• Can also be configured with RAC/RAC One Node
• Storage becomes a SPOF
• Listeners should be reachable from all nodes in the cluster
• Database should be started before starting OEM
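• Fast-Start Failover is enabled through the Data Guard broker. A minimal DGMGRL sketch, assuming a broker configuration with a failover target is already in place and an observer host is available:
DGMGRL> CONNECT sys@emrep
DGMGRL> ENABLE FAST_START FAILOVER;
DGMGRL> START OBSERVER;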
39. Configure OEM with Data Guard Repo
Create database services for fast failover (see the connect-string sketch below):
• srvctl add service -d emrep -s oemsvc -l PRIMARY -q TRUE -e SESSION -m BASIC -w 10 -z 6
• srvctl add service -d emrep2 -s oemsvc -l PRIMARY -q TRUE -e SESSION -m BASIC -w 10 -z 6
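• The OMS connect string can then reference the role-based service so it follows the primary after a failover, per MOS note 1328768.1 cited in the references. A sketch; the SCAN hostnames and port are assumptions:
emctl config oms -store_repos_details -repos_user sysman -repos_conndesc "(DESCRIPTION=(FAILOVER=ON)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=primary-scan)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=standby-scan)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=oemsvc)))"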
43. Caveats
• Shared filesystem becomes a SPOF for the OMS
• Downtime is incurred during failover (minutes)
• May require additional licenses if Data Guard is used
45. References
• How to Configure Grid Control OMS in Active/Passive CFC Environments for Failover/HA (Doc ID 405642.1)
• How to Configure the OMS Connect String when Repository is in a Dataguard Setup (Doc ID 1328768.1)
• Oracle Clusterware Administration and Deployment Guide, 12c Release 1
• How To Configure Grid Control Components for High Availability (Doc ID 406014.1)
• Oracle Clusterware 11gR2 Whitepaper: http://www.oracle.com/technetwork/database/clusterware/overview/oracle-clusterware-11grel2-owp-1129843.pdf
Editor's Notes
Released in 10gR1 for RAC databases. Independent Clusterware infrastructure. Protects any kind of application. ASM + Clusterware = Grid Infrastructure.
Each component within the Enterprise Manager architecture should be made highly available to enable a complete high availability configuration. The main components to be considered are (see Figure 13-1): the Enterprise Manager Agent, which communicates with and sends metrics to the OMS; the Management Server, the heart of Enterprise Manager; the Repository, which stores persistent data from the monitored targets; and the Software Library, which stores files for patching, provisioning, and agent and plug-in deployment.
Different levels of high availability can be configured for each component, with varying levels of complexity and cost. When considering your high availability requirements, weigh the trade-offs in cost, complexity, performance, and potential data loss. Generally, the complexity and the level of high availability are proportional to each other.
Also called a cold-failover cluster (CFC). In order to reduce OMS downtime during a planned or unplanned outage, some redundancy should be introduced into the configuration. A Level 2 configuration uses a shared filesystem for the management service to achieve an active/passive, or cold failover cluster, solution. The filesystem is shared between two or more hosts and is only active on one host at a time. The shared filesystem for the OMS can be installed on a general-purpose cluster file system including NFS, Oracle Cluster File System (OCFS2), and Oracle Automatic Storage Management (ASM) Cluster File System (ACFS). If NFS is used as the shared storage, then ensure the correct mount options are set in /etc/fstab (/etc/filesystems on AIX) to prevent potential I/O issues. Specifically, the rsize and wsize should be set. The example below shows an entry in the /etc/fstab file on a Linux server where the NFS share is mounted from a filer named filer1 under the /vol1/oms_share directory: filer1:/vol1/oms_share /u01/app/oms_share nfs rw,bg,rsize=32768,wsize=32768,hard,nointr,tcp,noac,vers=3,timeo=600 0 0. Binaries for the OMS along with the inventory should be installed on the shared filesystem. Set up the virtual hostname and IP address (VIP) using Oracle Clusterware or third-party software and hardware. Failover is achieved by using the virtual hostname for the OMS along with a unique IP address which resolves to the hostname.
It's recommended to use the appvipcfg utility in Oracle Clusterware 11gR2 to create application VIPs. The VIP is created with a set of pre-defined settings suitable for an application VIP, such as the placement policy and the failback option. The default value of failback is 0, which means that the VIP and its dependent resources will not automatically fail back to the original node once it becomes available again. The VIP can be created on a non-default network (the default is ora.net1.network); a non-default network should be created with the srvctl add network command.
A VIP can be created in the same way as any other Clusterware resource. However, it is recommended to use the appvipcfg utility in Oracle Clusterware 11gR2 to create application VIPs. The VIP is created with a set of pre-defined settings suitable for an application VIP, such as the placement policy and the failback option. The default value of failback is 0, which means that the VIP and its dependent resources will not automatically fail back to the original node once it becomes available again. Use the -f option to view all attributes.
The virtual hostname is defined in DNS, and should resolve to the application VIP address created using the steps above. Check whether the virtual hostname and VIP are resolvable using nslookup or the dig command: $ nslookup omsvip. This should resolve to a unique IP address for the virtual hostname on every node in the cluster. Also do a reverse lookup of the IP address: $ nslookup <virtual IP address>. Verify that the IP address returned by nslookup is running on the OMS host: ifconfig -a | grep <virtual IP address>.
Install the OMS on the first host by following the installation steps as described in the Oracle Enterprise Manager Cloud Control 12c Basic Installation Guide. You only need to complete the installation once. Since the location is shared, the binaries will be accessible from another host that shares the filesystem.
ORACLE_HOSTNAME should also be set when starting the OMS on each node in the cluster. This should be the same as the virtual hostname defined in DNS for the VIP.
Once the OMS has been successfully installed and is up and running, if the host goes down the VIP will automatically be relocated to another node. The management service can then be manually started on whichever remaining node in the cluster the VIP is running on.
Oracle Clusterware can be configured to fully manage the OMS by creating start, check, stop, clean and abort routines that tell it how to operate on the OMS.
You must decide whether to use administrator or policy management for the application. Use administrator management for smaller, two-node configurations where your cluster configuration is not likely to change. Use policy management for more dynamic configurations where your cluster consists of more than two nodes. For example, if a resource only runs on node 1 and node 2 because only those nodes have the necessary files, then administrator management is probably more appropriate. An action script is a shell script (a batch script on Windows) that a generic script agent provided by Oracle Clusterware calls. An application-specific agent is usually a C or C++ program that calls Oracle Clusterware-provided APIs directly.
To add the OEM server as a resource that uses a named-server deployment, assume that you add the resource to a server pool that is, by definition, a sub-pool of the Generic server pool. You create server pools that are sub-pools of Generic using the crsctl add serverpool command (a sketch follows). These server pools define the Generic server pool as their parent in the server pool attribute PARENT_POOLS. In addition, they include a list of server names in the SERVER_NAMES attribute to specify the servers that should be assigned to the respective pool.
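A sketch of such a sub-pool; the pool name is an assumption, and oms1/oms2 are the node names from the earlier examples:
crsctl add serverpool oms_pool -attr "PARENT_POOLS=Generic, SERVER_NAMES=oms1 oms2"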
The generic_application resource type can model any application requiring high availability without specifying action scripts.
Install OEM using the virtual hostname/VIP and then create the OEM Clusterware resources.
Use the generic_application resource type in 12c to model any application; no action scripts are required with this resource type. Starting in Oracle Clusterware 12c you can add application resources to hub or leaf nodes.
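A sketch of registering the OMS with generic_application instead of an action script; the START_PROGRAM/STOP_PROGRAM paths are assumptions:
crsctl add resource oms -type generic_application -attr "START_PROGRAM='/u01/app/oms_share/oms/bin/emctl start oms', STOP_PROGRAM='/u01/app/oms_share/oms/bin/emctl stop oms -all'"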
Modify the script/start/stop timeout values, since the defaults may be too low for OEM.
Additional start/stop dependencies including ACFS/Filesystems, Listeners etc. can be added.
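A combined sketch of both adjustments; the timeout values and the dependency list on the omsvip resource are assumptions:
crsctl modify resource oms -attr "SCRIPT_TIMEOUT=120, START_TIMEOUT=600, STOP_TIMEOUT=600, START_DEPENDENCIES='hard(omsvip) pullup(omsvip)', STOP_DEPENDENCIES='hard(omsvip)'"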
Manage OEM tasks (start/stop) using Clusterware; using OEM to start and stop itself may cause inconsistencies in the Clusterware resource. Disable Upstart scripts (Linux) or services on Microsoft Windows.