This document provides an overview of Flex Clusters and Flex ASM in Oracle databases. It defines Flex Clusters as having both hub nodes, which have direct access to shared storage similar to standard clusters, and leaf nodes, which do not require direct storage access. It describes how Flex ASM allows Oracle ASM to run in a Flex Cluster environment across multiple ASM instances. It also provides instructions on converting existing clusters and ASM configurations to the Flex model.
3. Daniel Da Meda
•18 Years of experience using Oracle Technologies
•7 Years working as a Senior DBA in the UK providing services for a variety of
companies like the TimeWarner Group, Johnson&Johnson, Network
Rail and John Deere
•2 Years working as a Senior DBA in Angola for the Ministry of Finance
•Photography Enthusiast
http://thatoracledude.blogspot.com/
@ddameda
daniel.da.meda@hotmail.com
http://www.guoa.org/
9. A good source of information about Oracle RAC
Markus Michalewicz
@OracleRACpm
Director of Product Management, Oracle Real
Application Clusters
10. Clarifying
“The Oracle Database with the Oracle Real Application Clusters (RAC) option allows
running multiple database instances on different servers in the cluster against a
shared set of data files, also known as the database.”
“An Oracle RAC database requires Oracle Clusterware to be installed on the system
prior to installing an Oracle RAC enabled database home. Oracle Clusterware (OCW) is
an integral component of the Oracle Grid Infrastructure (GI) suite of products, which
includes Automatic Storage Management (ASM) and the Oracle Cloud File System
(CloudFS).”
“Oracle Clusterware is a technology transforming a server farm into a cluster. A cluster
is defined as a group of independent, but connected servers, cooperating as a single
system. Oracle Clusterware is the intelligence in this system that provides the
cooperation.”
11. Standard Cluster
Servers accessing the
storage
Components:
• Grid Infrastructure
• ASM
• Listener(s)
• Database(s)
12. Block, Row and Cache Fusion
• A process on Instance 2 wants to change another row of the same block that Instance 1 already holds in its buffer cache
• The current version of the block is needed
• The most current version of the block is not on disk, so we need to get it from another buffer cache
• The LMS processes ask the Global Cache Service (GCS) "Who's the master?" and the Global Resource Directory (GRD) answers: Instance 1
• The block is shipped over the interconnect (Cache Fusion) as an exclusive current copy, while a consistent-read copy is kept
• Once the block has been changed, let the GCS know the state and mode of that block
[Diagram: Instance 1 and Instance 2 each show a 4-row block (Row 1–Row 4) in their buffer caches; the disk copy is at SCN=10 while cached versions are at SCN=20 and SCN=30; Instance 2 issues "Update row2…"]
11/11/2014 Cache Fusion at Work - Daniel Da Meda 12
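The flow above can be sketched as a toy model. All names and data structures here are hypothetical simplifications; the real GCS/GRD protocol tracks resource modes, roles and past images and is far richer than this.

```python
# Toy sketch of the Cache Fusion flow described above (illustrative only).

class GlobalCacheService:
    """Tracks which instance holds the current copy of each block (toy GRD)."""
    def __init__(self):
        self.grd = {}  # block_id -> instance holding the current copy

    def request_current(self, block_id, requester, disk_scn):
        master = self.grd.get(block_id)
        if master is None:
            # No cached copy anywhere yet: the block is read from disk.
            self.grd[block_id] = requester
            return ("disk", disk_scn)
        # A newer copy lives in another buffer cache: ship it over the
        # interconnect (Cache Fusion) instead of forcing a disk write/read.
        block = master.buffer_cache[block_id]
        self.grd[block_id] = requester  # GCS records the new holder
        return ("interconnect", block)

class Instance:
    def __init__(self, name):
        self.name = name
        self.buffer_cache = {}

    def update_row(self, gcs, block_id, scn):
        source, _ = gcs.request_current(block_id, self, disk_scn=10)
        self.buffer_cache[block_id] = scn  # changed block, SCN bumped
        return source

gcs = GlobalCacheService()
inst1, inst2 = Instance("Instance 1"), Instance("Instance 2")
assert inst1.update_row(gcs, "block A", scn=20) == "disk"          # first touch: disk read
assert inst2.update_row(gcs, "block A", scn=30) == "interconnect"  # shipped from Instance 1's cache
```

The key point the sketch shows: once any instance caches the block, later requesters get it via the interconnect, never by forcing a write to disk first.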
14. Flex Cluster - Nodes
• Hub Nodes: similar to nodes on a Standard Cluster.
• Leaf Nodes: do not require direct access to the shared storage. A leaf node acts pretty much as an application server node. It cannot host a RAC database instance, although it can host a single-instance database on local storage.
Usage examples: HS Services Gateway, general applications that do not require direct access to the shared storage
17. So, What can I run on a Leaf Node?
- Golden Gate
- Java applications
- HS Gateway
18. Cluster Limits - Doc ID 220970.1
63 nodes using Oracle Clusterware (Oracle 9i or Oracle RAC 10g Release 1)
Since Oracle RAC 10g Release 2, 100 nodes are technically supported in one cluster. This includes running 100 database instances belonging to the same (production) database on this cluster, using the Oracle Database Enterprise Edition (EE) with the Oracle RAC option and Oracle Clusterware only (no third-party / vendor cluster solution underneath).
255 nodes using OCFS2
In 12c up to 2000 nodes!
20. Flex Cluster
• If the shared storage is presented to a Leaf Node,
it can be turned into a Hub Node
• A Standard Cluster can be converted to a Flex
Cluster at any time. Well, almost.
• Once converted to Flex, it is not possible to
convert it back to Standard
21. Leaf Nodes
$ crsctl query css votedisk
CRS-1668: operation is not allowed on a Leaf
node
$ ocrcheck
PROT-605: The 'ocrcheck' command is not
supported from a Leaf node.
24. Grid Naming Service - GNS
• If you also dislike it, you had better at least start sympathizing with it.
– We don't like it either. Please don't tell anyone!
• The use of GNS is required in Flex Clusters.
27. Conversion – Cluster Mode
• GNS – It is not possible to convert if you are not using GNS
• No DNS used
• Flex ASM must be configured; if not, first convert standard ASM to Flex ASM
• If you are upgrading to 12c RAC, you can’t upgrade a standard
cluster directly to a flex cluster. You will have to upgrade to 12c
standard cluster first and then convert the standard cluster to flex
cluster.
Check RACAttack’s Lab on Flex Cluster and Flex ASM Conversion!
28. Conversion – Cluster Mode
Check Cluster Mode with ASMCMD
[oracle@rac1 bin]$ asmcmd showclustermode
ASM cluster : Flex mode enabled
OR with CRSCTL
$ crsctl get cluster mode status
Cluster is running in "standard" mode
Converting
$ crsctl set cluster mode flex
$ crsctl stop crs
$ crsctl start crs -wait
* use the [-wait] option to display progress messages
* Remember, Flex Clusters can't be switched back to Standard Clusters
29. Conversion of Nodes Hub/Leaf
Check:
[grid@rac1 ~]$ crsctl get node role config
Node 'rac1' configured role is 'hub'
Convert:
crsctl set node role {hub | leaf}
crsctl stop crs
crsctl start crs -wait
30. ASM Filter Driver - ASMFD
• Acts as a replacement for ASM library driver (ASMLIB)
• Resides in the I/O path of the Oracle ASM disks
• ASM uses the filter driver to validate write I/O requests to
ASM disks
• Simplifies configuration and management of disk devices by
eliminating the need to rebind them with ASM following a
system restart
• Rejects any I/O requests that are invalid, eliminating
accidental overwrites to ASM disks
• It is optional: after installation of Grid Infrastructure you may configure ASMFD
• If ASMLIB is already configured, then you must explicitly
migrate it to ASMFD
31. Configure ASMFD
1. As the GI owner, update the ASM disk discovery string to enable ASMFD to discover devices
$ asmcmd dsget
parameter:
profile:
2. As the GI owner list the nodes and node roles in your cluster:
$ olsnodes -a
12crac1 Hub
12crac2 Hub
3. As root on each node, stop CRS
# crsctl stop crs
4. As root on each node, configure ASMFD
# asmcmd afd_configure
32. Configure ASMFD
5. As the GI owner, verify the status of ASMFD
$ asmcmd afd_state
ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'ENABLED' on
host '12crac1'
6. As root, start CRS on the node:
# crsctl start crs
7. As the GI owner, set the ASMFD discovery string to the original
value retrieved in step 1
$ asmcmd afd_dsset old_diskstring
33. Migrating ASM Disks to ASMFD
1. As the GI owner, list the existing groups
$ asmcmd lsdg
State Type Rebal Sector Block AU Total_MB Free_MB Name
MOUNTED EXTERN N 512 4096 1048576 10228 1262 DATA/
MOUNTED EXTERN N 512 4096 1048576 10228 9894 FRA/
2. List the associated disks:
$ asmcmd lsdsk -G DATA
/dev/oracleasm/disks/ASM1
/dev/oracleasm/disks/ASM2
3. Check if the ASM instance is active
$ srvctl status asm
ASM is running on 12crac1,12crac2
4. Stop the databases and dismount the diskgroup on all nodes:
$ srvctl stop diskgroup -diskgroup +DATA -f
34. Migrating ASM Disks to ASMFD
5. From a HUB node as the GI owner, Label all existing disks in the DG
$ asmcmd afd_label disk_path label --migrate
6. Scan the disks on all HUB nodes as the GI owner
$ asmcmd afd_scan
7. Start the databases and mount the DG on all nodes
$ srvctl start diskgroup -diskgroup +DATA
36. ASM Cardinality
• The number of ASM instances running in a given cluster is called the ASM cardinality, with a default value of 3. The current value can be checked with the following Clusterware command.
crsctl status resource ora.asm -f | grep CARDINALITY
CARDINALITY=3
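As a sketch of how the cardinality could be changed (the count values shown are illustrative), srvctl modify asm accepts a -count option:

```
$ srvctl modify asm -count ALL    # run an ASM instance on every hub node
$ srvctl modify asm -count 4      # or set a specific cardinality
```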
46. ASM Network Listener
[oracle@rac3 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
ONLINE ONLINE rac1 STABLE
ONLINE ONLINE rac2 STABLE
ONLINE ONLINE rac3 STABLE
51. ASM Client Relocation
Clients are automatically relocated to another instance if an Oracle ASM instance fails. If necessary, clients can be manually relocated using the ALTER SYSTEM RELOCATE CLIENT statement
SQL> ALTER SYSTEM RELOCATE CLIENT 'client-id';
The client-id is of the form instance_name:db_name
The INSTANCE_NAME and DB_NAME columns are
contained in the V$ASM_CLIENT view.
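As a hedged example (the instance and database names below are made up), the client-id can be built from V$ASM_CLIENT:

```
SQL> SELECT instance_name, db_name FROM V$ASM_CLIENT;

INSTANCE_NAME   DB_NAME
--------------- --------
orcl1           orcl

SQL> ALTER SYSTEM RELOCATE CLIENT 'orcl1:orcl';
```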
52. Flex ASM enhancements
•Larger LUN size support, up to 32PB
•Supports up to 511 Disk Groups
•Supports renaming ASM disks and Disk Groups
53. ASM Password File
In releases prior to Oracle 12c the password file, created by the orapwd utility, was placed in the $ORACLE_HOME/dbs directory by default and thus was local to the node and instance. This required manual synchronization of the password files. If the password file became out of sync between instances, it could cause inconsistent login behavior.
In Oracle 12c (for new installations), the default location of the password file is in ASM.
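As a sketch (the disk group and file names are illustrative), orapwd can create the password file directly in a disk group:

```
$ orapwd file='+DATA/orapwasm' asm=y                       # ASM password file in a disk group
$ orapwd file='+DATA/orcl/orapworcl' dbuniquename='orcl'   # database password file in ASM
```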
55. Converting to Flex ASM – Silent Mode
asmca -silent -convertToFlexASM -asmNetworks eth1/10.154.138.0 -asmListenerPort 1521
/u01/app/oracle/cfgtoollogs/asmca/scripts/converttoFlexASM.sh
* Cannot convert back to standard ASM cluster from Oracle Flex Cluster.
56. Flex ASM 12c (12.1) and Extended RAC: be careful with "unpreferred" reads!
http://bdrouvot.wordpress.com/2013/07/02/flex-asm-12c-12-1-and-extended-rac-be-careful-to-unpreferred-read/
57. ols.pl - A script to display GI/RAC cluster info (Doc ID 1568439.1)
The Oracle Clusterware command olsnodes displays node number/status/type/VIP, which is useful but not enough. To clearly display a node's functional role in 12c, including but not limited to CRSD PE Master/Standby, CTSS/UI/ONS Master as well as Leaf-to-Hub connections, ols.pl was written to present this information in a clear and handy way.
If we are going to talk about clusters, it is always good to have a good understanding about the two major components of an Oracle Cluster.
The Grid Infrastructure and RAC itself
This slide is a good way of demonstrating the subject and gauging the audience's knowledge of it.
It is a good slide to tell the difference between the Grid Infrastructure Clusterware and Oracle RAC, which is an option that allows…
HUB and LEAF topology
Tightly coupled and loosely coupled server membership
Implicit Server Pools
Highly Scalable
So, up to this point we had the Standard Cluster. And it is still supported. Actually, as we will find out, a Flex Cluster will have nodes that behave pretty much the same way as the nodes in a Standard Cluster. These are now referred to as the HUB nodes.
This is a vision of a normal cluster with its core components:
Detailing some Components of an Oracle Standard Cluster:
Grid Infrastructure: As we know, back in 10g, This used to be a single component – Oracle Clusterware. Oracle Clusterware, since its introduction, was installed on a separate home pretty much like today. However, ASM was not part of it and had to be installed with the Oracle Database Installer and you had the choice of sharing the Oracle Database home or placing the ASM software on a different home.
Oracle 11gR2 was a major milestone for Clusterware as it was combined with ASM sharing the same home and more tightly integrated. It was the birth of Oracle Grid Infrastructure. Over the years, Oracle Clusterware became a general-purpose clustering solution for all applications.
ASM: Oracle ASM is a volume manager and a file system for Oracle Databases files. It supports both single instance and RAC databases. It is the recommended storage management solution as it provides an alternative to conventional volume managers, file systems and raw devices. The performance is comparable with raw devices. It can coexist with other storage management options such as raw disks and third-party file systems. It also offers 2 and 3-way mirroring so that you are protected against disk failures. It does this by making sure extents are placed on different failure groups in a Disk Group
ACFS: ASM in its basic form only allows database files to be stored on it. However, ACFS – Oracle Automatic Storage Management Cluster File System – is built on existing disk groups and is a multi-platform file system that supports all customer files, including Oracle Database homes from 11.2 and later. It does not support placing the Grid Infrastructure home in ACFS. ACFS on Oracle Grid Infrastructure 12c supports placing the database files from a database 11.2.0.4 and above.
Listeners: Each node will have a listener that serves the databases running on that node. However, if you are using SCAN, there will also be SCAN Listeners listening on the SCAN VIPs that load balance and redirect incoming connections to a local listener. Local instances register with their local listener as well as with the SCAN listeners.
General Properties that made Oracle Clusters a successful technology
These simple properties among many others have made Oracle RAC a very successful clustering solution
Tightly Coupled Server Membership (nodes): Use of network & disk heart beat to monitor nodes. All nodes communicate with all nodes in the cluster. Essential to monitor and detect split-brain conditions.
Does not allow split-brain
Split Brain Syndrome explained: if nodes are not coordinating who does what with a block, say through Cache Fusion, one block could be fetched, read and updated in one instance while the other instances remain unaware of the lock on that block. In this situation, another instance could well get the same block from disk and update it, causing a major integrity issue.
GCS = Global Cache Service
GRD = Global Resource Directory
Refer to the Cache Fusion Slide which is the next slide
So, what mechanisms does Oracle use to avoid Split Brain Syndrome?
1) Have redundant interconnect network interfaces. HAIP was introduced in 11.2.0.2 to make it easier to present multiple NICs to the interconnect for both fault tolerance and load balancing without having to bond the NICs at the OS level. Uses Voting Disk to determine split survivor. Voting disks provide an additional communication path for the cluster nodes in case of problems with the interconnect. See HAIP above.
2) It uses the voting disks as a secondary protection. Once there is an interconnect failure, the surviving nodes realise it and make use of the voting disks to check which nodes managed to see which nodes. Refer to the Voting Disk slide. So, let's say that in a three-node configuration node 1 suffers an interconnect failure. Nodes 2 and 3 check the heart beats in a voting disk or file and realise that they can see each other, but both flag that it is not possible to reach node 1. Node 1's entries show that it could only flag itself. In this situation, a poison pill is left in the voting file so that node 1 can read it and, basically, kill itself.
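The three-node scenario above can be sketched as a toy vote. This is purely illustrative code under simplified assumptions; Oracle's actual membership and eviction algorithm is more sophisticated than a simple majority rule.

```python
# Toy sketch of the voting-file decision described above (illustrative only).

def surviving_partition(visibility):
    """visibility maps each node to the set of peers it can still reach
    (including itself) after an interconnect failure, as flagged in the
    voting file.  The largest fully-connected group survives; ties are
    broken by the lowest node number.  Everyone else reads the poison
    pill and evicts itself."""
    groups = {}
    for node, seen in visibility.items():
        groups.setdefault(frozenset(seen), set()).add(node)
    # A group is viable only if each member's view covers the whole group.
    viable = [g for view, g in groups.items() if g <= view]
    viable.sort(key=lambda g: (-len(g), min(g)))
    return viable[0]

# Node 1 loses the interconnect: nodes 2 and 3 still see each other.
visibility = {1: {1}, 2: {2, 3}, 3: {2, 3}}
assert surviving_partition(visibility) == {2, 3}  # node 1 must kill itself
```

With an even split (say two isolated nodes), the lowest-numbered node wins the tie in this sketch; in a real cluster the voting-disk majority plays that role.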
Any Subset of servers can form a cluster
No need for a minimum number of members:
Portable: No OS kernel changes required
Fault Tolerant:
Supports Rolling Upgrades
Oracle Clusterware 11g Standard Cluster Requirements
Shared Storage
Public and Private Networks
SCAN (Single Client Access Name) – 3 IPs which in turn will translate into 3 Vips and SCAN Listeners
Node VIPs – 1 per node. Local Listener listens on this as well as the public IP
DNS or Host resolution
Multicast Resolution
Root/sudo access
SSH or RSH
Pre-12c New Features
Some of the features that have been introduced in pre-12c versions that made a big impact and enabled the Oracle Clusterware to be used as a multipurpose clustering solution
Server Pools: You could partition the cluster controlling role separation (ACLs) and resource allocation (Policy Management) instead of the administrator based policies
Role Management
ACFS
Services: At the database tier. Abstracting connectivity
New in Grid Infrastructure 12c
In 12c many new features have been introduced and we are going to focus our attention to Flex Clusters and Flex ASM but there are many other new features that fall out of the scope for this specific presentation
Full Oracle Multitenant Support
Server Categories
What-if Analysis*
HA-NFS*
Flex ASM
Cluster Policy Sets
Application Agents
Generic Agent*
Oracle QoS Mgt Enhancements
Bare Metal + OVM Cluster
Flex Cluster*
OUI Enhancements
Cloud Control 12c Integration
IPv6 Support
Management Repository Database: primarily used to hold data related to cluster health monitoring, workload management, memory guard and QoS. For new installations, it defaults to Yes.
Utility Server
Multi-Cluster GNS
Multi-Subnet SCAN
OVM Template Enhancements
Grid Infrastructure Licensing
Changes specific to Clusterware and ACFS. Clusterware can be downloaded and used free of charge to protect any application in any environment. However, if support, patching or upgrades are required, you will need some licensed product in the stack in order to have a CSI for support.
As for ACFS, the same is true. The basic features are available free of charge. However, if you intend to use some of the more advanced features, like snapshots or tagging, it will be a priced product.
Hub Servers
Tightly coupled. Flex ASM is what is new to this.
Standard Cluster Properties as in pre-12c
Dedicated to Database Service
It can run Flex ASM instances
Leaf Servers
No database, no ASM, very low CPU footprint
Light-weight Server
No Shared Storage
Dedicated to Applications and Application servers. So, all of a sudden you bring two tiers into the same cluster!
No peer-to-peer communication. There is communication with a HUB node for failure notification
Supports Intra-Tier and Inter-Tier Dependencies: Management of cluster resource dependencies between the Database Tier and the Application Tier
Supports Bare Metal Servers and OVM Servers in common Clusters: Bare Metal for the Database tier and OVM for the leaf tier
This topology allows clusters to be scalable to thousands of nodes.
They talk to a HUB node. They do not talk to another Leaf Node
Hub Tier
We see that the RAC Pool, which is the HUB tier, is a four-node cluster running the Oracle Database.
We are only running three ASM instances. The fourth ORCL instance will have its extent map serviced remotely, via an ASM network described later on, by the peer nodes in the cluster that actually have ASM instances running. What is particularly interesting at this point is the ability of a 12c database instance to stay up even if its local ASM instance has failed.
This is possible because there is a new network on the block, and it is called the ASM network.
Leaf Tier
Processing over the network is essentially just for heart beating to a HUB node in the cluster.
There is no public network defined by default. There is a secondary network that is configured exclusively for use in the Leaf Tier. You can create new networks that are exclusive for the leaf tier and VIPs on top of that network so that you can install applications that can failover from Leaf Node to Leaf Node using these VIPs.
MGMTDB is a new instance, used to store Cluster Health Monitor (CHM) data. In 11g, this data was stored in a Berkeley DB database under $GRID_HOME/crf/db/hostname
During the installation of Oracle Grid Infrastructure 12.1.0.1 you had the option to choose YES/NO to install the Grid Infrastructure Management Repository (GIMR) database MGMTDB
With Oracle Grid Infrastructure 12.1.0.2 this choice has become obsolete and the above screen does not appear anymore. The GIMR database has become mandatory.
Note that Oracle supports using GNS without DHCP or zone delegation in Oracle 12c (as with Oracle Flex ASM server clusters, which you can configure without zone delegation or dynamic networks).
You can use GNS without DNS delegation in configurations where static addressing is being done, such as in Oracle Flex ASM or Oracle Flex Clusters. However, GNS requires a domain be delegated to it if addresses are assigned using DHCP.
Oracle does not support using GNS without DHCP or zone delegation on Windows.
This is the screen where you define the GNS while you are installing Grid Infrastructure
During installation, when specifying cluster node roles, a VIP is not required for LEAF nodes as it is for HUB nodes.
No DNS used: This means that you do not have to make changes once the initial delegation has been configured. This is for conversion
Nothing prevents you from converting a HUB node to a LEAF node
Rejects any Invalid I/O: it eliminates situations where either the DBA or the SA ends up overwriting an ASM disk by mistake
ASMFD is installed with an Oracle Grid Infrastructure installation. If you have an existing Oracle ASM library driver (ASMLIB) configuration, then depending on whether you want to use ASMLIB or ASMFD, consider the following scenarios:
If you use ASMLIB to manage your ASM devices and you want to continue to use ASMLIB, then upgrade to Oracle Grid Infrastructure 12c Release 1 (12.1.0.2). In this case, although ASMFD is installed, ASMLIB continues to be used for device persistence.
If you use ASMLIB to manage your ASM devices and you want to migrate to ASMFD, then perform the following steps:
Upgrade to Oracle Grid Infrastructure 12c Release 1 (12.1.0.2): This process installs ASMFD
Migrate from ASMLIB to ASMFD: the next slide will demonstrate how to remove ASMLIB and configure devices to use ASMFD for device persistence.
If ASMLIB is installed but you do not use ASM because Grid Infrastructure is not installed and you want to use ASMFD, follow these steps:
Deinstall ASMLIB
Install Oracle Grid Infrastructure 12c Release 1 (12.1.0.2)
Configure your ASM devices to use ASMFD for device persistence
Create disk labels to enable migration of ASM disk groups to ASMFD
The value for the old_diskstring is the current Oracle ASM disk discovery string value.
Note that you must be using an SPFILE for the ASM instance.
Note that the procedure for moving ASM DGs containing OCR or Voting Files is different:
1. Log in as the root user and list the disk groups with OCR and voting files by running the following commands on one node:
# $ORACLE_HOME/bin/ocrcheck -config
# $ORACLE_HOME/bin/crsctl query css votedisk
2. As the Oracle Grid Infrastructure owner list the disks associated with the disk groups:
$ $ORACLE_HOME/bin/asmcmd lsdsk -G disk_group
3. As root, stop the databases and Oracle Clusterware on all nodes:
# $ORACLE_HOME/bin/crsctl stop cluster -all
4. As the Oracle Grid Infrastructure owner label all existing disks in the disk group by running the following command for each disk on a Hub node:
$ $ORACLE_HOME/bin/asmcmd afd_label disk_path label
5. As the Oracle Grid Infrastructure owner rescan the disks on all Hub nodes by running the following command on all of the Hub nodes:
$ $ORACLE_HOME/bin/asmcmd afd_scan
6. As root, start the Oracle Clusterware stack on all nodes and mount the OCR and voting files disk groups and databases:
# $ORACLE_HOME/bin/crsctl start cluster -all
Flex ASM is enabled if you choose Flex Cluster.
ASM instance runs on Hub nodes only as Hub nodes have access to the shared storage
The advantage is that you can save on memory and cpu for large RAC configurations
By default a flex ASM configuration will run an ASM instance on every node for clusters with 3 or fewer nodes. For clusters with 4 or more nodes, only three of the nodes will have an ASM instance
– Limitations of standard ASM
• Each node has an ASM instance, which costs CPU/memory
• A local ASM instance failure will cause DB instance failures
– Flex ASM is an option in Oracle 12c: enabled or disabled.
• A small number of ASM instances (default 3, specified by the admin)
• DB instances connect to any ASM instance (local/remote)
A database instance can connect to a remote ASM instance in case the local ASM instance fails
ASM network
• Added for Flex ASM for communication between ASM clients and ASM
To facilitate remote ASM connections an ASM listener is required to run on the ASM network. So if you choose flex ASM you will have SCAN listeners, local listeners and ASM listeners.
For database instances to access ASM servers on different nodes:
• Flex ASM uses password file authentication
• ASM password is shared and stored in ASM Disk group