1. Database
BEST PRACTICES OF ORACLE 10G/11G CLUSTERWARE:
CONFIGURATION, ADMINISTRATION AND TROUBLESHOOTING
Kai Yu, Dell Oracle Solutions Engineering, Dell Inc.
This article aims to provide DBAs with a practical understanding of, and best practices for, Oracle Clusterware configuration that go beyond the traditional installation steps. It discusses the public network, private interconnect and shared storage configuration, and shares experiences and tips for troubleshooting clusterware issues such as node eviction, including how some useful diagnostic tools help with root cause analysis. The article also covers clusterware administration methods as well as new 11g clusterware features such as cloning clusterware and adding a node to an existing cluster.
ORACLE CLUSTERWARE ARCHITECTURE
The Oracle Real Application Clusters (RAC) architecture provides two major benefits to applications:
• High Availability: database connections are automatically failed over to another cluster node in the event of a database node failure or planned node maintenance.
• Scalability: the Oracle RAC architecture can be scaled out to meet the growth of application workloads by adding nodes to the cluster.
Oracle Clusterware serves as the foundation for an Oracle RAC database. It provides a set of additional processes running on each cluster server (node) that allow the cluster nodes to communicate with each other, so that the nodes work together as if they were one server serving applications and end users. On each cluster node, Oracle Clusterware is configured to manage all the Oracle processes known as cluster resources, such as the database, database instances, ASM instances, listeners, database services and virtual IP (VIP) services. Oracle Clusterware requires shared storage to store its two components: the voting disk for node membership and the Oracle Cluster Registry (OCR) for cluster configuration information. A private interconnect network is required among the cluster nodes to carry the network heartbeat. Oracle Clusterware consists of several process components which provide event monitoring, high availability features, process monitoring and group membership of the cluster. Figure 1 illustrates the architecture of Oracle Clusterware and how it is used in an Oracle RAC database environment. The following sections examine each of the components of the clusterware and their configuration.
HARDWARE CONFIGURATION FOR ORACLE CLUSTERWARE
As shown in figure 2, a typical cluster environment consists of one or more servers. In addition to the public network that applications use to access the servers, all the servers in the cluster are also connected by a second network called the interconnect network. The interconnect network is a private network accessible only to the servers in the cluster and is connected through a private network switch. This interconnect network carries the most important heartbeat communication among the servers in the cluster. A redundant private interconnect configuration is recommended for a production cluster database environment.
Figure 1 Oracle Clusterware and Oracle RAC Architecture
Figure 2 Hardware Configuration of Oracle Clusterware
The cluster environment also requires shared storage that is connected to all the cluster servers. The shared storage can be SAN (Storage Area Network) or NAS (Network Attached Storage). Figure 2 shows an example of a SAN storage configuration. To achieve HA and I/O load balancing, redundant switches are recommended. The connections among the servers, the SAN storage switches and the shared storage are in a butterfly shape as shown in figure 2. Depending on the type of SAN storage, different I/O cards are installed in the servers. For example, for Fibre Channel (FC) storage, HBA (Host Bus Adapter) cards are installed in the servers, and Fibre Channel switches and cables are used to connect the servers with the FC storage. For iSCSI storage, regular network cards, Gigabit Ethernet switches and regular network cables are used to connect the servers with the storage.
ORACLE CLUSTERWARE COMPONENTS PROCESS ARCHITECTURE
Oracle Clusterware stores its two configuration files on the shared storage: the voting disk and the Oracle Cluster Registry (OCR).
• The voting disk stores cluster membership information, such as which RAC instances are members of the cluster.
• The Oracle Cluster Registry (OCR) stores and manages information about the cluster resources managed by Oracle Clusterware, such as Oracle RAC databases, database instances, listeners, VIPs, services and applications. A multiplexed OCR is recommended to ensure high availability.
Oracle Clusterware consists of the following components that facilitate cluster operations. These components run as processes on Linux/Unix or as services on Windows:
• Cluster Ready Services (CRS) manages cluster resources such as databases, instances, services, listeners, virtual IP (VIP) addresses and applications. It reads the resource configuration information from the OCR. It also starts, stops and monitors these resources and generates events when their status changes.
Process on Linux:
root 9204 1 0 06:03 ? 00:00:00 /bin/sh /etc/init.d/init.crsd run
root 10058 9204 0 06:03 ? 00:01:23 /crs/product/11.1.0/crs/bin/crsd.bin reboot
• Cluster Synchronization Services (CSSD) manages node membership by checking the heartbeats and checking the voting disk to determine whether any cluster node has failed. It also provides group membership services and notifies the members of the cluster about membership status changes.
Processes on Linux:
root 9198 1 0 06:03 ? 00:00:22 /bin/sh /etc/init.d/init.cssd fatal
root 10095 9198 0 06:03 ? 00:00:00 /bin/sh /etc/init.d/init.cssd oprocd
root 10114 9198 0 06:03 ? 00:00:00 /bin/sh /etc/init.d/init.cssd oclsomon
root 10151 9198 0 06:03 ? 00:00:00 /bin/sh /etc/init.d/init.cssd daemon
oracle 10566 10151 0 06:03 ? 00:00:40 /crs/product//11.1.0/crs/bin/ocssd.bin
• Event Management (EVM) publishes the events that are created by Oracle Clusterware using Oracle Notification Services (ONS). It communicates with CSS and CRS.
Processes on Linux:
root 9196 1 0 06:03 ? 00:00:00 /bin/sh /etc/init.d/init.evmd run
root 10059 9196 0 06:03 ? 00:00:00 /bin/su -l oracle -c sh -c 'ulimit -c
unlimited; cd /crs /product/11.1.0/crs/log/kblade2/evmd; exec
/crs/product/11.1.0/crs/bin/evmd
oracle 10060 10059 0 06:03 ? 00:00:02 /crs/product/11.1.0/crs/bin/evmd.bin
• Oracle Notification Services (ONS) provides a publish-and-subscribe service for communicating Fast Application Notification (FAN) events.
Processes on Linux:
oracle 12063 1 0 06:06 ? 00:00:00
/crs/oracle/product/11.1.0/crs/opmn/bin/ons -d
oracle 12064 12063 0 06:06 ? 00:00:00
/crs/oracle/product/11.1.0/crs/opmn/bin/ons -d
• Oracle Process Monitor Daemon (OPROCD) is locked in memory and monitors the cluster. OPROCD provides I/O fencing. Starting with 10.2.0.4, it replaces the hangcheck-timer module on Linux. If OPROCD fails, Oracle Clusterware will reboot the node.
Processes on Linux:
root 9198 1 0 06:03 ? 00:00:22 /bin/sh /etc/init.d/init.cssd fatal
root 10095 9198 0 06:03 ? 00:00:00 /bin/sh /etc/init.d/init.cssd oprocd
root 10465 10095 0 06:03 ? 00:00:00 /crs/product/11.1.0/crs/bin/oprocd run -t
1000 -m 500 -f
• RACG extends the clusterware to support Oracle-specific requirements and complex resources.
Processes on Linux:
oracle 12039 1 0 06:06 ? 00:00:00
/opt/oracle/product/11.1.0/asm/bin/racgimon daemon ora.kblade2.ASM2.asm
oracle 12125 1 0 06:06 ? 00:00:06
/opt/oracle/product/11.1.0/db_1/bin/racgimon startd test1db
SHARED STORAGE CONFIGURATION FOR ORACLE CLUSTERWARE
STORAGE REQUIREMENT
The two most important clusterware components, the voting disk and the OCR, must be stored on shared storage that is accessible to every node in the cluster. The shared storage can be block devices, RAW devices, a clustered file system such as Oracle Cluster File System (OCFS or OCFS2), or a network file system (NFS) from a certified network attached storage (NAS) device. To verify whether a NAS device is certified for Oracle, check the Oracle Storage Compatibility Program list at the following website:
http://www.oracle.com/technology/deploy/availability/htdocs/oscp.html.
Since the OCR and voting disks play a crucial role in the clusterware configuration, to ensure high availability, a minimum of three voting disks and two mirrored copies of the OCR are recommended. If a single copy of the OCR and voting disk is used, an external mirroring RAID configuration in the shared storage should be used to provide the redundancy. A cluster can have up to 32 voting disks.
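Before the installation, the Cluster Verification Utility can be used to confirm that a candidate device is in fact visible from every node. Below is a minimal sketch, assuming a two-node cluster with nodes kblade1 and kblade2 and a hypothetical candidate device /dev/mapper/ocr1p1; the exact device name depends on your storage setup:
$ cluvfy comp ssa -n kblade1,kblade2 -s /dev/mapper/ocr1p1
(cluvfy reports whether the shared storage accessibility check passed on the given nodes)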
PHYSICAL CONNECTIONS TO SHARED SAN STORAGE
In order to ensure high availability and scalability of the I/O load, fully redundant active-active I/O paths are recommended to connect the server nodes and the shared storage. For SAN (Storage Area Network) storage such as Fibre Channel storage or iSCSI storage, these redundant paths include three components:
• HBA cards/NIC cards: two HBA (Host Bus Adapter) cards are installed in each cluster server node for a Fibre Channel storage connection; multiple NIC cards are dedicated to iSCSI storage connections.
• Storage switches: two Fibre Channel switches for Fibre Channel storage, or regular Gigabit Ethernet switches for iSCSI storage.
• A shared storage array with two storage processors.
The connections among these three components are in a butterfly shape. As examples, figure 3 and figure 4 show how servers are connected with a Fibre Channel storage array and an iSCSI storage array respectively.
Figure 3 Connection of clustered servers with EMC Fibre Channel storage.
In figure 3, redundant I/O paths through additional switches are introduced. This redundant configuration provides high availability and I/O load balancing. For example, if an HBA, a switch or a storage controller fails, the I/O path fails over to the remaining HBA, switch or storage controller. During normal operation, these active-active HBAs, switches and storage processors share the I/O load. For detailed information about how to configure FC storage as the shared storage for Oracle Clusterware and an Oracle database, refer to [6] in the reference list.
Figure 4 Connections of Cluster Servers with EqualLogic iSCSI storage
In figure 4, two servers connect with EqualLogic iSCSI storage through two Gigabit Ethernet switches. Each server uses multiple NIC cards to connect to two redundant iSCSI switches, each of which also connects to two redundant storage control modules. In the event of a NIC card failure, a switch failure or even a control module failure, the I/O path automatically fails over to the other NIC card, switch or control module. For detailed information about how to configure EqualLogic storage as the shared storage for Oracle Clusterware and an Oracle database, refer to [6] in the reference list.
MULTIPATH DEVICES OF THE SHARED STORAGE
As multiple I/O paths are established at the hardware level, operating systems like Linux, as well as some third-party storage vendors, offer I/O multipathing device drivers which combine multiple I/O paths into a single virtual I/O path and provide I/O path failover and I/O bandwidth aggregation.
One commonly used multipath device driver is the Linux native Device Mapper on Enterprise Linux (RHEL5 or OEL5). The Device Mapper (DM) is installed through RPMs from the OS CD. Verify the installation of the RPMs by running this command:
$ rpm -qa | grep device-mapper
device-mapper-1.02.28-2.el5
device-mapper-event-1.02.28-2.el5
device-mapper-multipath-0.4.7-23.el5
device-mapper-1.02.28-2.el5
The following example shows how to use the Linux native Device Mapper to implement multipathing for shared iSCSI storage. After the iSCSI shared storage is configured, /proc/partitions lists the following iSCSI storage partitions: sdb, sdc, sdd, sde, sdf and sdg.
Then identify the unique SCSI id for each device:
[root@kblade1 sbin]# /sbin/scsi_id -gus /block/sdb
36090a028e093fc906099540639aa2149
[root@kblade1 sbin]# /sbin/scsi_id -gus /block/sde
36090a028e093fc906099540639aa2149
[root@kblade1 sbin]# /sbin/scsi_id -gus /block/sdc
36090a028e093dc7c6099140639aae1c7
[root@kblade1 sbin]# /sbin/scsi_id -gus /block/sdf
36090a028e093dc7c6099140639aae1c7
[root@kblade1 sbin]# /sbin/scsi_id -gus /block/sdg
36090a028e093cc896099340639aac104
[root@kblade1 sbin]# /sbin/scsi_id -gus /block/sdd
36090a028e093cc896099340639aac104
The results show that sdb and sde map to the same storage volume, as do sdc and sdf, and sdd and sdg. Each name in a pair corresponds to one I/O path, which starts from one of the two NIC cards installed in the server. The next step is to configure the multipath driver to combine each pair of I/O paths into one virtual I/O path by adding the following entries to the multipathing driver configuration file /etc/multipath.conf:
multipaths {
    multipath {
        wwid 36090a028e093dc7c6099140639aae1c7 #<--- for sdc and sdf
        alias ocr1
    }
    multipath {
        wwid 36090a028e093fc906099540639aa2149 #<--- for sdb and sde
        alias votingdisk1
    }
    multipath {
        wwid 36090a028e093cc896099340639aac104 #<--- for sdd and sdg
        alias data1
    }
}
Then restart the multipath daemon and verify the alias names:
[root@kblade1 etc]# service multipathd restart
Stopping multipathd daemon: [FAILED]
Starting multipathd daemon: [ OK ]
(the stop step reports FAILED if the daemon was not already running, which is harmless)
List the multipath devices by running:
[root@kblade1 etc]# multipath -ll
Verify the multipathing devices that are created:
[root@kblade1 etc]# ls -lt /dev/mapper/*
brw-rw---- 1 root disk 253, 10 Feb 18 02:02 /dev/mapper/data1
brw-rw---- 1 root disk 253, 8 Feb 18 02:02 /dev/mapper/votingdisk1
brw-rw---- 1 root disk 253, 9 Feb 18 02:02 /dev/mapper/ocr1
These multipath devices are now available for the voting disk and OCR as well as for ASM diskgroups.
Besides the native Linux Device Mapper shown above, some third-party vendors also offer multipath drivers, for example the EMC PowerPath driver for EMC Fibre Channel storage. With the EMC PowerPath driver, multiple I/O paths share the I/O workload through an intelligent multipath load balancing feature, and the automatic failover feature ensures high availability in the event of a failure.
To install the EMC PowerPath and naviagent software on Linux, load two Linux RPMs:
rpm -ivh EMCpower.LINUX-5.1.2.00.00-021.rhel5.x86_64.rpm
rpm -ivh naviagentcli-6.24.2.5.0-1.noarch.rpm
Then start the naviagent daemon and the EMC PowerPath daemon:
service naviagent start
service PowerPath start
You will then see EMC pseudo devices such as:
[root@lkim3 software]# more /proc/partitions | grep emc
120 0 419430400 emcpowera
120 16 2097152 emcpowerb
120 32 419430400 emcpowerc
120 48 419430400 emcpowerd
120 64 419430400 emcpowere
Use the powermt utility to check the mapping of the EMC pseudo device emcpowerc and its I/O paths:
[root@lkim3 software]# powermt display dev=emcpowerc
Pseudo name=emcpowerc
CLARiiON ID=APM00083100777 [kim_proc]
Logical device ID=6006016013CB22009616938A8CF1DD11 [LUN 2]
state=alive; policy=BasicFailover; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B Array failover mode: 1
==============================================================================
---------------- Host --------------- - Stor - -- I/O Path - -- Stats ---
### HW Path I/O Paths Interf. Mode State Q-IOs Errors
==============================================================================
2 lpfc sdc SP B1 active alive 0 0
2 lpfc sdh SP A0 active alive 0 0
This shows that the EMC pseudo device emcpowerc maps to the logical device LUN 2 through two I/O paths: sdc and sdh. The pseudo device /dev/emcpowerc refers to the shared storage and can be used for the OCR, a voting disk, an ASM diskgroup, etc.
BLOCK DEVICES VS RAW DEVICES
In the Linux world, raw devices have played an important role in Oracle Clusterware and Oracle RAC, as Oracle can access unstructured data on block devices by binding them to character raw devices. Starting with the Linux 2.6 kernel (RHEL5/OEL5), support for raw devices has been deprecated in favor of block devices. For example, Red Hat Enterprise Linux 5 no longer offers the raw devices service. As a long-term solution, one should therefore consider moving away from raw devices to block devices.
While 11g clusterware fully supports building the OCR and voting disks on block devices, for Oracle 10gR2 clusterware the Oracle Universal Installer (OUI) does not allow one to build the OCR and voting disks directly on block devices. If one needs to configure Oracle 10g RAC on RHEL 5 or OEL 5, the options are:
A. Use udev rules to set up raw device mappings and permissions.
B. Configure Oracle 11g clusterware using block devices, then install Oracle 10g RAC on top of the Oracle 11g clusterware and use block devices for the database files. This is an Oracle certified and recommended solution.
Steps to implement option A:
1. Establish the binding between block devices and raw devices by editing a mapping rule file such as /etc/udev/rules.d/65-oracleraw.rules to add the raw binding rules, for example:
# cat /etc/udev/rules.d/65-oracleraw.rules
ACTION=="add", KERNEL=="emcpowera1", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="emcpowerb1", RUN+="/bin/raw /dev/raw/raw2 %N"
ACTION=="add", KERNEL=="emcpowerc1", RUN+="/bin/raw /dev/raw/raw3 %N"
ACTION=="add", KERNEL=="emcpowerd1", RUN+="/bin/raw /dev/raw/raw4 %N"
ACTION=="add", KERNEL=="emcpowere1", RUN+="/bin/raw /dev/raw/raw5 %N"
Then, create a mapping rule file to set up the raw devices' ownership and permissions, for example
/etc/udev/rules.d/89-raw_permissions.rules:
#OCR
KERNEL=="raw1",OWNER="root", GROUP="oinstall", MODE="640"
KERNEL=="raw2",OWNER="root", GROUP="oinstall", MODE="640"
#Voting disks
KERNEL=="raw3",OWNER="oracle", GROUP="oinstall", MODE="640"
KERNEL=="raw5",OWNER="oracle", GROUP="oinstall", MODE="640"
KERNEL=="raw6",OWNER="oracle", GROUP="oinstall", MODE="640"
2. Start udev by running /sbin/start_udev.
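To confirm that the udev rules took effect, the raw bindings and their ownership can be inspected. A minimal sketch, assuming the example rules above; the exact output will vary with your devices:
# raw -qa                 (lists all current raw-to-block device bindings)
# ls -l /dev/raw/raw1     (shows the owner, group and mode set by the permission rules)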
For 11g RAC, and for Oracle 10g RAC with option B, the recommended solution is to use block devices directly. The only required step is to set the proper permissions and ownership of the block devices for the OCR, voting disks and ASM disks in the /etc/rc.local file. For example, one can add the following lines to set the proper ownership and permissions of the block devices for the OCRs, voting disks and ASM diskgroups:
# cat /etc/rc.local
# OCR disks 11gR1
chown root:oinstall /dev/mapper/ocr*
chmod 0640 /dev/mapper/ocr*
# Voting disks 11gR1
chown oracle:oinstall /dev/mapper/voting*
chmod 0640 /dev/mapper/voting*
# for ASM diskgroups
chown oracle:dba /dev/mapper/data*
chmod 0660 /dev/mapper/data*
In older Linux releases such as RHEL 4.x and OEL 4.x, raw devices can still be used for both 10g and 11g clusterware. 11g clusterware can also use block devices directly for its OCRs and voting disks.
NETWORK CONFIGURATION FOR ORACLE CLUSTERWARE
Prior to installation of Oracle clusterware, one of important requirement is a proper configuration of network. The
Clusterware requires three network IP addresses: public IP, virtual IP (VIP) and private interconnect IP on each node in the
cluster.
PUBLIC IP AND VIRTUAL IP CONFIGURATION
Public IP address is the public host name for the node. Virtual IP (VIP) is the public virtual IP address used by clients to
connect the database instances on the node. The advantage for having VIP is when the node fails, Oracle clusterware will
failover the VIP associated with this node to other node. If the clients connect to the database through the hostname (public
IP), if the node dies, the clients will have to wait for TCP/IP timeout which can take as long as 10 minutes before the clients
receive the connection failure error message. However if the client database connection uses VIP, the database connection will
be failed over to other node when the node fails.
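The VIP configured for a node can be reviewed with srvctl. A minimal sketch, assuming node kblade1 from the example below; the -a flag requests the VIP configuration details:
[oracle@kblade1 ~]$ srvctl config nodeapps -n kblade1 -a
(the output shows the VIP name, address, netmask and the network interface it is bound to)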
The following example shows how the VIP is failed over in case of a node failure. The cluster consists of two nodes: kblade1 and kblade2. During normal operation, VIP 1, the other nodeapps components and database instance 1 run on their own node kblade1:
[oracle@kblade2 ~]$ srvctl status nodeapps -n kblade1
VIP is running on node: kblade1
GSD is running on node: kblade1
Listener is running on node: kblade1
ONS daemon is running on node: kblade1
In the event of a kblade1 node failure, kblade1-vip is failed over to kblade2:
[oracle@kblade2 ~]$ srvctl status nodeapps -n kblade1
VIP is running on node: kblade2
GSD is not running on node: kblade1
Listener is not running on node: kblade1
ONS daemon is not running on node: kblade1
If we ping kblade1-vip at the moment the kblade1 node is shut down, we can see how kblade1-vip is failed over to kblade2:
[kai_yu@db ~]$ ping kblade1-vip
PING 155.16.9.171 (155.16.9.171) 56(84) bytes of data.
From 155.16.0.1 icmp_seq=9 Destination Host Unreachable
From 155.16.0.1 icmp_seq=9 Destination Host Unreachable
..... (waiting for the VIP to fail over)
64 bytes from 155.16.9.171: icmp_seq=32 ttl=64 time=2257 ms
64 bytes from 155.16.9.171: icmp_seq=33 ttl=64 time=1258 ms
(at this point kblade1-vip 155.16.9.171 has been failed over from kblade1 to kblade2 successfully)
After restarting kblade1, kblade1-vip is moved back to kblade1 as:
[oracle@kblade2 ~]$ srvctl status nodeapps -n kblade1
VIP is running on node: kblade1
GSD is running on node: kblade1
Listener is running on node: kblade1
ONS daemon is running on node: kblade1
PRIVATE INTERCONNECTION CONFIGURATION
As the private interconnect among the cluster nodes play a key role in the stability and performance Oracle RAC and Oracle
clusterware. the following best practices are recommended for private interconnection configuration.
Fully Redundant Ethernet Interconnects:
One purpose of the private interconnect is to carry the network heartbeat among the cluster nodes. If a node cannot send the network heartbeat within the misscount time, the node will be evicted from the cluster and rebooted. To ensure the high availability of the private interconnect network, it is recommended to implement fully redundant Ethernet interconnects, which include two NIC cards per node and two interconnect switches. Figure 5 shows how the NIC cards are connected to the interconnect switches.
Figure 5: Fully Redundant Private Interconnect Configuration
It is recommended to use dedicated, non-routable switches for the private interconnect. Crossover cables are not supported for RAC. In addition to the redundant physical connections at the hardware level, it is also recommended to implement NIC teaming or bonding at the OS level. NIC teaming bonds two physical network interfaces together so that they operate under a single logical IP address, and provides failover capability: in case of a failure of one NIC or one switch, the network traffic is routed to the remaining NIC or switch automatically to ensure uninterrupted communication. The following example shows the configuration of a NIC bonding interface in Linux:
Edit the network interface scripts in /etc/sysconfig/network-scripts:

ifcfg-eth1:
DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
TYPE=Ethernet

ifcfg-eth2:
DEVICE=eth2
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
TYPE=Ethernet

ifcfg-bond0:
DEVICE=bond0
IPADDR=192.168.9.52
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
Then add the bonding options in /etc/modprobe.conf:
alias bond0 bonding
options bonding miimon=100 mode=1
Enable the bonding by restarting the network service:
$ service network restart
Checking the network configuration with ifconfig then shows the bonding configuration:
[root@k52950-3-n2 etc]# ifconfig
bond0 Link encap:Ethernet HWaddr 00:18:8B:4E:F0:10
inet addr:192.168.9.52 Bcast:192.168.9.255 Mask:255.255.255.0
inet6 addr: fe80::218:8bff:fe4e:f010/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:328424549 errors:0 dropped:0 overruns:0 frame:0
TX packets:379844228 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:256108585069 (238.5 GiB) TX bytes:338540870589 (315.2 GiB)
eth1 Link encap:Ethernet HWaddr 00:18:8B:4E:F0:10
inet6 addr: fe80::218:8bff:fe4e:f010/64 Scope:Link
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:172822208 errors:0 dropped:0 overruns:0 frame:0
TX packets:189922022 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:132373201459 (123.2 GiB) TX bytes:169285841045 (157.6 GiB)
Interrupt:5 Memory:d6000000-d6012100
eth2 Link encap:Ethernet HWaddr 00:18:8B:4E:F0:10
inet6 addr: fe80::218:8bff:fe4e:f010/64 Scope:Link
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:155602341 errors:0 dropped:0 overruns:0 frame:0
TX packets:189922206 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:123735383610 (115.2 GiB) TX bytes:169255029544 (157.6 GiB)
Base address:0xece0 Memory:d5ee0000-d5f00000
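The bonding driver also exposes its state through the proc file system, which is a convenient way to see which slave NIC is currently active and whether the MII links are up. A minimal sketch for the bond0 interface configured above, with the output abbreviated:
[root@k52950-3-n2 etc]# cat /proc/net/bonding/bond0
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth1
MII Status: up
...
(one MII status section follows for each slave interface)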
INTERCONNECT CONFIGURATION BEST PRACTICES
The following best practices have been recommended by Oracle for the private interconnect configuration. Refer to [7] for details.
• Set UDP send/receive buffers to max
• Use the same interconnect for both Oracle clusterware and Oracle RAC communication
• NIC settings for the interconnect:
a. Define flow control: set rx=on, tx=off for the NIC cards; tx/rx flow control should be turned on for the switch(es)
b. Ensure the NIC names/slots order is identical on all nodes
c. Configure the interconnect NICs on the fastest PCI bus
d. Set jumbo frames with MTU=9000 by adding the entry MTU='9000' in the ifcfg-eth1/ifcfg-eth2 files, with the same setting on the switches
• Recommended kernel parameters for networking (the same values apply to Oracle 11gR1 and 10gR2; see the sketch after this list for making them persistent):
net.core.rmem_default = 4194304
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 262144
• Network heartbeat misscount configuration: 30 seconds for 11g, 60 seconds for 10g (defaults); this should not be changed without the help of Oracle support.
• Hangcheck timer setting: set the proper values for 10g/11g:
modprobe hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1
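To make the kernel parameters above persistent across reboots, they are typically added to /etc/sysctl.conf and loaded with sysctl. A minimal sketch for the values above; apply the same change on every cluster node:
# cat >> /etc/sysctl.conf <<EOF
net.core.rmem_default = 4194304
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 262144
EOF
# sysctl -p    (reloads the settings without a reboot)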
MANAGING ORACLE CLUSTERWARE
After Oracle Clusterware is deployed, it is the DBA's responsibility to manage it and ensure it works properly so that the Oracle RAC database built on top of the clusterware keeps functioning normally. The management responsibilities mainly cover the two important components of the clusterware: the voting disk and the OCR.
MANAGE VOTING DISK
The management tasks for voting disks include backing up, recovering, adding and moving voting disks. To back up the voting disks, first stop the clusterware on all the nodes, then use this command to locate the voting disks:
[root@kblade3 bin]# ./crsctl query css votedisk
0. 0 /dev/mapper/votingdisk1p1
1. 0 /dev/mapper/votingdisk2p1
Use the dd command to back up a voting disk to a file, for example:
[root@kblade3 ~]# dd if=/dev/mapper/votingdisk1p1 of=/root/votingdisk_backup bs=4096
If we need to restore a voting disk from a backup, use dd again:
[root@kblade3 ~]# dd if=/root/votingdisk_backup of=/dev/mapper/votingdisk1p1 bs=4096
To add voting disk /dev/mapper/votingdisk3p1, stop the clusterware on all nodes and run:
crsctl add css votedisk /dev/mapper/votingdisk3p1 -force
To delete voting disk /dev/mapper/votingdisk2p1:
crsctl delete css votedisk /dev/mapper/votingdisk2p1 -force
To move a voting disk to a new location, first add the new voting disk, then remove the old one.
MANAGE ORACLE CLUSTER REGISTRY
Oracle provides three tools to manage the OCR: OCRCONFIG, OCRDUMP and OCRCHECK.
ADDING, REMOVING OR REPLACING OCR
In [4], Oracle recommends allocating two copies of the OCR during the clusterware installation; they mirror each other and are managed automatically by Oracle Clusterware. If the mirror copy was not configured during the clusterware installation, run the following command as the root user to add it after the installation:
ocrconfig -replace ocrmirror <path of file or disk>, for example:
ocrconfig -replace ocrmirror /dev/mapper/ocr2p1
To change the OCR location, first check the OCR status using ocrcheck:
[root@kblade3 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 296940
Used space (kbytes) : 15680
Available space (kbytes) : 281260
ID : 1168636743
Device/File Name : /dev/mapper/ocr1p1
Device/File integrity check succeeded
Device/File Name : /dev/mapper/ocr2p1
Device/File integrity check succeeded
Cluster registry integrity check succeeded
Then verify the clusterware status:
[root@kblade3 bin]# ./crsctl check crs
Cluster Synchronization Services appears healthy
Cluster Ready Services appears healthy
Event Manager appears healthy
Then run the following command as the root user:
ocrconfig -replace ocr /u01/ocr/ocr1
or replace the mirrored OCR copy:
ocrconfig -replace ocrmirror /u01/ocr/ocr2
Run ocrconfig -repair on a node if its clusterware was shut down while the OCR was being replaced. This command lets the node rejoin the cluster after it is restarted, as sketched below.
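A minimal sketch of the repair step, assuming the OCR was moved to /u01/ocr/ocr1 as above while the clusterware stack on node kblade2 was down; run the following on kblade2 before restarting its clusterware:
[root@kblade2 bin]# ./ocrconfig -repair ocr /u01/ocr/ocr1
(this updates the local OCR location configuration on kblade2 to match the rest of the cluster)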
OCR BACKUP AND RECOVERY
Oracle clusterware automatically backs up OCR every four hours and at end of day and at end of week in the default location:
$CRS_HOME/cdata/cluster_name , for example:
[root@kblade3 kblade_cluster]# /crs/oracle/product/11.1.0/crs/bin/ocrconfig -showbackup
kblade3 2009/03/06 00:18:00 /crs/oracle/product/11.1.0/crs/cdata/kblade_cluster/backup00.ocr
kblade3 2009/03/05 20:17:59 /crs/oracle/product/11.1.0/crs/cdata/kblade_cluster/backup01.ocr
kblade3 2009/03/05 16:17:59 /crs/oracle/product/11.1.0/crs/cdata/kblade_cluster/backup02.ocr
kblade2 2009/03/04 18:34:58 /crs/oracle/product/11.1.0/crs/cdata/kblade_cluster/day.ocr
kblade7 2009/02/24 16:24:28 /crs/oracle/product/11.1.0/crs/cdata/kblade_cluster/week.ocr
A manual backup can also be taken using the command:
ocrconfig -manualbackup
To restore the OCR from a backup file, first identify the backup using the ocrconfig -showbackup command and stop the clusterware on all the cluster nodes; then perform the restore with: ocrconfig -restore file_name
After the restore, restart CRS and do an OCR integrity check using cluvfy comp ocr. A complete restore session is sketched below.
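Put together, a restore session might look like the following minimal sketch, using the backup file name from the -showbackup output above; the clusterware must be down on every node before the restore:
crsctl stop crs        (run as root on every node)
ocrconfig -showbackup  (on one node, identify the backup to restore)
ocrconfig -restore /crs/oracle/product/11.1.0/crs/cdata/kblade_cluster/backup00.ocr
crsctl start crs       (run as root on every node)
cluvfy comp ocr -n all (verify the OCR integrity across all nodes)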
OCR EXPORT AND IMPORT
Oracle provides another tool called OCR export and import for backing up and restoring OCR:
To do the OCR export, execute the command: ocrconfig –export /home/oracle/ocr_export
To import the OCR export back to OCR, first stop crs on all the cluster nodes and run the import command:
ocrconfig –import /home/oracle/ocr_export
After the import, start the CRS and check the OCR integrity by performing the command:
cluvfy comp ocr
CLONE ORACLE CLUSTERWARE
The Oracle Clusterware configuration can be cloned from one cluster to another cluster or from one server to another server. This is called the clusterware cloning process. It can be used to build a new cluster or simply to extend an existing cluster with an additional node. The following example shows how to use this cloning process to add a node to a cluster by cloning the clusterware configuration to the new node and then adding the new node to the clusterware configuration.
Task: an existing 11g clusterware configuration includes two nodes, k52950-3-n1 and k52950-3-n2; we want to add a third node, k52950-3-n3, to the cluster using the clusterware cloning method.
Step 1: Complete all the prerequisite conditions: on the third node, install the OS (RHEL5.2), configure access to the shared storage holding the OCRs and voting disks, configure the public network, private network and VIP, and configure ssh among all three nodes.
Step 2: Copy the CRS home from the source node k52950-3-n1 to the new node k52950-3-n3:
• Shut down CRS on the source node.
• On the source node, copy the CRS home to a backup and remove all the log files and trace files from the backup.
• Copy the CRS backup to the new node and create the directory /opt/oracle/oraInventory on the new node.
• Set the ownership of the Oracle inventory on the new node: chown oracle:oinstall /opt/oracle/oraInventory
• Run preupdate.sh from $CRS_HOME/install on the new node.
Step 3: Run the CRS clone process on the new node.
Start CRS on node 1 k52950-3-n1 and execute /crs/oracle/product/11.1.0/crs/install/rootaddnode.sh there.
Then execute /crs/oracle/product/11.1.0/crs/root.sh on the new node k52950-3-n3:
Restart CRS on node 2 k52950-3-n2:
[root@k52950-3-n2 bin]# ./crsctl start crs
Attempting to start Oracle Clusterware stack
The CRS stack will be started shortly
At this point, the new node k52950-3-n3 has been added to the clusterware configuration successfully.
ORACLE CLUSTERWARE TROUBLESHOOTING
SPLIT BRAIN CONDITION AND IO FENCING MECHANISM IN ORACLE CLUSTERWARE
Oracle Clusterware provides mechanisms to monitor the cluster operation and detect potential issues with the cluster. One particular scenario that needs to be prevented is the split brain condition. A split brain condition occurs when a failure results in the cluster being reconfigured into multiple partitions, with each partition forming its own sub-cluster without knowledge of the existence of the others. This would lead to collision and corruption of shared data, as each sub-cluster assumes ownership of the shared data [1]. For a cluster database like an Oracle RAC database, data corruption is a serious issue that has to be prevented at all times. Oracle Clusterware's solution to the split brain condition is to provide I/O fencing: if a cluster node fails, Oracle Clusterware ensures that the failed node is fenced off from all I/O operations on the shared storage. One I/O fencing method is called STOMITH, which stands for Shoot The Other Machine In The Head. In this method, once a potential split brain condition is detected, Oracle Clusterware automatically picks a cluster node as a victim to reboot to avoid data corruption. This process is called node eviction. DBAs and system administrators need to understand how this I/O fencing mechanism works and learn how to troubleshoot clusterware problems. When they experience a cluster node reboot event, DBAs and system administrators need to be able to analyze the events and identify the root cause of the clusterware failure.
Oracle Clusterware uses two Cluster Synchronization Services (CSS) heartbeats, the network heartbeat (NHB) and the disk heartbeat (DHB), and two CSS misscount values associated with these heartbeats, to detect potential split brain conditions. The network heartbeat crosses the private interconnect to establish and confirm valid node membership in the cluster. The disk heartbeat is between the cluster node and the voting disk on the shared storage. Each heartbeat has its own maximal misscount value in seconds, called the CSS misscount, within which the heartbeat must be completed; otherwise a node eviction is triggered.
The CSS misscount for the network heartbeat has the following default values, depending on the version of Oracle Clusterware and the operating system:

OS        10g (R1 & R2)   11g
Linux     60              30
Unix      30              30
VMS       30              30
Windows   30              30

Table 1: Network heartbeat CSS misscount values for 11g/10g clusterware
The CSS misscount for the disk heartbeat also varies with the version of Oracle Clusterware. For Oracle 10.2 and up, the default value is 200 seconds. Refer to [2] for details. The values in effect on a given cluster can be queried with crsctl, as sketched below.
NODE EVICTION DIAGNOSIS CASE STUDY
When a node eviction occurs, Oracle Clusterware usually records error messages in various log files. These log files provide the evidence and the starting points for DBAs and system administrators to troubleshoot. The following case study illustrates a troubleshooting process for a node eviction which occurred in an 11-node 10g RAC production database. The symptom was that node 7 of the cluster was automatically rebooted around 11:15 am. The troubleshooting started with examining the syslog file /var/log/messages, which contained the following error messages:
Jul 23 11:15:23 racdb7 logger: Oracle clsomon failed with fatal status 12.
Jul 23 11:15:23 racdb7 logger: Oracle CSSD failure 134.
Jul 23 11:15:23 racdb7 logger: Oracle CRS failure. Rebooting for cluster integrity.
We then examined the OCSSD log file at $CRS_HOME/log/<hostname>/cssd/ocssd.log and found the following error messages, which showed that node 7's network heartbeat did not complete within the 60-second CSS misscount, triggering a node eviction event:
[ CSSD]2008-07-23 11:14:49.150 [1199618400] >WARNING:
clssnmPollingThread: node racdb7 (7) at 50% heartbeat fatal, eviction in 29.720 seconds
..
clssnmPollingThread: node racdb7 (7) at 90% heartbeat fatal, eviction in 0.550 seconds
…
[ CSSD]2008-07-23 11:15:19.079 [1220598112] >TRACE:
clssnmDoSyncUpdate: Terminating node 7, racdb7, misstime(60200) state(3)
This node eviction occurred only intermittently, about once every other week, and the DBAs were not able to recreate the node eviction event. However, the DBAs noticed that right before the node eviction, some private IP addresses were not pingable from other nodes. This clearly linked the root cause of the node eviction to the stability of the private interconnect. After working with the network engineers, we identified that the network instability was related to a configuration in which both the public network and the private interconnect shared a single physical Cisco switch. The recommended solution was to configure separate switches dedicated to the private interconnect, as shown in figure 5. After implementing the recommendation, no node eviction occurred again.
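While working toward such a fix, a lightweight watcher script can help correlate interconnect outages with eviction times. The following is a minimal sketch, not part of the original case study; the private host names racdb1-priv through racdb7-priv are hypothetical and must match your /etc/hosts entries:
#!/bin/bash
# Log a timestamped warning whenever a private interconnect address stops answering.
while true; do
  for host in racdb1-priv racdb2-priv racdb7-priv; do
    ping -c 1 -W 2 $host > /dev/null 2>&1 || \
      echo "$(date '+%Y-%m-%d %H:%M:%S') $host not reachable" >> /var/log/priv_ping.log
  done
  sleep 5
done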
CRS REBOOTS TROUBLESHOOTING PROCEDURE
Besides the node evictions caused by the failure of the network heartbeat or disk heartbeat, other events may also cause a CRS node reboot. Oracle Clusterware provides several processes to monitor the operation of the clusterware. When certain conditions occur, to protect data integrity, these monitoring processes may automatically kill the clusterware or even reboot the node, leaving critical error messages in their log files. The following lists the roles of these clusterware processes in a server reboot and where their logs are located.
Three clusterware processes, OCSSD, OPROCD and OCLSOMON, can initiate a CRS reboot when they run into certain errors:
OCSSD (the CSS daemon) monitors inter-node health, such as the interconnect and the membership of the cluster nodes. Its log file is located at $CRS_HOME/log/<host>/cssd/ocssd.log
OPROCD (Oracle Process Monitor Daemon), introduced in 10.2.0.4, detects hardware and driver freezes that would result in a node eviction, and then kills the node to prevent any I/O from accessing the shared disk. Its log file is /etc/oracle/oprocd/<hostname>.oprocd.log
OCLSOMON monitors the CSS daemon for hangs or scheduling issues. It may reboot the node if it sees a potential hang. Its log file is $CRS_HOME/log/<host>/cssd/oclsomon/oclsomon.log
One of the most important log files is the syslog file; on Linux, the syslog file is /var/log/messages.
The CRS reboot troubleshooting procedure starts with reviewing the various log files to identify which of the three processes above contributed to the node reboot, and then isolates the root cause of that process's reboot. Figure 6 illustrates the CRS reboot troubleshooting flowchart. For further detailed troubleshooting information, refer to [3].
RAC DIAGNOSTIC TOOLS
Oracle provides several diagnostic tools to help troubleshoot CRS reboot events such as node eviction.
One of the tools is diagwait. It is very important to have the proper information in the log files for the diagnostic process. This information usually gets written to the logs right before the node eviction. However, at the time the node is evicted, the machine can be so heavily loaded that the OS does not get a chance to write all the necessary error messages to the log files before the node is rebooted. Diagwait is designed to delay the node reboot for a short time so that the OS can write all the necessary diagnostic data to the logs, without increasing the probability of data corruption.
To set up diagwait, perform the following steps as the root user:
1. Shut down CRS on all the cluster nodes by running crsctl stop crs on each node.
2. On one node, set diagwait by running:
crsctl set css diagwait 13 -force (sets a 13-second wait)
3. Restart the clusterware by running crsctl start crs on all the nodes.
To unset diagwait, run: crsctl unset css diagwait
The second tool is the Instantaneous Problem Detection OS tool (IPD/OS), distributed as crfpack.zip. This tool is used to analyze OS and clusterware resource degradation and failures related to Oracle Clusterware and Oracle RAC, as it continuously tracks OS resource consumption and monitors cluster-wide operations. It can run in real-time mode as well as in historical mode. In real-time mode, it sends an alert if certain conditions are reached. Historical mode allows administrators to go back to the time right before a failure, such as a node eviction, and play back the system resources and other data collected at that time. This tool works on Linux x86 with kernels greater than 2.6.9. The tool can be downloaded at http://www.oracle.com/technology/products/database/clustering/index.html or http://download.oracle.com/otndocs/products/clustering/ipd/crfpack.zip
RAC-DDT and OSWatcher are two diagnostic tools that help collect information from each node leading up to the time of the reboot, using OS utilities such as netstat, iostat and vmstat. Refer to Metalink Note 301138.1 for RAC-DDT and Metalink Note 301137.1 for OSWatcher.
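OSWatcher, for example, is started from its installation directory with a snapshot interval in seconds and an archive retention in hours. A minimal sketch using its commonly documented defaults; check the Metalink note above for the options of the version you download:
$ ./startOSW.sh 30 48    (take OS statistics snapshots every 30 seconds, keep 48 hours of archive)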
CONCLUSION
Oracle Clusterware is a critical component of the Oracle RAC database and Oracle Grid infrastructure. Oracle Clusterware is built on a complex hardware technology stack which includes server hardware, HBAs/network cards, network switches, storage switches and storage. It also utilizes software components such as the OS and multipathing drivers. Best practices for configuring, managing and troubleshooting each of these components play a key role in the stability of Oracle RAC and Oracle Grid systems. It is in DBAs' and system engineers' best interest to understand and apply these best practices in their Oracle RAC environments.
SPEAKER’S BACKGROUND
Mr. Kai Yu is a senior system consultant in the Oracle solutions engineering team at Dell Inc, specializing in Oracle Grid Computing, Oracle RAC and Oracle E-Business Suite. He has been working with Oracle technology since 1995 and has worked as an Oracle DBA and Oracle Applications DBA in Dell IT and at other Fortune 500 companies as well as start-up companies. He is active in the IOUG as an IOUG Collaborate 09 committee member and was elected as the US event chair for the IOUG RAC SIG. He has authored many articles and presentations for Dell Power Solutions magazine, Oracle OpenWorld 06/07/08, Collaborate Technology Forums and Oracle RAC SIG webcasts. Kai has an M.S. in Computer Science from the University of Wyoming. Kai can be reached at (512) 725-0046 or by email at kai_yu@dell.com.
REFERENCES
[1] Mike Ault, Madhu Tumma, Oracle 10g Grid & Real Application Clusters: Oracle 10g Grid Computing with RAC
[2] Oracle Metalink Note # 294430.1, CSS Timeout Computation in Oracle Clusterware
[3] Oracle Metalink Note # 265769.1, Troubleshooting CRS Reboots
[4] Oracle Clusterware Administration and Deployment Guide 11g Release 1 (11.1), September 2007
[5] Kai Yu, Deploying Oracle Database 11g R1 Enterprise Edition Real Application Clusters with Red Hat Enterprise Linux 5.1 and Oracle Enterprise Linux 5.1 on Dell PowerEdge Servers and Dell/EMC Storage, April 2008,
http://www.dell.com/downloads/global/solutions/11gr1_ee_rac_on_rhel5_1__and_OEL.pdf?c=us&cs=555&l=en&s=biz
[6] Dell | Oracle Supported Configurations: Oracle Enterprise Linux 5.2 Oracle 11g Enterprise Edition Deployment Guide:
http://www.dell.com/content/topics/global.aspx/alliances/en/oracle_11g_oracle_ent_linux_4_1?c=us&cs=555&l=en&s=biz
[7] Barb Lundhild, Oracle Real Application Clusters Internals, Oracle OpenWorld 2008 presentation 298713
[8] Murali Vallath, Looking Under the Hood at Oracle Clusterware, Oracle OpenWorld 2008 presentation 299963