RAC Best Practices on Linux
Session id: 40136
Kirk McGowan, Technical Director, RAC Pack, Server Technologies, Oracle Corporation
Roland Knapp, Principal Member Technical Staff, RAC Pack, Server Technologies, Oracle Corporation
Scalability by making additional processing capacity available incrementally
Private interconnect/network switch
Shared storage/concurrent access/storage switch
OS, Cluster Manager, DBMS/RAC, Application
Differences between cluster managers
RAC Hardware Architecture
[Diagram: users reach the clustered database servers over the network. The servers share a cache across a low-latency interconnect (e.g. VIA or proprietary) through a high-speed switch or interconnect, and reach a mirrored disk subsystem over a storage area network through a hub or switch fabric. A centralized management console oversees the cluster. No single point of failure.]
RAC Software Architecture
[Diagram: each instance has its own shared memory/global area (shared SQL, log buffer) and its own GES & GCS. All instances access one shared-disk database (shared data model).]
RAC on Linux HW & SW Components
[Diagram: nodes Node1a, Node2a, N3, N4 ... Nn sit on the public network, each running Unbreakable Linux, ORACM, and an Oracle9i RAC instance with its own DB cache. The cluster interconnect carries cache-to-cache transfers. Shared storage holds the redo logs of each instance, the control files, and the database files. Concurrent access from every node = “scale out”; more nodes = higher availability.]
Linux Cluster Hardware
FastEthernet, Gigabit Ethernet
Ethernet, FastEthernet, Gigabit Ethernet
Memory, swap & CPU Recommendations
Each server should have a minimum of 512 MB of memory, at least 1 GB of swap space, and two CPUs.
Fiber Channel, SCSI, or NAS storage connectivity
Unbreakable Linux Distributions
Red Hat Enterprise Linux AS and ES
United Linux 1.0
SuSE Linux Enterprise Server 8 (SuSE Linux AG)
Conectiva Linux Enterprise Edition (Conectiva S.A.)
SCO Linux Server 4.0 (The SCO Group)
Turbolinux Enterprise Server 8 (Turbolinux)
Oracle will support Oracle products running with other distributions but will not support the operating system.
RAC Certification for Unbreakable Linux
Enterprise class OS distribution (e.g. RH AS, United Linux 1.0)
Installation Flowchart for Red Hat Linux AS 2.1
1. Boot
2. Choose Language
3. Select Keyboard & Mouse
4. Choose – Advanced Server Option
5. Use DRUID for Partition Setup
6. Select Boot Loader
7. Configure Network
8. Configure Timezone
9. Account Configuration
10. Select Graphic Mode
11. Boot Floppy Creation
12. Installation Complete / Reboot
Install tips for Red Hat Linux AS 2.1
As documented in:
“Tips and Techniques: Install and Configure Oracle9i on Red Hat Linux Advanced Server” by Deepak Patel, Oracle http://otn.oracle.com/tech/linux/pdf/installtips_final.pdf
Always use the Advanced Server install. Install required packages as needed. CDs 1 to 3 have all rpm packages. CDs 3 and 4 have source packages. CD 5 includes docs.
The smp or enterprise kernel is installed depending on the machine's physical memory ( <= 4 GB: smp kernel; > 4 GB: enterprise kernel ).
Add users, configure network and other administrative tasks after installation.
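The memory rule in the tips above can be written as a one-line check, which is handy when verifying that every node in a cluster ended up with the expected kernel. This is only a sketch: the function name is hypothetical, and the 4 GB threshold is taken from the tip and read as 4 GiB.

```python
GIB = 1024 ** 3  # the tip's "4 GB" is interpreted as 4 GiB (an assumption)


def expected_kernel(physical_memory_bytes: int) -> str:
    """Return the kernel flavor the install tip says will be selected."""
    return "smp" if physical_memory_bytes <= 4 * GIB else "enterprise"


print(expected_kernel(2 * GIB))  # smp
print(expected_kernel(8 * GIB))  # enterprise
```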
Install tips for United Linux 1.0
You must install the latest UnitedLinux kernel update! Oracle was certified against an update kernel; the original UL-1.0 kernel is NOT certified!
After installing United Linux 1.0, install Service Pack 2a from:
Establish file system or location for ORACLE_HOME (writable for oracle userid)
Set up host equivalence for oracle userid (.rhosts)
Installation Flowchart for OCFS
1. Install the rpm’s on all nodes
2. Run ocfstool as root on all nodes (configures /etc/ocfs.conf)
3. Run load_ocfs on all nodes (insmod will load ocfs.o)
4. Create a partition on the primary node
5. Run ocfstool to format and mount your new filesystem
6. Mount the new filesystem on all nodes
7. Edit rc.local or equivalent: add load_ocfs and ‘mount -t ocfs <device> <mountpoint>’
Download the latest OCFS rpm’s from www.ocfs.org
OCFS and Unbreakable Linux
Red Hat currently ships 4 flavors of the AS 2.1 kernel, viz., UP, SMP, Enterprise and Summit (IBM x440)
Oracle provides a separate OCFS module for each of the kernel flavors
Minor revisions of the kernel do not need a fresh build of ocfs
e.g., ocfs built for e.12 will work for e.16, e.18, etc.
United Linux ships 3 flavors of its kernel: 2.4.19-64GB-SMP, 2.4.19-4GB and 2.4.19-4GB-SMP
OCFS 1.0.9 is supported on UL 1.0 Service Pack 2a or higher
The OCFS build is not currently upward compatible with the kernel (pre SP3): ensure an OCFS build exists for each new kernel version before upgrading the kernel
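The Red Hat rule above (one OCFS module per kernel flavor, reusable across minor revisions) can be sketched as a comparison that ignores the minor revision. The release-string format used here, such as `2.4.9-e.25smp`, is an assumption about what `uname -r` prints on AS 2.1, and both function names are hypothetical. The rule does not carry over to United Linux, where a build must exist for each kernel version.

```python
import re
from typing import Tuple

# Assumed AS 2.1 release format: <base>-e.<minor>[flavor], e.g. "2.4.9-e.25smp".
# A missing flavor suffix is treated as the uniprocessor (UP) kernel.
_RELEASE = re.compile(r"^(\d+\.\d+\.\d+)-e\.(\d+)(smp|enterprise|summit)?$")


def ocfs_build_key(release: str) -> Tuple[str, str]:
    """Return (base version, flavor); the minor revision is deliberately dropped."""
    match = _RELEASE.match(release)
    if match is None:
        raise ValueError(f"unrecognized AS 2.1 kernel release: {release!r}")
    base, _minor, flavor = match.groups()
    return base, flavor or "up"


def same_ocfs_build(built_for: str, running: str) -> bool:
    """True when a module built for one release should load on the other."""
    return ocfs_build_key(built_for) == ocfs_build_key(running)


print(same_ocfs_build("2.4.9-e.12smp", "2.4.9-e.16smp"))         # True
print(same_ocfs_build("2.4.9-e.12smp", "2.4.9-e.16enterprise"))  # False
```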
OCFS and RAC
Maintains cache coherency across nodes for the filesystem metadata only
Does not synchronize the data cache buffers across nodes, lets RAC handle that
OCFS journals filesystem metadata changes only
Filedata changes are journalled by RAC (log files)
Overcomes some limitations of raw devices on Linux
No limit on number of files
Allows for very large files (max 2TB)
Max volume size 32G (4K block) to 8T (1M block)
Oracle DB performance is comparable to raw
Kernel e.25 is strongly recommended for use with OCFS 1.0.9 (remove old kernel tuning parameters)
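The two volume-size endpoints quoted above are consistent with a fixed block count per volume, so the limit scales linearly with the block size. The arithmetic below checks this, assuming binary units (1K = 1024 bytes). The constant of 2^23 blocks is inferred from the slide's numbers, not taken from OCFS documentation.

```python
KIB, MIB, GIB, TIB = 1024, 1024 ** 2, 1024 ** 3, 1024 ** 4

# Block counts implied by the two endpoints on the slide.
blocks_small = (32 * GIB) // (4 * KIB)  # 32G volume with 4K blocks
blocks_large = (8 * TIB) // (1 * MIB)   # 8T volume with 1M blocks

MAX_BLOCKS = 2 ** 23  # inferred: both endpoints work out to this count


def max_volume_bytes(block_size: int) -> int:
    """Largest volume for a given block size under the inferred block limit."""
    return block_size * MAX_BLOCKS


print(blocks_small == blocks_large == MAX_BLOCKS)  # True
print(max_volume_bytes(64 * KIB) // GIB)           # 512
```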
Install Tips for OCFS
Ensure OCFS rpm corresponds to kernel version
uname -r (e.g. 2.4.19-4GB)
Remember to also download rpm’s for OCFS “Support Tools” and “Additional Tools”
Download the dd/tar/cp rpm that supports o_direct
Use rpm -Uv to install all 4 rpm’s on all nodes
Use OCFS for Oracle DB files only, not Oracle binaries (OCFS 1.0.x was not designed as a general purpose filesystem).
Installation Flowchart for oracm and Oracle
1. Install the oracm from the 188.8.131.52 CD-ROM
2. Configure ocmargs.ora and cmcfg.ora
3. Load the softdog and start the cluster manager with ./ocmstart.sh on both nodes
4. Install 184.108.40.206 software with the RAC option
5. Kill the oracm and watchdog processes
6. Modify ocmargs.ora and cmcfg.ora (remove watchdog)
7. Load the hangcheck-timer module with insmod
8. Install the oracm from the 220.127.116.11 patchset
9. Start the cluster manager with ./ocmstart.sh
10. Install the 18.104.22.168 patchset
11. Configure private interconnect and quorum device
12. Fix empty directory bug
Hangcheck, NM, and CM Flow (After V22.214.171.124)
[Diagram: in user mode, the Oracle instance sits on top of the Cluster Manager (including Node Monitor); in kernel mode, the hangcheck-timer.]
Oracm maintains both the node status view and the instance status view. The hangcheck-timer monitors the kernel for hangs, and resets the node if needed.
To enable asynchronous I/O, Oracle must be relinked to use skgaioi.o
Adjust UDP send / receive buffer size to 256K
Larger Buffer Cache
Create an in-memory file system on /dev/shm (mount -t shm shmfs -o size=8g /dev/shm)
To enable the extended buffer cache feature, set the init.ora parameter USE_INDIRECT_DATA_BUFFERS = true
Increasing Address Space
By default, Oracle has 1.7 GB of address space for its SGA.
See Metalink Note: 200266.1 for details and a sample program.
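For the UDP buffer recommendation above, the slide gives only the target size (256K). The sketch below renders that size as sysctl-style lines. Mapping the recommendation onto the four `net.core` keys is an assumption, since the slide does not say which kernel settings were meant, and the function name is hypothetical.

```python
UDP_BUFFER_BYTES = 256 * 1024  # 256K from the slide

# Assumed target settings: Linux socket buffer defaults and maximums.
SYSCTL_KEYS = (
    "net.core.rmem_default",
    "net.core.rmem_max",
    "net.core.wmem_default",
    "net.core.wmem_max",
)


def udp_buffer_settings(size_bytes: int = UDP_BUFFER_BYTES) -> list:
    """Return 'key = value' lines in the style of /etc/sysctl.conf."""
    return [f"{key} = {size_bytes}" for key in SYSCTL_KEYS]


for line in udp_buffer_settings():
    print(line)
```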
Create RAC database using DBCA
Use DBCA to simplify DB creation
Start gsd ( global services daemon ) on all nodes, if it is not already running.
Set MAXINSTANCES, MAXLOGFILES, MAXLOGMEMBERS, MAXLOGHISTORY, MAXDATAFILES (auto with DBCA)
Create tablespaces as locally managed (auto with DBCA)
Create all tablespaces with ASSM (auto with DBCA)
Configure automatic UNDO management (auto with DBCA)
Use SPFILE instead of multiple init.ora’s (auto with DBCA)
SCO Group (Formerly Caldera System) - http://www.ebizenterprises.com/page1.asp?p=463
Recommended one-off patches
Bug 2820871 - ORA-29740 NODE EVICTION DESIGN ALGORITHM AND ABRUPT TIME CHANGE ARU: 126.96.36.199 ARU 4161735 completed for LINUX Intel
Bug 2420930 - GET ORA-600 [KSXPMPRP1] DURING STARTUP IN RAC MODE WITH LARGER BUFFERS. This was mysteriously included in 188.8.131.52, but not in 184.108.40.206. Bug 2875050 was opened for this issue. ARU: 220.127.116.11 ARU 4202164 completed for LINUX Intel
Bug 2922471 – Fractured block found during crash/instance recovery. Not an Oracle bug. Do not use ‘intr’ for the mount option.
Recommended one-off patches
Bug 2844009 - MISSING LIBCXA.SO.3 LIBRARY ISSUE IN PSR 9203. ARU: 18.104.22.168 ARU 4046387 completed for LINUX Intel
Bug 2779294 – node_list is not populated in oraInventory/ContentsXML/inventory.xml, so an opatch install applies only to the local node. The workaround, editing inventory.xml, is documented in bug 2742686.
Bugs 2646914, 2675090, 2706220 and 2695783 - ORA-600 [KCCSBCK_FIRST] on Linux and W2K platforms after installing 22.214.171.124. Very important patch, missing from 126.96.36.199. ARU: 188.8.131.52 ARU 4110670 completed for LINUX Intel
Hangcheck-timer and Oracle Cluster Manager
Download Patch 2594820 from Metalink
#rpm -ivh <hangcheck-timer RPM name>
Detaching watchdogd from the Cluster Manager (Bug 2495915)
The removal of the watchdogd
CMDiskFile changes from optional to mandatory
CM quorum partition of cluster participation.
Hangcheck-timer and Oracle Cluster Manager
remove or comment out from the /etc/rc.local file:
Find the PID of the shell from which Oracle will be started, using ps (or run echo $$ in that shell as the oracle user)
Change /proc/$pid/mapped_base to 0x10000000 and restart Oracle
Metalink Note: 200266.1
Post Installation: Lowering of mapped base
[Diagram: process address space, top to bottom]
Default:
  0xFFFFFFFF
      Reserved for kernel
  0xC0000000
      Variable SGA, DB Buffers (SGA)
  0x50000000  sga_base (relink Oracle)
  0x40000000  mapped_base (/proc/<pid>/mapped_base)
      Code, etc.
  0x00000000
After Relink:
  0xFFFFFFFF
      Reserved for kernel
  0xC0000000
      Variable SGA, DB Buffers (SGA)
  0x15000000  sga_base (relink Oracle)
  0x10000000  mapped_base (/proc/<pid>/mapped_base)
      Code, etc.
  0x00000000
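The gain from lowering the two base addresses can be worked out from the figures in the diagram. The sketch below treats everything between sga_base and the kernel boundary at 0xC0000000 as available to the SGA. That is a simplifying assumption which ignores the stack and other mappings, so the results are upper bounds. The function name is hypothetical.

```python
KERNEL_BASE = 0xC0000000   # start of the region reserved for the kernel
DEFAULT_SGA_BASE = 0x50000000
RELINKED_SGA_BASE = 0x15000000
GIB = 1024 ** 3


def sga_window_gib(sga_base: int) -> float:
    """Address space between sga_base and the kernel boundary, in GiB."""
    return (KERNEL_BASE - sga_base) / GIB


print(f"default:      {sga_window_gib(DEFAULT_SGA_BASE):.2f} GiB")   # 1.75 GiB
print(f"after relink: {sga_window_gib(RELINKED_SGA_BASE):.2f} GiB")  # 2.67 GiB
```

The default figure agrees with the roughly 1.7 GB quoted earlier in the deck. Lowering sga_base to 0x15000000 adds a little under 1 GiB.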
Larger Buffer Cache does buffer cache increase with larger SGA
Create an in-memory file system on /dev/shm
mount -t shm shmfs -o size=8g /dev/shm
To enable the extended buffer cache feature, set the init.ora parameter
USE_INDIRECT_DATA_BUFFERS = true
Do not use dynamic cache parameters
Limitations apply to the extended buffer cache feature on Linux:
You cannot change the size of the buffer cache while the instance is running.
You cannot create or use tablespaces with non-standard block sizes.
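The restrictions above lend themselves to a quick sanity check of an init.ora file before starting the instance. The sketch below flags dynamic cache parameters when USE_INDIRECT_DATA_BUFFERS is true. Reading "dynamic cache parameters" as DB_CACHE_SIZE, DB_nK_CACHE_SIZE, DB_KEEP_CACHE_SIZE and DB_RECYCLE_CACHE_SIZE is an interpretation, and the parser and function names are hypothetical. It handles only simple `name = value` lines.

```python
import re
from typing import Dict, List

# Assumed meaning of "dynamic cache parameters" on the slide.
_DYNAMIC_CACHE = re.compile(r"^db_(\d+k_|keep_|recycle_)?cache_size$")


def parse_init_ora(text: str) -> Dict[str, str]:
    """Parse simple 'name = value' lines; '#' starts a comment."""
    params: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Drop an instance prefix such as "*." or "inst1." if present.
        params[name.strip().lower().split(".")[-1]] = value.strip()
    return params


def conflicting_cache_params(params: Dict[str, str]) -> List[str]:
    """Dynamic cache parameters set alongside the extended buffer cache."""
    if params.get("use_indirect_data_buffers", "false").lower() != "true":
        return []
    return sorted(name for name in params if _DYNAMIC_CACHE.match(name))


sample = """
USE_INDIRECT_DATA_BUFFERS = true
DB_BLOCK_BUFFERS = 200000
DB_CACHE_SIZE = 512M      # conflicts with the extended buffer cache
"""
print(conflicting_cache_params(parse_init_ora(sample)))  # ['db_cache_size']
```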