Rac questions



Published in: Technology, News & Politics

Q 1) How do we find the location of the OCR and voting disks?
Ans: The locations of the OCR and voting disks can be found as follows.

OCR location:
    [oracle@dbcl1n1 AUCS1 ~]$ cat /etc/oracle/ocr.loc
Output:
    ocrconfig_loc=/export/ocw/oracle/ocr1
    ocrmirrorconfig_loc=/export/ocw/oracle/ocr2
    local_only=FALSE

Voting disk location:
    [oracle@dbcl1n1 AUCS1 ~]$ cd $ORA_CRS_HOME/bin
    [oracle@dbcl1n1 AUCS1 bin]$ ./crsctl query css votedisk
    0.     0    /export/ocw/oracle/vote1
    1.     0    /export/ocw/oracle/vote2
    2.     0    /export/ocw/oracle/vote3

Q 3) What is Cache Fusion?
Ans: Cache Fusion is the shipping of blocks from one instance to another in a RAC environment. Because Cache Fusion eliminates the disk writes that would otherwise occur when other instances request blocks for modification, the performance overhead of managing shared data between instances is greatly reduced. Cache Fusion's concurrency controls not only improve performance, they also reduce the administrative effort for Real Application Clusters environments.
Cache Fusion addresses several types of concurrency:
  • Concurrent reads on multiple nodes
  • Concurrent reads and writes on different nodes
  • Concurrent writes on different nodes

Q 4) Which process is responsible for the Cache Fusion mechanism?
Ans: The Global Cache Service (GCS).
A read request from an instance for a block that was modified by another instance and not yet written to disk can be a request for either the current version of the block or a read-consistent version. In either case, the Global Cache Service Processes (LMSn) transfer the block from the holding instance's cache to the requesting instance's cache over the interconnect.

Q 5) If we perform DML activity in a 2-node RAC environment and that node disconnects for some reason, what will be the result?
Ans: The DML statement will execute successfully, since the database is on the shared device.
Note: Thanks to Junad for noticing the above answer and helping me clarify what actually happens: for DML failover to work, we need to write callout functions using Fast Application Notification (FAN) for RAC. The callouts can be written using, for example, OCI or Java.
http://download.oracle.com/docs/cd/B19306_01/rac.102/b14197/hafeats.htm
If you are using TAF (Transparent Application Failover) with RAC, then only session failover (ALTER SESSION settings are not carried over), select failover, and the pre-connect and basic failover methods are supported.
http://download.oracle.com/docs/cd/B28359_01/network.111/b28316/advcfg.htm#NETAG338
http://www.scribd.com/doc/19211546/Failover-for-Rac
In Oracle 11g Release 2:
http://www.oracle.com/technetwork/database/app-failover-oracle-database-11g-173323.pdf

Q 6) How do we back up the OCR and voting disk?
Ans:
VOTING DISK BACKUP:
To make a backup copy of the voting disk, use the Linux dd command. Perform this operation on every voting disk as needed, where voting_disk_name is the name of the active voting disk and backup_file_name is the name of the file to which you want to back up the voting disk contents:
    $ dd if=voting_disk_name of=backup_file_name
If your voting disk is stored on a raw device, use the device name in place of voting_disk_name. For example:
    $ dd if=/dev/sdd1 of=/tmp/voting.dmp
When you use the dd command to back up the voting disk, the backup can be performed while the Cluster Ready Services (CRS) process is active; you do not need to stop the crsd.bin process first.

OCR BACKUP:
Viewing available OCR backups
To find the most recent backup of the OCR, run the following command on any node in the cluster:
    $ ocrconfig -showbackup
Backing up the OCR
Because of the importance of OCR information, Oracle recommends that you use the ocrconfig tool to make copies of the automatically created backup files at least once a day.
In addition to using the automatically created OCR backup files, you should also export the OCR contents to a file before and after making significant configuration changes, such as adding or deleting nodes from your environment, modifying Oracle Clusterware resources, or creating a database. Exporting the OCR contents to a file lets you restore the OCR if your configuration changes cause errors. For example, if you have unresolvable configuration problems, or if you are unable to restart your cluster database after such changes, then you can restore your configuration by importing the saved OCR content from the valid configuration.
To export the contents of the OCR to a file, use the following command, where backup_file_name is the name of the OCR backup file you want to create:
    $ ocrconfig -export backup_file_name

Q 7) What are the Global Cache Service, Global Enqueue Service and Global Resource Directory?
Ans:
Global Cache Service
GCS is the main controlling process that implements Cache Fusion. GCS tracks the location and the status (mode and role) of the data blocks, as well as the access privileges of the various instances. GCS is the mechanism that guarantees data integrity by employing global access levels. GCS maintains the block modes for data blocks in the global role. It is also responsible for block transfers between the instances. Upon a request from an instance, GCS organizes the block shipping and the appropriate lock mode conversions. The Global Cache Service is implemented by various background processes, such as the Global Cache Service Processes (LMSn) and the Global Enqueue Service Daemon (LMD).

Global Enqueue Service
The Global Enqueue Service (GES) manages or tracks the status of all the Oracle enqueuing mechanisms. This covers all non-Cache-Fusion inter-instance operations. GES performs concurrency control on dictionary cache locks, library cache locks and transactions. It does this for resources that are accessed by more than one instance.

GES/GCS Areas
GES and GCS have memory structures associated with global resources. These structures are distributed across all instances in the cluster and are located in the variable or shared pool section of the SGA.

Global Resource Directory
The Global Resource Directory (GRD) is the internal database that records and stores the current status of the data blocks. Whenever a block is transferred out of a local cache to another instance's cache, the GRD is updated. The following resource information is available in the GRD:
* Data block addresses (DBA)
* Location of the most current version
* Modes of the data blocks: (N) Null, (S) Shared, (X) Exclusive
* Roles of the data blocks (local or global) held by each instance
* Buffer caches on multiple nodes in the cluster

What are the Oracle Clusterware processes for 10g on Unix and Linux?
Cluster Synchronization Services (ocssd): Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd): The crs process manages cluster resources (which could be a database, an instance, a service, a listener, a virtual IP (VIP) address, an application process, and so on) based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. This process runs as the root user.
Event manager daemon (evmd): A background process that publishes events that crs creates.
Process Monitor Daemon (OPROCD): This process monitors the cluster and provides I/O fencing. OPROCD performs its check, stops running, and if the wake-up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
RACG (racgmain, racgimon): Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.

What are the Oracle database background processes specific to RAC?
• LMS: Global Cache Service Process
• LMD: Global Enqueue Service Daemon
• LMON: Global Enqueue Service Monitor
• LCK0: Instance Enqueue Process
To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two services, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances.
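
The bookkeeping that the GRD is described as doing (which instance holds the most current version of a block, and in which mode each instance holds it) can be pictured with a small toy model. The following Python sketch is purely illustrative: the class names, the downgrade rule for exclusive requests, and the way the role flips to "global" on any transfer are simplifying assumptions made for this example, not Oracle's actual data structures or rules.

```python
from dataclasses import dataclass, field
from typing import Dict

# Block modes listed in the text: Null, Shared, Exclusive.
MODES = ("N", "S", "X")


@dataclass
class BlockEntry:
    """Toy directory entry for one data block address (DBA)."""
    current_holder: str                                   # instance with the most current version
    modes: Dict[str, str] = field(default_factory=dict)   # instance name -> mode
    role: str = "local"                                   # "local" or "global"


class ToyResourceDirectory:
    """Illustrative stand-in for the GRD; not Oracle's implementation."""

    def __init__(self) -> None:
        self.entries: Dict[int, BlockEntry] = {}

    def read_from_disk(self, dba: int, instance: str, mode: str = "S") -> None:
        """First access: only one instance caches the block, so the role is local."""
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.entries[dba] = BlockEntry(current_holder=instance, modes={instance: mode})

    def transfer(self, dba: int, to_instance: str, mode: str) -> str:
        """Record a cache-to-cache transfer and return the sending instance."""
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        entry = self.entries[dba]
        sender = entry.current_holder
        if mode == "X":
            # Toy rule (assumption): an exclusive request downgrades all
            # existing holders to Null and moves the current version.
            for inst in entry.modes:
                entry.modes[inst] = "N"
            entry.current_holder = to_instance
        entry.modes[to_instance] = mode
        entry.role = "global"  # simplification: any transfer makes the role global
        return sender


if __name__ == "__main__":
    grd = ToyResourceDirectory()
    grd.read_from_disk(100, "inst1", "X")
    grd.transfer(100, "inst2", "X")
    print(grd.entries[100])
```

The point of the sketch is only that every transfer updates the directory, so any instance can later be told where the most current version of a block lives.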

What are the Oracle Clusterware components?
Voting Disk: Oracle RAC uses the voting disk to manage cluster membership by way of a health check, and to arbitrate cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.
Oracle Cluster Registry (OCR): Maintains cluster configuration information as well as configuration information about any cluster database within the cluster. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster.

How do you troubleshoot a node reboot?
Please check Metalink:
Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions

How do you back up the OCR?
There is an automatic backup mechanism for the OCR. The default location is $ORA_CRS_HOME/cdata/clustername, where clustername is the name of your cluster.
To display backups:
    # ocrconfig -showbackup
To restore a backup:
    # ocrconfig -restore
With Oracle RAC 10g Release 2 or later, you can also use the export command:
    # ocrconfig -export backup_file_name -s online
and use the -import option to restore the contents.
With Oracle RAC 11g Release 1, you can do a manual backup of the OCR with the command:
    # ocrconfig -manualbackup

How do you back up the voting disk?
    # dd if=voting_disk_name of=backup_file_name

How do I identify the voting disk location?
    # crsctl query css votedisk

How do I identify the OCR file location?
Check /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc (depending on the platform), or run:
    # ocrcheck

Is ssh required for normal Oracle RAC operation?
ssh is not required for normal Oracle RAC operation. However, ssh should be enabled for Oracle RAC and patch set installation.

What is SCAN?
Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g Release 2 feature that provides a single name for clients to access an Oracle Database running in a cluster. The benefit is that clients using SCAN do not need to change if you add or remove nodes in the cluster.
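
The location checks and the dd backup described above can be strung together in a small script. The following is a minimal Python sketch, assuming the ocr.loc and `crsctl query css votedisk` text formats shown in the sample listings in Q 1. The function names, the `.dmp` suffix and the `/tmp` backup directory are hypothetical choices for illustration, and the dd commands are only built, never executed.

```python
import re
from pathlib import PurePosixPath
from typing import Dict, List

# Sample text copied from the listings in Q 1.
SAMPLE_OCR_LOC = """\
ocrconfig_loc=/export/ocw/oracle/ocr1
ocrmirrorconfig_loc=/export/ocw/oracle/ocr2
local_only=FALSE
"""

SAMPLE_VOTEDISK_OUTPUT = """\
 0.     0    /export/ocw/oracle/vote1
 1.     0    /export/ocw/oracle/vote2
 2.     0    /export/ocw/oracle/vote3
"""

# Matches lines such as " 0.     0    /path/to/votedisk".
_VOTEDISK_LINE = re.compile(r"^\s*\d+\.\s+\d+\s+(\S+)\s*$")


def parse_ocr_loc(text: str) -> Dict[str, str]:
    """Parse key=value lines from an ocr.loc file into a dictionary."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def parse_votedisk_paths(output: str) -> List[str]:
    """Return the voting disk paths found in crsctl-style output."""
    paths: List[str] = []
    for line in output.splitlines():
        match = _VOTEDISK_LINE.match(line)
        if match:
            paths.append(match.group(1))
    return paths


def build_dd_backup_commands(paths: List[str], backup_dir: str) -> List[List[str]]:
    """Build one `dd if=<disk> of=<backup>` argument list per voting disk."""
    commands: List[List[str]] = []
    for path in paths:
        target = PurePosixPath(backup_dir) / (PurePosixPath(path).name + ".dmp")
        commands.append(["dd", f"if={path}", f"of={target}"])
    return commands


if __name__ == "__main__":
    print(parse_ocr_loc(SAMPLE_OCR_LOC)["ocrconfig_loc"])
    disks = parse_votedisk_paths(SAMPLE_VOTEDISK_OUTPUT)
    for command in build_dd_backup_commands(disks, "/tmp"):
        print(" ".join(command))
```

Building the argument lists without running them keeps the sketch safe to try on a machine with no cluster; on a real node the lists could be handed to `subprocess.run` by a suitably privileged user.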

What is the purpose of the private interconnect?
Clusterware uses the private interconnect for cluster synchronization (network heartbeat) and daemon communication between the clustered nodes. This communication is based on the TCP protocol. RAC uses the interconnect for Cache Fusion (UDP) and inter-process communication (TCP). Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches of participating nodes in the cluster.

Why do we have a Virtual IP (VIP) in Oracle RAC?
Without VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 minutes) before getting an error. As a result, you don't really have a good HA solution without using VIPs.
When a node fails, the VIP associated with it is automatically failed over to another node, and the new node re-ARPs the world, indicating a new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which sends error RST packets back to the clients. This results in the clients getting errors immediately.

What do you do if you see GC CR BLOCK LOST in the top 5 Timed Events in an AWR report?
This is most likely due to a fault in the interconnect network. Check the output of netstat -s. If you see "fragments dropped" or "packet reassemblies failed", work with your system administrator to find the fault in the network.

How many nodes are supported in a RAC database?
Oracle 10g Release 2 supports 100 nodes in a cluster using Oracle Clusterware, and 100 instances in a RAC database.

Srvctl cannot start an instance and I get the errors PRKP-1001 and CRS-0215, but sqlplus can start it on both nodes. How do you identify the problem?
Set the environment variable SRVM_TRACE to true and start the instance with srvctl. You will now get a detailed error stack.

What is the purpose of the ONS daemon?
The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware as part of the nodeapps. There is one ONS daemon started per clustered node.
The ONS daemon receives a subset of published clusterware events via the local evmd and racgimon clusterware daemons, and forwards those events to application subscribers and to the local listeners. This facilitates:
a. the FAN (Fast Application Notification) feature, which allows applications to respond to database state changes.
b. the 10gR2 Load Balancing Advisory, the feature that permits load balancing across different RAC nodes depending on the load on those nodes. The RDBMS MMON process creates an advisory for distribution of work every 30 seconds and forwards it via racgimon and ONS to listeners and applications.

What is RAC?
RAC stands for Real Application Clusters. It is a clustering solution from Oracle Corporation that ensures high availability of databases by providing instance failover and media failover features.

What is RAC and how is it different from non-RAC databases?
RAC stands for Real Application Clusters. You have n instances, each running on its own node and all based on shared storage. The cluster is the key component: a collection of servers operating as one unit. RAC is the best solution for high performance and high availability. Non-RAC databases have a single point of failure in case of hardware failure or server crash.

Give the usage of srvctl:
    srvctl start instance -d db_name -i "inst_name_list" [-o start_options]
    srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
    srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
    srvctl start database -d name [-o start_options]
    srvctl stop database -d name [-o stop_options]
    srvctl start database -d orcl -o mount

Mention the Oracle RAC software components:
Oracle RAC is composed of two or more database instances. They are composed of memory structures and background processes, the same as a single-instance database. Oracle RAC instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable Cache Fusion. Oracle RAC instances are composed of the following background processes:
• ACMS: Atomic Controlfile to Memory Service
• GTX0-j: Global Transaction Process
• LMON: Global Enqueue Service Monitor
• LMD: Global Enqueue Service Daemon
• LMS: Global Cache Service Process
• LCK0: Instance Enqueue Process
• RMSn: Oracle RAC Management Processes
• RSMN: Remote Slave Monitor

What is GRD?
GRD stands for Global Resource Directory. The GES and GCS maintain records of the statuses of each datafile and each cached block using the Global Resource Directory. This record keeping underpins Cache Fusion and helps maintain data integrity.

What are the different network components in 10g RAC?
Public, private, and VIP components. Private interfaces are for inter-node communication. The VIP is all about application availability. When a node fails, its VIP fails over to another node; this is why all applications should connect through the VIPs, meaning that TNS entries should list the VIPs in the host list.

Give details on ACMS:
ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment, ACMS is an agent that ensures a distributed SGA memory update, i.e. SGA updates are globally committed on success or globally aborted in the event of a failure.
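
The srvctl usage lines listed above follow a regular pattern, so scripts often assemble them programmatically. The following Python sketch builds the argument lists for the start/stop instance and start/stop database forms shown in the text, using only the `-d`, `-i` and `-o` options that appear there. The function names are hypothetical, and nothing is executed.

```python
import shlex
from typing import List, Optional, Sequence


def srvctl_instance_command(action: str, db_name: str,
                            instances: Sequence[str],
                            options: Optional[str] = None) -> List[str]:
    """Build `srvctl start|stop instance -d <db> -i <list> [-o <options>]`."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    if not instances:
        raise ValueError("at least one instance name is required")
    command = ["srvctl", action, "instance", "-d", db_name, "-i", ",".join(instances)]
    if options:
        command += ["-o", options]
    return command


def srvctl_database_command(action: str, db_name: str,
                            options: Optional[str] = None) -> List[str]:
    """Build `srvctl start|stop database -d <db> [-o <options>]`."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    command = ["srvctl", action, "database", "-d", db_name]
    if options:
        command += ["-o", options]
    return command


if __name__ == "__main__":
    # Reproduces the two concrete examples from the usage list.
    print(shlex.join(srvctl_instance_command("stop", "orcl", ["orcl3", "orcl4"], "immediate")))
    print(shlex.join(srvctl_database_command("start", "orcl", "mount")))
```

Keeping the command as a list rather than a single string avoids shell quoting problems if it is later passed to `subprocess.run`.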

What is Cache Fusion?
Cache Fusion is the mechanism that transfers a data block from the memory of one node to the memory of another. If two nodes require the same block for query or update, the block must be transferred from the cache of one node to the other. A RAC system must be equipped with a low-latency, high-speed interconnect to make this happen.

Give details on Cache Fusion:
Oracle RAC is composed of two or more instances. When a block of data is read from a datafile by an instance within the cluster and another instance needs the same block, it is easier to get the block image from the instance that has the block in its SGA than to read it from disk. To enable inter-instance communication, Oracle RAC makes use of interconnects. The Global Cache Service (GCS), implemented by the LMSn processes, manages Cache Fusion.
Cache Fusion is essentially a memory-to-memory transfer of data between the nodes in the RAC environment. Before Cache Fusion, a node was required to write some of the data to disk before it could be transferred to the next node in the cluster. Cache Fusion does a straight memory-to-memory transfer. In addition, each node's SGA has a map of what data is contained in the other nodes' data caches.
The performance improvement is phenomenal. Oracle leverages the vendor's high-speed interconnects between the nodes to achieve the cache-to-cache data transfers. Before Cache Fusion, adding a node to the cluster to increase application performance didn't always provide the improvement you hoped for. With Cache Fusion, you can easily cost-justify the addition of another node to a RAC cluster to increase the performance of the application running on it. Oracle sales pitches describe it as near-linear horizontal scalability.

What are the major RAC wait events?
In a RAC environment the buffer cache is global across all instances in the cluster, and hence the processing differs. The most common wait events related to this are gc cr request and gc buffer busy.
GC CR request: the time it takes to retrieve the data from the remote cache.
Reason: RAC traffic using a slow connection, or inefficient queries (poorly tuned queries increase the number of data blocks requested by an Oracle session; the more blocks requested, the more often a block will need to be read from a remote instance via the interconnect).
GC BUFFER BUSY: the time the remote instance locally spends accessing the requested data block.

Give details on GTX0-j:
This process provides transparent support for XA global transactions in a RAC environment. The database autotunes the number of these processes based on the workload of XA global transactions.

Give details on LMON:
This process is called the Global Enqueue Service Monitor. It monitors global enqueues and resources across the cluster and performs global enqueue recovery operations.

Give details on LMD:
This process is called the Global Enqueue Service Daemon. It manages incoming remote resource requests within each instance.

Give details on LMS:
This process is called the Global Cache Service Process. It maintains the statuses of datafiles and of each cached block by recording information in the Global Resource Directory (GRD). It also controls the flow of messages to remote instances, manages global data block access, and transmits block images between the buffer caches of different instances. This processing is part of the Cache Fusion feature.
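
The read path that Cache Fusion is described as optimizing (serve the block from another instance's cache when possible, and go to disk only when no instance holds it) can be shown with a small simulation. The Python sketch below is a toy: the `ToyCluster` class and its counters are invented for illustration, it ignores block modes, locking and read consistency entirely, and it is not how Oracle implements the transfer.

```python
from typing import Dict, Iterable, Optional


class ToyCluster:
    """Counts disk reads versus cache-to-cache transfers; purely illustrative."""

    def __init__(self, instance_names: Iterable[str], disk: Dict[int, str]) -> None:
        self.disk = disk                                   # block id -> block contents
        self.caches: Dict[str, Dict[int, str]] = {name: {} for name in instance_names}
        self.disk_reads = 0
        self.interconnect_transfers = 0

    def _find_holder(self, block_id: int, requester: str) -> Optional[str]:
        """Return another instance that already caches the block, if any."""
        for name, cache in self.caches.items():
            if name != requester and block_id in cache:
                return name
        return None

    def read_block(self, requester: str, block_id: int) -> str:
        cache = self.caches[requester]
        if block_id in cache:
            return cache[block_id]                         # local cache hit
        holder = self._find_holder(block_id, requester)
        if holder is not None:
            # Memory-to-memory transfer over the (simulated) interconnect.
            cache[block_id] = self.caches[holder][block_id]
            self.interconnect_transfers += 1
        else:
            cache[block_id] = self.disk[block_id]          # nobody has it: read from disk
            self.disk_reads += 1
        return cache[block_id]


if __name__ == "__main__":
    cluster = ToyCluster(["inst1", "inst2"], {7: "row data"})
    cluster.read_block("inst1", 7)   # first access goes to disk
    cluster.read_block("inst2", 7)   # second instance gets it from inst1's cache
    print(cluster.disk_reads, cluster.interconnect_transfers)
```

In the toy, the second instance never touches the disk, which is the effect the text attributes to Cache Fusion; how much this helps in practice depends on interconnect latency.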

Give details on LCK0:
This process is called the Instance Enqueue Process. It manages non-Cache-Fusion resource requests such as library and row cache requests.

Give details on RMSn:
These processes are called the Oracle RAC Management Processes. They perform manageability tasks for Oracle RAC. Tasks include creation of resources related to Oracle RAC when new instances are added to the cluster.

Give details on RSMN:
This process is called the Remote Slave Monitor. It manages background slave process creation and communication on remote instances. It is a background slave process that performs tasks on behalf of a coordinating process running in another instance.

What components in RAC must reside in shared storage?
All datafiles, control files, SPFILEs and redo log files must reside on cluster-aware shared storage.

What is the significance of using cluster-aware shared storage in an Oracle RAC environment?
All instances of an Oracle RAC database can access all the datafiles, control files, SPFILEs and redo log files when these files are hosted on cluster-aware shared storage, which is a group of shared disks.

Give a few examples of solutions that support cluster storage:
ASM (Automatic Storage Management), raw disk devices, network file system (NFS), OCFS2 and OCFS (Oracle Cluster File Systems).

What is an interconnect network?
An interconnect network is a private network that connects all of the servers in a cluster. The interconnect network uses a switch (or multiple switches) that only the nodes in the cluster can access.

How can we configure the cluster interconnect?
Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster interconnect. On Unix and Linux systems, UDP and RDS (Reliable Datagram Sockets) are the protocols used by Oracle Clusterware. Windows clusters use the TCP protocol.

Can we use crossover cables with Oracle Clusterware interconnects?
No, crossover cables are not supported with Oracle Clusterware interconnects.

What is the use of the cluster interconnect?
The cluster interconnect is used by Cache Fusion for inter-instance communication.

How do users connect to the database in an Oracle RAC environment?
Users can access a RAC database using a client/server configuration or through one or more middle tiers, with or without connection pooling. Users can use the Oracle services feature to connect to the database.

What is the use of a service in an Oracle RAC environment?
Applications should use the services feature to connect to the Oracle database. Services enable us to define rules and characteristics to control how users and applications connect to database instances.

What are the characteristics controlled by the Oracle services feature?
The characteristics include a unique name, workload balancing and failover options, and high availability characteristics.

What enables the load balancing of applications in RAC?
Oracle Net Services enable the load balancing of application connections across all of the instances in an Oracle RAC database.

What is a virtual IP address or VIP?
A virtual IP address, or VIP, is an alternate IP address that client connections use instead of the standard public IP address. To configure VIP addresses, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.

What is the use of a VIP?
If a node fails, then the node's VIP address fails over to another node, on which the VIP address can accept TCP connections but cannot accept Oracle connections.

Give situations under which VIP address failover happens:
VIP address failover happens when the node on which the VIP address runs fails, when all interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from the network.

What is the significance of VIP address failover?
When a VIP address failover happens, clients that attempt to connect to the VIP address receive a rapid connection refused error. They don't have to wait for TCP connection timeout messages.

What are the administrative tools used for Oracle RAC environments?
An Oracle RAC cluster can be administered as a single image using OEM (Enterprise Manager), SQL*Plus, Server Control (SRVCTL), Cluster Verification Utility (CVU), DBCA and NETCA.

How do we verify that RAC instances are running?
Issue the following query from any one node, connecting through SQL*Plus:
    SQL> connect sys/sys as sysdba
    SQL> select * from V$ACTIVE_INSTANCES;
The query gives the instance number in the INST_NUMBER column and host:instance_name in the INST_NAME column.

What is FAN?
FAN stands for Fast Application Notification and relates to events concerning instances, services and nodes. It is a notification mechanism that Oracle RAC uses to notify other processes about configuration and service level information, including service status changes such as UP or DOWN events. Applications can respond to FAN events and take immediate action.

Where can we apply FAN UP and DOWN events?
FAN UP and FAN DOWN events can be applied to instances, services and nodes.

State the use of FAN events in case of a cluster configuration change:
During cluster configuration changes, the Oracle RAC high availability framework publishes a FAN event immediately when a state change occurs in the cluster, so applications can receive FAN events and react immediately. This saves applications from having to poll the database and detect a problem only after such a state change.

Why should we have separate homes for the ASM instance?
It is good practice to keep the ASM home separate from the database home (ORACLE_HOME). This helps in upgrading and
patching ASM and the Oracle database software independently of each other. Also, we can deinstall the Oracle database software independently of the ASM instance.

What is the advantage of using ASM?
ASM is the Oracle-recommended storage option for RAC databases, as ASM maximizes performance by managing the storage configuration across the disks. ASM does this by distributing the database files across all of the available storage within our cluster database environment.

What is a rolling upgrade?
It is a new ASM feature in Database 11g. ASM instances in Oracle Database 11g (from 11.1) can be upgraded or patched using the rolling upgrade feature. This enables us to patch or upgrade ASM nodes in a clustered environment without affecting database availability. During a rolling upgrade we can maintain a functional cluster while one or more of the nodes in the cluster are running different software versions.

Can rolling upgrade be used to upgrade from a 10g to an 11g database?
No, it can be used only for Oracle Database 11g releases (from 11.1).

State the initialization parameters that must have the same value for every instance in an Oracle RAC database:
Some initialization parameters are critical at database creation time and must have the same values. Their values must be specified in the SPFILE or PFILE for every instance. The parameters that must be identical on every instance are:
• ACTIVE_INSTANCE_COUNT
• ARCHIVE_LAG_TARGET
• COMPATIBLE
• CLUSTER_DATABASE
• CLUSTER_DATABASE_INSTANCES
• CONTROL_FILES
• DB_BLOCK_SIZE
• DB_DOMAIN
• DB_FILES
• DB_NAME
• DB_RECOVERY_FILE_DEST
• DB_RECOVERY_FILE_DEST_SIZE
• DB_UNIQUE_NAME
• INSTANCE_TYPE (RDBMS or ASM)
• PARALLEL_MAX_SERVERS
• REMOTE_LOGIN_PASSWORDFILE
• UNDO_MANAGEMENT

What is ORA-00603: ORACLE server session terminated by fatal error, or ORA-29702: error occurred in Cluster Group Service operation?
The RAC node name was listed in the loopback address...

Can DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all instances?
These parameters must be identical on all instances only if their values are set to zero.

What two parameters must be set at the time of starting up an ASM instance in a RAC environment?
The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.

Mention the components of Oracle Clusterware:
Oracle Clusterware is made up of components like the voting disk and the Oracle Cluster Registry (OCR).

What is a CRS resource?
Oracle Clusterware is used to manage high-availability operations in a cluster. Anything that Oracle Clusterware
manages is known as a CRS resource. Some examples of CRS resources are a database, an instance, a service, a listener, a VIP address, an application process, etc.

What is the use of the OCR?
The OCR (Oracle Cluster Registry) stores the configuration information of CRS resources, which Oracle Clusterware uses to manage them.

How does Oracle Clusterware manage CRS resources?
Oracle Clusterware manages CRS resources based on the configuration information of CRS resources stored in the OCR (Oracle Cluster Registry).

Name some Oracle Clusterware tools and their uses:
• OIFCFG: allocating and deallocating network interfaces
• OCRCONFIG: command-line tool for managing the Oracle Cluster Registry
• OCRDUMP: dumps the contents of the OCR, for example to identify the interconnect being used
• CVU: Cluster Verification Utility, used to check the status of cluster components and CRS resources

What are the modes of deleting instances from Oracle Real Application Clusters databases?
We can delete instances using silent mode or interactive mode with DBCA (Database Configuration Assistant).

How do we remove ASM from an Oracle RAC environment?
We need to stop and delete the instance on the node first, in interactive or silent mode. After that, ASM can be removed using the srvctl tool as follows:
    srvctl stop asm -n node_name
    srvctl remove asm -n node_name
We can verify that ASM has been removed by issuing the following command:
    srvctl config asm -n node_name

How do we verify that an instance has been removed from the OCR after deleting an instance?
Issue the following srvctl command:
    srvctl config database -d database_name
Then check the resource status:
    cd CRS_HOME/bin
    ./crs_stat

How do we verify an existing current backup of the OCR?
We can verify the current backup of the OCR using the following command:
    ocrconfig -showbackup

What are the performance views in an Oracle RAC environment?
We have V$ views that are instance-specific. In addition we have GV$ views, called global views, that have an INST_ID column of numeric data type. GV$ views obtain information from the individual V$ views.

What are the types of connection load balancing?
There are two types of connection load balancing: server-side load balancing and client-side load balancing.

What is the difference between server-side and client-side connection load balancing?
Client-side load balancing happens on the client, which distributes connection requests across the listeners. With server-side load balancing, the listener uses a load-balancing advisory to redirect connections to the instance providing the best service.

What are the three greatest benefits that RAC provides?
The three main benefits are availability, scalability, and the ability to use low-cost commodity hardware. RAC allows an application to scale vertically, by adding CPU, disk and memory resources to an individual server. But RAC also provides horizontal scalability, which is achieved by adding new nodes to the cluster. RAC also allows an organization
to bring these resources online as they are needed. This can save a small or midsize organization a lot of money in the early stages of a project.
In a RAC environment, if a node in the cluster fails, the application continues to run on the surviving nodes in the cluster. If your application is configured correctly, most users won't even know that the node they were running on became unavailable.

What are the different network components in Oracle 10g RAC?
We have public, private, and VIP components. Private interfaces are for inter-node communication. The VIP is all about application availability. When a node fails, its VIP fails over to another node; this is why all applications should be based on VIP components. This means that TNS entries should have the VIP entries in the host list.
  13. 13. Oracle RAC load balancing and failoverLOAD BALANCING: The Oracle RAC system can distribute the load over many nodes this feature called as load balancing.There are two methods of load balancing1.Client load balancing2.Server load balancingClient Load Balancing distributes new connections among Oracle RAC nodes so that no one server is overloaded withconnection requests and it is configured at net service name level by providing multiple descriptions in a description listor multiple addresses in an address list. For example, if connection fails over to another node in case of failure, the clientload balancing ensures that the redirected connections are distributed among the other nodes in the RAC.Configure Client-side connect-time load balancing by setting LOAD_BALANCE=ON in the corresponding client side TNSentry.TESTRAC =(DESCRIPTION =(ADDRESS_LIST=(LOAD_BALANCE = ON)(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521)))(CONNECT_DATA = (SERVICE_NAME = testdb.oracleracexpert.com)))Server Load Balancing distributes processing workload among Oracle RAC nodes. It divides the connection load evenlybetween all available listeners and distributes new user session connection requests to the least loaded listener(s) basedon the total number of sessions which are already connected. 
Each listener communicates with the other listener(s) via each database instance's PMON process.
Configure server-side connect-time load balancing by setting the REMOTE_LISTENER initialization parameter of each instance to a TNS name that describes the list of all available listeners.

TESTRAC_LISTENERS =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1)(PORT = 1521)))
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2)(PORT = 1521)))
  )

Set the *.remote_listener='TESTRAC_LISTENERS' initialization parameter in the database's shared SPFILE and add the TESTRAC_LISTENERS entry to the TNSNAMES.ORA file in the Oracle Home of each node in the cluster.
Once you configure server-side connect-time load balancing, each database's PMON process automatically registers the database with the database's local listener and cross-registers it with the listeners on all other nodes in the cluster. The nodes themselves can now decide which node is least busy and connect the client to that node.
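The two load balancing methods described above can be contrasted with a small simulation. This is only a sketch under stated assumptions, not how Oracle Net or the listeners are implemented: the function names are hypothetical, client-side balancing is modelled as a random pick from the address list, and server-side balancing is modelled purely on the connected-session counts mentioned in the text.

```python
import random
from typing import Dict, List, Optional


def pick_address_client_side(addresses: List[str],
                             rng: Optional[random.Random] = None) -> str:
    """Client-side model: choose one address at random from the address list."""
    rng = rng or random.Random()
    return rng.choice(addresses)


def pick_listener_server_side(session_counts: Dict[str, int]) -> str:
    """Server-side model: choose the node with the fewest connected sessions.

    Ties are broken by node name so the result is deterministic.
    """
    return min(sorted(session_counts), key=lambda node: session_counts[node])


if __name__ == "__main__":
    vips = ["TESTRAC1-VIP", "TESTRAC2-VIP"]
    print("client-side pick:", pick_address_client_side(vips))

    # Hypothetical session counts, as the instances might report them.
    counts = {"TESTRAC1": 42, "TESTRAC2": 17}
    print("server-side pick:", pick_listener_server_side(counts))
```

The client-side choice knows nothing about actual load, which is why the two methods are typically configured together.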
  14. 14. FAILOVER:
The Oracle RAC system can protect against failures caused by O/S crashes, server crashes, or hardware failures. When a node failure occurs in a RAC system, connection attempts can fail over to other surviving nodes in the cluster; this feature is called failover.
There are two methods of failover:
1. Connection Failover
2. Transparent Application Failover (TAF)

Connection Failover - If a connection failure occurs at connect time, the application fails the connection over to another active node in the cluster. This feature enables a client to connect to another listener if the initial connection to the first listener fails.
Enable client-side connect-time failover by setting FAILOVER=ON in the corresponding client-side TNS entry.

TESTRAC =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = ON)
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = testdb.oracleracexpert.com))
  )

If LOAD_BALANCE is set to ON, clients randomly attempt connections to any of the nodes. If a client attempts to connect to a node that is down, it must wait until it learns that the node is not accessible before trying an alternate address in the ADDRESS_LIST.

Transparent Application Failover (TAF) - If a connection failure occurs after a connection is established, the connection fails over to one of the surviving nodes. Any uncommitted transactions are rolled back, and server-side program variables and session properties are lost.
In some cases the SELECT statements are automatically re-executed on the new connection, with the cursor positioned on the row on which it was positioned prior to the failover.

TESTRAC =
  (DESCRIPTION =
    (LOAD_BALANCE = ON)
    (FAILOVER = ON)
    (ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = testdb.oracleracexpert.com)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))
  15. 15. ))

Daily DBA Checklist
• Ensure that the previous night's backup is complete and there are no RMAN errors in the backup logs.
• Ensure that any exports which are part of the backup are complete and the dump files are compressed.
• Check the alert log for any ORA- errors, and also for messages like "Checkpoint not complete".
• Ensure that the cron job for truncating, saving and renaming alert logs is working, and verify its output.
• Ensure that the archived redo log files are compressed and that older ones have been deleted. Only files for the current and previous day should be present.
• All tablespaces should be less than 95% full. Run the coalesce command on all tablespaces to reduce fragmentation. Ensure that space in the TEMP tablespace is released and is 100% free at the beginning of the day.
• Enough contiguous free space is available in all tablespaces for objects to extend if required.
• Back up the control file to trace so that every day we have an outline of the files and their locations for each database.
• No objects are within 5 extents of the MAXEXTENTS storage parameter.
• All core dumps are deleted from the $CDUMP area.
• All *.trc files are deleted from the $UDUMP area.
• Check the machine for any disks that are 100% full or nearing that value. If a disk has filled up, use the find command to determine which files have been recently created or modified. Ensure that all *.dmp files are in their proper locations and that large *.dmp files have been compressed.
• Truncate the listener.log file in the $ORACLE_HOME/network/log location if the listener log has grown larger than 500 MB. Ensure the space is released; otherwise reload the listener.
• Run the recently created/modified objects report to ensure that no unauthorised object creation or modification is taking place.
  16. 16. • Ensure that there are no DBMS_JOBS with a status of failed or broken. Also, the last refresh times of all running jobs should be current.
• Check to ensure that no objects exist in the database with the status INVALID.
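Two of the checklist items, scanning the alert log and watching the 95% tablespace limit, lend themselves to a small script. The following is a minimal sketch only: the function names and sample data are hypothetical, the alert log is assumed to be available as plain text lines, and the tablespace usage figures are assumed to have been collected separately.

```python
from typing import Dict, Iterable, List

# Strings the checklist asks us to look for in the alert log.
ALERT_PATTERNS = ("ORA-", "Checkpoint not complete")

# Checklist threshold: tablespaces should be less than 95% full.
TABLESPACE_LIMIT_PCT = 95.0


def alert_log_findings(lines: Iterable[str]) -> List[str]:
    """Return alert log lines that contain any of the watched patterns."""
    return [line.rstrip("\n") for line in lines
            if any(pattern in line for pattern in ALERT_PATTERNS)]


def tablespaces_over_limit(pct_used: Dict[str, float],
                           limit: float = TABLESPACE_LIMIT_PCT) -> List[str]:
    """Return the names of tablespaces at or above the limit, sorted by name."""
    return sorted(name for name, pct in pct_used.items() if pct >= limit)


if __name__ == "__main__":
    # Illustrative sample data, not taken from a real system.
    sample_log = [
        "Thread 1 advanced to log sequence 101\n",
        "ORA-01555: snapshot too old\n",
        "Checkpoint not complete\n",
    ]
    for finding in alert_log_findings(sample_log):
        print("alert log:", finding)

    usage = {"USERS": 96.5, "SYSTEM": 70.0, "DATA": 95.0}
    print("tablespaces at or above limit:", tablespaces_over_limit(usage))
```

In practice the log lines would come from the instance's alert log file and the usage figures from a query against the data dictionary; both sources are left out here to keep the sketch self-contained.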