Collaborate 2012 - RMAN eliminate the mystery

White paper for the presentation given at Collaborate 2012 - http://coll12.mapyourshow.com/5_0/sessions/sessiondetails.cfm?ScheduledSessionID=10ABCF

Transcript

  • 1. High Availability Boot Camp: RMAN - Eliminate the Mystery

RMAN - ELIMINATE THE MYSTERY
Nelson Calero, UYOUG
Session #492

RMAN is a must-have tool familiar to most Oracle DBAs. There are many capabilities hidden in RMAN, and going beyond the basics is important. After a quick review of the core functionality, we will discuss both old and new features and provide tips and tricks along with sample code using the Command Line Interface (CLI).

INTRODUCTION
One of the most important tasks of the DBA is to ensure that production data is safe, even though incidents can occur that compromise hardware, software or physical facilities. Within our Disaster Recovery and High Availability plans, backups are one of the most important components.

RMAN functionality is implemented in the Oracle kernel, with many features for backup and recovery tasks, available through a PL/SQL API. The RMAN client is a command-line utility written in Pro*C that executes the PL/SQL API; it is included in the installation of the Oracle database, in the $ORACLE_HOME/bin directory. Some functionality is also accessible through Enterprise Manager, and it is possible to create your own scripts calling the PL/SQL API.

The most important tasks we can perform with RMAN are:
  • backup
  • recovery
  • failure diagnosis (from 11.1, Data Recovery Advisor functionality)
  • instance duplication
  • backup history

RMAN provides several features beyond the obvious backup and recovery tasks, from validating the logical and physical integrity of the backups to sophisticated tasks performed in a single command, such as the duplication of an instance. It is also a mature product: many bugs have been resolved (over 170 across all versions, ref: support.oracle.com), with few known issues remaining in the current version (MOS Note 247611.1).

Additionally, each new version of the database incorporates new functionality or improves existing features. For example, 11 improvements are included in version 11.2 and 21 in version 11.1 (documented in the "new features" section of each manual).

Although RMAN has been available since version 8 of the database (CLI mode), its adoption by users has been low compared to the alternative of developing scripts that perform backups manually.

This article will show the advantages of using RMAN, with examples to motivate those who do not use it yet, and help those who already use it to leverage its capabilities.
BACKUP POLICIES?
RMAN is a tool that can be used for specific actions or to implement the tasks of a backup policy. These policies are what ensure that the sensitive information of our organization is safe, and therefore they include processes and procedures that go beyond the use of a particular tool.

For example, they cover the transfer of backups to external sites, the rotation frequency of the media used to store the backups, the use of encryption so that data is not readable by third parties if media devices get lost, the use of additional database features to keep redundant data for use in case of failure, etc.

The DataPump utility, functionality such as Flashback, and Oracle Secure Backup are other ingredients to include in a backup policy, which we will not cover here.

ORACLE BACKUPS
The database has several types of files: data (datafiles), control (controlfiles), temporary (tempfiles), undo and redo logs. To maintain consistency, it internally uses a number that identifies transactions, called the System Change Number (SCN) (http://docs.oracle.com/cd/E11882_01/server.112/e25789/transact.htm#CHDIAFFH).

This number is stored in several of these files. When the database is opened, these values are analyzed and action is taken to ensure that the data is consistent up to the last committed transaction, which corresponds to an SCN. (For more details on this operation, see note 1376995.1, Information on the System Change Number (SCN) and how it is Used in the Oracle Database.)

To take a backup while the database is down, the procedure is simple: just copy all the files that compose the database to the new destination. Opening this backup later gives the same result as opening the original database right after the backup: if the original was consistent, the copy is consistent. If it needs to be recovered, we must make sure that all the necessary logs are included in the copy at the destination site.

This procedure is known as cold backup. It is rarely an option in production, because it implies downtime while it runs, and that may be difficult to arrange.

When the database is running, any activity of internal (background) processes and user (server) processes can change data files. Copying those files from the operating system (using cp / copy) can then generate copies that are not identical to the originals, since a database block can be written (by the DBWR process) while the cp command is reading it, generating the problem known as fractured block (http://docs.oracle.com/cd/E11882_01/backup.112/e10642/glossary.htm#CHDBFCBF). This occurs because OS write operations are not necessarily the same size as the blocks used by the database (typical disk blocks are 512 bytes vs. database blocks of 8 KB), so each Oracle data block may take several non-atomic input/output operations.

With ASM this copy is also possible, and it can run into the same problem, though the method depends on the version: from 11.1 the cp command exists within the ASMCMD utility.

To avoid this problem while copying, we must tell the database that we are doing this operation through the command ALTER TABLESPACE <name> BEGIN BACKUP, and it will take precautions: it freezes the SCN in the datafile headers, and it writes the full data block to the redo log the first time a block is modified. Archivelog mode must be enabled on the database to use this functionality.

It is necessary (and very important) to run another command to indicate the end of the copy (ALTER TABLESPACE <name> END BACKUP), so that the database returns to normal operation.

With these actions, when the restored backup files are used to open the database, the recovery process identifies that the datafiles need recovery from the moment the backup started (as the SCN in the headers is lower than the last one known to the controlfile), so the recovery applies the generated redo, which includes the complete blocks, correcting any possible inconsistencies in the data files. For more details, see note 1050932.6, Why Are Datafiles Being Written To DURING Hot Backup?

This form of backup is known as hot backup, or inconsistent backup. It requires an additional recovery step when used, so the archived redo logs generated until the end of the backup are needed as well.

The following is a summary of the commands used in the two manual backup strategies (also known as user-managed):

consistent (cold):
    cp /u02/oradata/* dest-bkp

inconsistent (hot):
    alter tablespace nnn begin backup;
    cp /u02/oradata/datafile-nnn.dbf dest-bkp
    alter tablespace nnn end backup;
    cp /archivelogs-path/* dest-bkp

RMAN BACKUPS
RMAN is invoked from the command line and runs as a console waiting for commands. The database on which backup and recovery are performed is called the "target". It is necessary to connect before executing commands, either with the "target" command-line parameter or with the "connect" command. The connection can use a database name configured in SQL*Net, or "/" to connect to the local database. The user must have the SYSDBA privilege, so SYS is usually used.

RMAN stores metadata about the operations performed, along with its configuration variables, in its own repository: the controlfile of the target database or a separate database called the recovery catalog.

To take a backup we use the BACKUP command. It can make identical physical copies (image copies) or BACKUP SETS, a proprietary format, somewhat like TAR, that stores all the backed up data in one or more files called BACKUP PIECES. The latter is preferred because it creates smaller files, as explained later.

Some files of the database are not required for recovery and can be recreated without loss of information, so they are not copied. Other files that are not part of the data stored in the database are not copied either.

In short, these are not copied:
  • temporary files (tempfiles) and online redo logs
  • files from external tables or bfiles
  • database binary files
  • database network configuration files
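As a sketch of the two BACKUP output formats described above (image copies and backup sets), the following shell helper prints the corresponding RMAN command for one tablespace. It is not part of the original paper: the function name rman_backup_tablespace is hypothetical, and the helper only prints the command so it can be reviewed before being sent to the RMAN client.

```shell
# Print the RMAN command that backs up one tablespace in the requested format.
# Hypothetical helper: it only builds the command text, it does not run RMAN.
rman_backup_tablespace() {
    ts=$1
    kind=$2
    case $kind in
        copy)      echo "backup as copy tablespace $ts;" ;;
        backupset) echo "backup as backupset tablespace $ts;" ;;
        *)         echo "unknown backup type: $kind" >&2
                   return 1 ;;
    esac
}
```

On a server with a local database, the output could be piped to the client, for example: rman_backup_tablespace users backupset | rman target /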
The copied files are:
  • datafiles
  • controlfiles
  • archived logs
  • spfile

Using the prior examples for manual backups, with RMAN they are as follows:

consistent (cold):
    rman target /
    startup mount;
    backup database plus archivelog;

inconsistent (hot):
    rman target /
    backup database plus archivelog;

The difference between the scripts used for the cold and hot cases is that in the first one the database must be at least in mount mode to run the backup command, and in the second it must be in archivelog mode.

Do archivelogs need to be included in a cold backup? It depends. If the database was shut down normally (immediate), no recovery is needed and the archivelogs are not necessary. If the database was closed with a "shutdown abort" command or the server was turned off, they are required for the recovery performed when we attempt to open it. This is only possible if the database is in archivelog mode, because RMAN does not allow taking a backup of a closed database in an inconsistent state if it is in NOARCHIVELOG mode (http://docs.oracle.com/cd/E11882_01/backup.112/e10643/rcmsynta007.htm#CHDBIAHI).

RMAN does not have the problem of fractured blocks while copying, because it knows the format of the blocks and validates their contents when copying, rereading those detected as inconsistent. Therefore the database does not need to store additional redo while the backup is taken.

CONCEPTS
Now that we know how to make a backup, we must know the concepts used by RMAN:
• device: destination of the backups. It can be disk or tape (SBT).
• channel: a connection to the database used to read data and write it to the destination device. The default is one channel. Several can be used to achieve parallelism.
• image copy: an identical copy of a single data file, archived redo log file, or control file.
• backup set: an Oracle proprietary backup format, which stores all the backed up data in a TAR-like format with an internal table of contents; it can span several files and packs multiple source files into one archive. It is identified by a unique number, a date and a label (TAG). It is the only format allowed on tape (SBT).
• backup piece: each of the files that make up a backup set.
• backup status: indicates whether the backup is available for use (AVAILABLE) or the file no longer exists at the destination (EXPIRED). The latter is detected by manually running the CROSSCHECK command; it is not done automatically.
• obsolete backups: those not needed to meet our retention policy. They are not deleted automatically. The command REPORT OBSOLETE shows which files they are, and DELETE OBSOLETE deletes them from the destination.
• incarnation: a number associated with the database, which changes every time it is opened with RESETLOGS. It is important for restore procedures, because backups are useful only in the same line of incarnations as the current database.

CONFIGURATION
The BACKUP command can change its behavior through configuration variables or parameters that implement extra functionality. These variables define the behavior of some commands, or default values used when not explicitly stated.

To view these settings, use the command "show all". They belong to the target database to which we are connected, and they are stored in the RMAN metadata.

    oracle@oraculo:~> rman target /
    Recovery Manager: Release 11.2.0.2.0 - Production on Thu Feb 9 16:16:00 2012
    Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
    connected to target database: XE (DBID=2642191927)
    RMAN> show all;
    using target database control file instead of recovery catalog
    RMAN configuration parameters for database with db_unique_name XE are:
    CONFIGURE RETENTION POLICY TO REDUNDANCY 2;
    CONFIGURE BACKUP OPTIMIZATION ON;
    CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
    CONFIGURE CONTROLFILE AUTOBACKUP OFF;
    CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
    CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
    CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
    CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
    CONFIGURE MAXSETSIZE TO UNLIMITED; # default
    CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
    CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
    CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
    CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
    CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/xe/dbs/snapcf_XE.f'; # default

The variables that we must set to implement our backup policy are:
• Device: will we use disk or tape (SBT)? The default installation uses disk. To use tape you need to install and configure a library provided by the tape vendor, called the media management library (MML).
• Parallelism: use multiple channels simultaneously for backup or restore of the database. It is defined per DEVICE. Available in Enterprise Edition.
• Retention policy: for how long do we want our backups available on disk? This is controlled by setting the RETENTION POLICY, which can be defined by RECOVERY WINDOW or REDUNDANCY. The first is the number of days during which backups must be kept, regardless of how many there are. The second is the number of backup copies to keep.
• Backup optimization: do not copy files when an identical copy already exists at the destination. It is useful to shorten operations that failed due to lack of space, since the good copies can be reused when retried (e.g. duplication).
• Encryption: transparently protects generated backups against third parties. It requires the Advanced Security option for disk devices, or Oracle Secure Backup for tape devices.

Other important settings are outside RMAN.

One is related to the use of the Fast Recovery Area (FRA). It is controlled by parameters in the target database: DB_RECOVERY_FILE_DEST_SIZE and DB_RECOVERY_FILE_DEST. With the first one we set a logical maximum on the disk space used by files inside the FRA. This is important because archivelogs are automatically deleted when obsoleted by RMAN, so we only need to take care to set its initial value properly: it must be big enough to hold all the files needed by our retention policy.

The other important parameter to consider when not using a recovery catalog is CONTROL_FILE_RECORD_KEEP_TIME. It defines the minimum number of days before control file records can be reused. Its default value is 7. It must be set consistently with our backup policy, to avoid losing backup records.

SEEING ITS CONTENTS
To see the generated files and their contents, use the "list backup" command. In this example, as we are not using a recovery catalog, this information is stored in the controlfile. Adding the clause "summary" shows a summary rather than the detail.

    oracle@oraculo:~> rman
    Recovery Manager: Release 11.2.0.2.0 - Production on Fri Feb 17 11:00:13 2012
    Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
    RMAN> connect target
    connected to target database: ENT11G (DBID=410442782)
    RMAN> list backup summary;

    List of Backups
    ===============
    Key     TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag
    ------- -- -- - ----------- --------------- ------- ------- ---------- ---
    1       B  F  A DISK        13-FEB-12       1       1       NO         TAG20120213T122058
    2       B  F  A DISK        13-FEB-12       1       1       NO         TAG20120213T122238
    3       B  F  A DISK        13-FEB-12       1       1       NO         TAG20120213T122421
    4       B  F  A DISK        13-FEB-12       1       1       NO         TAG20120213T122440

To view the full content of a specific backup (to avoid seeing all cataloged backups), we can filter by backup attributes, like its label (TAG):
    RMAN> list backup tag TAG20120213T122058;
    using target database control file instead of recovery catalog

    List of Backup Sets
    ===================
    BS Key  Type LV Size       Device Type Elapsed Time Completion Time
    ------- ---- -- ---------- ----------- ------------ ---------------
    1       Full    1.03M      DISK        00:00:01     13-FEB-12
            BP Key: 1   Status: AVAILABLE  Compressed: NO  Tag: TAG20120213T122058
            Piece Name: /u01/app/oracle/fast_recovery_area/ENT11G/backupset/2012_02_13/o1_mf_nnndf_TAG20120213T122058_7ml72cnz_.bkp
      List of Datafiles in backup set 1
      File LV Type Ckp SCN    Ckp Time  Name
      ---- -- ---- ---------- --------- ----
      5       Full 1044143    13-FEB-12 /u02/oradata/ent11g/prueba.dbf

In this example you can see that it is a backup of a single datafile (prueba.dbf), stored in the default destination (the Fast Recovery Area). This can be changed by adding the clause format '/path-to-bkp' to the backup command.

NOTE: The format of the dates displayed by these commands, and of dates used as parameters to some RMAN commands, can be changed with the NLS_DATE_FORMAT environment variable at the operating system. For example, to include the time:

    [oracle@oraculo ~]$ export NLS_DATE_FORMAT='DD/MON/YYYY HH24:MI:SS'
    [oracle@oraculo ~]$ rman target /
    Recovery Manager: Release 10.2.0.3.0 - Production on Wed Dec 21 20:31:46 2011
    Copyright (c) 1982, 2005, Oracle. All rights reserved.
    connected to target database: Ent11g (DBID=943234298)
    RMAN> list backup summary tag bkp_prod_121511060003;
    using target database control file instead of recovery catalog

    List of Backups
    ===============
    Key     TY LV S Device Type Completion Time      #Pieces #Copies Compressed Tag
    ------- -- -- - ----------- -------------------- ------- ------- ---------- ---
    1       B  F  A DISK        13/FEB/2012 03:21:03 1       1       NO         TAG20120213T122058
    2       B  F  A DISK        13/FEB/2012 03:25:42 1       1       NO         TAG20120213T122238
    3       B  F  A DISK        13/FEB/2012 03:29:28 1       1       NO         TAG20120213T122421
    4       B  F  A DISK        13/FEB/2012 03:35:17 1       1       NO         TAG20120213T122440

BACKUPS WITH ORACLE XE
If you use Oracle XE, the installation includes scripts to back up and restore the database using RMAN: backup.sh and restore.sh. They are in the $ORACLE_HOME/config/scripts/ directory. They are a good starting point to take as an example and customize as you wish.

INCREMENTAL BACKUPS
In a database with several terabytes of data, a full backup can take longer than the time available for maintenance tasks. Incremental backups reduce this time by copying only the blocks changed since the last incremental backup.

There are two types of incremental backup: differential and cumulative.
• Differential: copies only the changes since the last incremental backup.
• Cumulative: copies all changes since the last full backup (level 0).

The level concept is similar to UNIX OS backup levels, but limited in practice to 0 and 1, instead of the 0-5 of the OS. The level simply provides a marker for how far back to reach for the increment or difference.

Therefore, to restore a database we need either all the differential backups since the last full backup, or the last cumulative backup following the last full backup. Incremental backups are differential by default.

Example: the first incremental backup must be complete, and is identified with level 0.

    backup incremental level 0 tablespace users;

The next is level 1. If no level 0 backup exists when running a level 1, it creates a level 0 backup:

    backup incremental level 1 tablespace users;

To take a cumulative backup:

    backup incremental level 1 cumulative tablespace users;

If you have Enterprise Edition, you can enable the "Block Change Tracking" feature for this type of backup. It keeps a bitmap of the database blocks changed since it was enabled, which avoids scanning all the data blocks of the database each time an incremental backup is taken. This is more efficient in scenarios where the total amount of changes is small compared to the size of the database, so it needs to be evaluated in your specific environment. It is also the way to handle bigfile tablespaces.

As an example, when using Oracle Managed Files (DB_CREATE_FILE_DEST parameter), it can be enabled with this command, and then we can take the level 0 backup:

    ALTER DATABASE ENABLE BLOCK CHANGE TRACKING;

Additionally, there are incrementally updated backups, or merged incremental backups. They allow you to take an incremental backup and apply it to the last full backup to obtain a new full backup, without reading the whole target database to create the full backup. The command to take it is:

    run {
      backup incremental level 1 for recover of copy with tag 'BKP_L0' database;
      recover copy of database with tag 'BKP_L0';
    }

This type of backup requires more disk space, because it creates a new backup file in the default destination. In the above example, only one copy is kept current at level 0, so a policy with redundancy 1 is used. If we want to use a recovery window with more available copies, the clause UNTIL TIME .. must be added to the recover command. For more details on this type of backup, see the Note "745798.1 Merged Incremental Backup Strategies".

COMPRESSION
There are several features in RMAN that help produce more compact files.

The following prevent copying unnecessary blocks. They are built in and no configuration is required to activate them:
• NULL BLOCK COMPRESSION (8i) - does not back up empty blocks that were never used (i.e. unformatted, above the High Water Mark (HWM))
• UNUSED BLOCK COMPRESSION (10.2) - does not back up unused blocks (i.e. empty, under the HWM)
• UNDO OPTIMIZATION (11.1) - does not back up undo of committed transactions (considering UNDO_RETENTION).

A simple example to see the first optimization in action (NULL COMPRESSION) is to create a tablespace and then make a backup:

    oracle@oraculo:~> du -hs oradata/XE
    1.5G    .oradata/XE
    oracle@oraculo:~> ls -lrt /usr/lib/oracle/xe/app/oracle/flash_recovery_area/XE/backupset/2010_06_03/
    total 1176688
    -rw-r----- 1 oracle dba 1203748864 2010-06-03 00:57 o1_mf_nnndf_TAG20100603T005534_60g9xpkz_.bkp

The free space in the database is:

    01:41:16 XE>select sum(bytes)/1024/1024 mb from dba_free_space;
    MB
    ----------
    135.8125

Considering that RMAN does not include tempfiles or redo logs, 1.1 GB of data was copied with RMAN, against the 1.5 GB that a manual copy would have taken.

The last feature, binary compression, must be explicitly enabled:
• Binary COMPRESSION (10g) - compresses data before sending it to the destination. It supports incremental backups. The algorithm can be changed using the Advanced Compression option (11.2).
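As a hedged illustration of the last bullet: in 11.2 the algorithm is selected with CONFIGURE COMPRESSION ALGORITHM, where BASIC is the default and the LOW, MEDIUM and HIGH levels require the Advanced Compression option. The shell helper below is a sketch, not part of the original paper; its name, rman_compression_config, is hypothetical, and it only prints the RMAN command after checking the level name.

```shell
# Print the RMAN command that selects a binary compression algorithm.
# Assumption: level names as in 11.2 (BASIC, LOW, MEDIUM, HIGH).
# Hypothetical helper: it builds the command text, it does not run RMAN.
rman_compression_config() {
    level=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    case $level in
        BASIC|LOW|MEDIUM|HIGH)
            printf "configure compression algorithm '%s';\n" "$level" ;;
        *)
            echo "unknown compression level: $1" >&2
            return 1 ;;
    esac
}
```

The printed command could then be sent to the client, for example: rman_compression_config medium | rman target /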
There are two ways to take compressed backups: state it explicitly in the BACKUP command, or set it as the default method:

    a) BACKUP AS COMPRESSED BACKUPSET …
    b) CONFIGURE DEVICE TYPE DISK BACKUP TYPE TO COMPRESSED BACKUPSET;

The generated files are much smaller (from 1.1G to 200M), at the expense of using more CPU:

    -rw-r----- 1 oracle dba 1203986432 2010-06-03 01:31 o1_mf_nnndf_TAG20100603T013020_60gcywnl_.bkp
    -rw-r----- 1 oracle dba  235642880 2010-06-03 01:38 o1_mf_nnndf_TAG20100603T013720_60gdd066_.bkp

COMPRESSION WITH ORACLE XE
The script backup.sh does not have a parameter to create compressed backups. Also, if we configure compression as the default method for our database (configure device...), it won't be used, because the script includes clauses that take precedence.

This choice can be justified because the maximum amount of data handled by XE is 11GB; with current disks this size is not a problem, nor is the time required to handle this volume. Furthermore, the same backup.sh script sets redundancy to 2. This ensures that we always have a valid backup available, but not more than two. Compression is then even less necessary. Both can be adjusted to our policy simply by modifying the script.

As an example, we take a backup without the compression setting and with it:

    a) rman target /
       CONFIGURE DEVICE TYPE DISK BACKUP TYPE TO BACKUPSET;
       exit;
       ./backup.sh
    b) rman target /
       CONFIGURE DEVICE TYPE DISK BACKUP TYPE TO COMPRESSED BACKUPSET;
       exit
       ./backup.sh

Once they finish, both generated files are practically the same size:

    ls -lrt /usr/lib/oracle/xe/app/oracle/flash_recovery_area/XE/backupset/2010_06_03/
    -rw-r----- 1 oracle dba 1203748864 2010-06-03 00:57 o1_mf_nnndf_TAG20100603T005534_60g9xpkz_.bkp
    -rw-r----- 1 oracle dba 1203986432 2010-06-03 01:31 o1_mf_nnndf_TAG20100603T013020_60gcywnl_.bkp

Reviewing the content of the backup.sh script, we find:

    echo "Backup in progress..."
    rman target / >> $rman_backup << EOF
    set echo on;
    shutdown immediate;
    startup mount;
    configure retention policy to redundancy 2;
    configure controlfile autobackup format for device type disk clear;
    configure controlfile autobackup on;
    sql "create pfile=''$rman_spfile2init'' from spfile";
    backup as backupset device type disk database;
    configure controlfile autobackup off;
    alter database open;
    delete noprompt obsolete;
    EOF

The backup command explicitly includes parameters that indicate how to take the backup, so it does not consider the default values. If we are interested in taking compressed backups, we must modify the code of the backup.sh script and include this explicitly:

    ...
    backup as compressed backupset device type disk database;
    ...

INTEGRITY VALIDATION
By default, the BACKUP command validates block checksums when writing to the destination (physical integrity). Additionally, there are commands to validate logical integrity while taking the backup, directly on the data without the need for a backup, or over backup files already taken.

To detect logical corruption in data blocks while taking the backup, the "check logical" clause must be included:

    backup check logical database;

To validate backups already taken, and their availability for the restore operation, the "validate" clause must be included:
    restore validate database;
    restore validate controlfile to 'c:\temp\control01.ctl';
    restore validate archivelog from sequence N1 until sequence N2;

Since version 11.1 we have the VALIDATE command, so this can be done independently of backup / recovery tasks, with a variety of options that allow, for example, validating the entire database, a specific backup, a datafile, or only a few blocks of a datafile:

    validate database;
    VALIDATE BACKUPSET 5;
    validate datafile 3;
    validate datafile 3 BLOCK 5 TO 20;

We can also speed up validations by running them in parallel, distributing the reading over several channels and taking advantage of SECTION SIZE, a new parameter in 11.1 that splits the reading of a single datafile into small chunks:

    RUN {
      ALLOCATE CHANNEL n1 DEVICE TYPE DISK;
      ALLOCATE CHANNEL n2 DEVICE TYPE DISK;
      VALIDATE DATAFILE 3 SECTION SIZE 1024M;
    }

RECOVERY
There are several critical components in the database, and in case of isolated or combined failures there are many different scenarios and steps needed to recover the database to a consistent state. The official documentation describes 2 basic and 12 advanced cases.

To give just an overview of the subject, only the basic cases are included here:
• recover the last full backup, using the existing controlfile:

    RMAN> STARTUP MOUNT;
    RMAN> RESTORE DATABASE;
    RMAN> RECOVER DATABASE;
    RMAN> ALTER DATABASE OPEN;

• recover only one tablespace in an open instance:

    RMAN> SQL 'ALTER TABLESPACE mytbs OFFLINE IMMEDIATE';
    RMAN> RESTORE TABLESPACE mytbs;
    RMAN> RECOVER TABLESPACE mytbs;
    RMAN> SQL 'ALTER TABLESPACE mytbs ONLINE';

It is important to note that:
• recovery uses the backups known by the instance (cataloged), leaving it in our hands to ensure consistency with what we have available. This involves obtaining the appropriate backups from our archived media (tapes, DVD, etc.), using the right controlfile for the set of backups to be used, and cataloging them in the destination database (either controlfile or catalog).
• backups are useful only in the same line of incarnations known by the current database, so the controlfile to use is critical.
• recovery is to a point in time, so we must have the appropriate sequence of full backups and archivelogs (or incremental backups) to reach the required point. In high-volume databases, detecting that this is not working may take several hours, so it is imperative to test these procedures periodically, to anticipate failures at a moment when we have time to analyze and correct them without pressure.

WHAT TO DO IF I STILL DO NOT USE RMAN?
After seeing the advantages of RMAN and the time we can save by adding it to our backup policy, we have to plan the steps to get started.

The first is to define the policy, including the following:
• acceptable level of service:
    • time frame for implementation and execution
    • disk consumption and CPU usage expected, to define the use of compression
• retention
• destination
• generated file names
• use of parallelism
• maximum sizes of files generated
• use of encryption

With these definitions, the RMAN environment must be configured for each database. A script can be created to perform this configuration; although it will be run only the first time RMAN is used on our database, it will serve as a backup of the configuration and will be part of the process of recreating the database on a new server.

Then the scripts that run the backup should be implemented. To be complete they should incorporate controls, and in addition to the statements that actually take the backup they should include:
• Delete obsolete backups
• List catalog (documentation)
• Validate logical integrity
• Detect errors in the execution of the script and notify operators

Once the backups are working, we must remember to practice the various recovery scenarios, and to regularly restore the backups taken, to validate that there are no errors in the process or on the destinations where they reside (tapes, CD, external drives, etc.).
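The controls listed above for a backup script can be sketched in shell as follows. This is a minimal sketch under stated assumptions, not the author's script: the function names (rman_backup_commands, rman_log_has_errors, run_backup) are hypothetical, the error check simply looks for lines starting with an RMAN- or ORA- code in the log, and operator notification is reduced to a message on standard error that a real script would replace with mail or paging.

```shell
# Sketch of a nightly backup wrapper (all function names are hypothetical).

# Print the RMAN commands for one run: backup with logical validation,
# cleanup of obsolete backups, and a listing for documentation.
rman_backup_commands() {
    cat <<'EOF'
crosscheck backup;
backup check logical database plus archivelog;
delete noprompt obsolete;
list backup summary;
EOF
}

# Succeed (exit status 0) when the given RMAN log contains an error stack,
# that is, at least one line starting with an RMAN-nnnnn or ORA-nnnnn code.
rman_log_has_errors() {
    grep -Eq '^(RMAN|ORA)-[0-9]+' "$1"
}

# Run the backup against the local target and report failures.
# Assumption: the rman client is in the PATH and OS authentication works.
run_backup() {
    log=$1
    rman_backup_commands | rman target / log "$log"
    status=$?
    if [ "$status" -ne 0 ] || rman_log_has_errors "$log"; then
        echo "RMAN backup failed, see $log" >&2   # placeholder for operator notification
        return 1
    fi
}
```

A scheduler entry would then call something like run_backup /path/to/backup.log, where the log path is a placeholder. Keeping the command list in its own function makes it easy to review what will be sent to RMAN before running it.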
  • 14. High Availability Boot Camp: RMAN - Eliminate the MysteryNOTE: remember that the RMAN-generated files are not portable between platforms, so a backuptaken on Intel 32-bit (x86) can not be restored on Intel 64-bit (x86_64). There is an RMANcommand , CONVERT DATABASE, which works only over platforms which share the endianformat. Details of this on “Transporting Data Across Platfroms”,http://docs.oracle.com/cd/E11882_01/backup.112/e10642/rcmxplat.htm#CHDFHBFIADITIONAL TASKSMONITORINGTo validate the progress of a backup operation, the view V$SESSION_LONGOPS may be usedwhile the operation is running, with a query like this: SELECT SID, SERIAL#, CONTEXT, SOFAR, TOTALWORK, ROUND(SOFAR/TOTALWORK*100,2) "%_COMPLETE" FROM V$SESSION_LONGOPS WHERE OPNAME LIKE RMAN% AND OPNAME NOT LIKE %aggregate% AND TOTALWORK != 0 AND SOFAR <> TOTALWORK;To see the result of executed commands in RMAN since the last start of the base, the viewV$RMAN_OUTPUT may be used, which exists only in memory.To view the status of the tasks performed or in progress, there is the view V$RMAN_STATUS.To view the backups that have been taken on the current database, we may use the views: • V$BACKUP_SET • V$BACKUP_SET_DETAILS • v$BACKUP_DATAFILE • V$RMAN_BACKUP_JOB_DETAILSIf using a catalog, then there is more metadata available in RC_* views.Finally, after running the command VALIDATE, if bad blocks are detected they appear in thesystem view V$DATABASE_BLOCK_CORRUPTION.MANTEINANCEOur backup scripts should include crosscheck statements to detect backup files deleted outsideRMAN, and statements to periodically delete obsolete backups.Both statements may be included in our backup script daily or weekly, or run separately.In the case of having a standby database, from version 11.1 the deletion of obsolete can be done 14 Session #492
immediately after the backup, having set the ARCHIVELOG DELETION POLICY TO APPLIED ON STANDBY. In previous versions, the deletion of archivelogs must be separated from the backup to ensure that archivelogs can be sent to the standby before being deleted.
If using the Fast Recovery Area, archivelogs are automatically deleted when obsoleted by the retention policy. There are also disk quota rules that, when free space is needed, can delete files which are not obsolete but have already been backed up. For more details about this, see http://docs.oracle.com/cd/E11882_01/backup.112/e10642/rcmcncpt.htm

WHEN ERRORS OCCUR
The most common errors with RMAN are caused by recovery procedures using incorrect backup files or controlfiles, such as:

    RMAN-03002: failure of restore command at 12/20/2011 08:08:22
    RMAN-06026: some targets not found - aborting restore
    RMAN-06023: no backup or copy of datafile 17 found to restore

In case of errors that put the health of our procedures in doubt, the first thing to do is to validate that we are doing the right thing by checking the official documentation (http://otn.oracle.com) and My Oracle Support (http://support.oracle.com).
If you already did that and want to analyze the steps RMAN took before hitting the error, you can enable DEBUG in backup / restore commands, and use TRACE to see the channel activity:

    rman target / log rman.log trace rman.trc
    run {
      allocate channel t1 type sbt………trace=2;
      allocate channel t2 type sbt………trace=2;
      allocate channel t3 type sbt………trace=2;
      debug on;
      restore database;
      debug off;
    }

This shows all the internal operations performed by RMAN, which can serve as a guide to track down the problem.
To address problems with tape devices, remember that Media Manager libraries (MML) come from third parties and therefore cannot be controlled with RMAN commands. We must review the documentation provided by the manufacturer.
Enabling trace on the channel generates a file called sbtio.log, with information generated by the MML. This must be validated with the manufacturer.
Another alternative is to use the simulation driver provided by Oracle, DISKSBT:

    run {
      allocate channel t1 type sbt parms 'SBT_LIBRARY=oracle.disksbt,ENV=(BACKUP_DIR=d:\temp)' trace=2;
      backup database;
    }

This way we can rule out that the problem lies in the read operation or in our installation, so we can advance in the diagnosis and, if necessary, pass the problem to the manufacturer of the tape drive.

OPTIMIZATION
A server running RMAN tasks mainly increases its I/O activity, so we should look to optimize the access time to devices.
The points to check are:
• using async I/O (O.S. configuration)
• parallelism (channels) matched to the number of tape slots of the hardware
• performance of the MML
• identifying whether the problem is in reading or writing
  • for example, comparing the time of "backup validate" (a read-only operation) with the time of the backup
• if using incremental backups and Enterprise Edition, enabling block change tracking
• knowing the RMAN process architecture in detail helps in the optimization process. MOS note 360443.1 "RMAN Backup Performance" goes deeper into it
• using the standby database if Active Data Guard is available

Since version 11.1 the SECTION SIZE parameter exists, which allows a single datafile to be backed up in parallel as smaller chunks (sections), and therefore in less time.
For example, to back up a tablespace with a datafile of 900 MB using three parallel channels:

    CONFIGURE DEVICE TYPE sbt PARALLELISM 3;
    CONFIGURE DEFAULT DEVICE TYPE TO sbt;
    RUN {
      BACKUP SECTION SIZE 300M TABLESPACE prueba;
    }

For recovery operations, we must remember that the commands in a script are executed serially.
For example, if we have three tape drives and want to access them in parallel, this sequence does not achieve it:

    run {
      allocate channel t1 type sbt....;
      allocate channel t2 type sbt....;
      allocate channel t3 type sbt....;
      restore datafile 2;
      restore datafile 3;
      restore datafile 5;
      restore datafile 7;
      restore datafile 11;
    }

Since the smallest unit of reading up to 11.1 was the datafile, the way to use the channels in parallel is to list all the datafiles in a single restore command:

    run {
      allocate channel t1 type sbt....;
      allocate channel t2 type sbt....;
      allocate channel t3 type sbt....;
      restore datafile 2,3,5,7,11;
    }

EXTERNAL BACKUP REPOSITORY (CATALOG)
We can configure a database that is used exclusively as a repository of all the information maintained by RMAN. If you manage a single database, this just adds another management task. But if you work with many, it lets us keep this information in one central place.

FEATURES AVAILABLE ON ENTERPRISE EDITION ONLY
Remember that the following features are not available in Standard Edition:
• allocating parallel disk channels
• block change tracking
• encryption (requires the Advanced Security option or an Oracle Secure Backup license)

EXAMPLES
Here are some examples of using RMAN for specific tasks:

CLONING
RMAN has the DUPLICATE command, which automates the process of creating a clone of a database. Since version 11.2 there are several ways to clone a database: from the active database, and from pre-existing backups. The latter can be run with or without a connection to the target database, and with or without a catalog. All the details can be found in the manual:
http://docs.oracle.com/cd/E11882_01/backup.112/e10642/rcmdupdb.htm
As an example, we will see how to duplicate a database to a remote host with the same directory structure, in 10g and in 11.2, assuming Oracle Net connectivity to the source database has already been configured on the new server.
10G CLONING
These steps implement a clone that can be automated without manual intervention. Remember that the database created by cloning is reached through the AUXILIARY connection, and the source database is the RMAN TARGET connection.
1. set up the destination database environment: parameter file, password file and directories
2. start the destination instance in nomount mode
3. get the SCN to which you want to restore from the source database. There are several ways to do it. One option is to take the last SCN covered by backed-up archivelogs with the following query:

    select next_change#
    from v$archived_log
    where recid = (select max(recid)
                   from v$archived_log
                   where backup_count>0);

4. execute the duplication on the destination server:

    rman catalog rman/clave@rman target sys/clave@origen
    connect auxiliary /
    run {
      allocate auxiliary channel dupdb1 type disk;
      set until scn $MAX_SCN;
      duplicate target database to COPIA NOFILENAMECHECK;
    }

5. disable archivelog mode on the cloned database

ONLINE CLONING (> 11.1)
Steps 1 and 2 are the same, and step 4 becomes:

    rman nocatalog target sys/clave@origen
    connect AUXILIARY sys/clave@copia
    run {
      DUPLICATE TARGET DATABASE TO COPIA
        FROM ACTIVE DATABASE
        PASSWORD FILE
        SPFILE
        NOFILENAMECHECK
        SET SGA_TARGET=1000M;
    }
    exit;
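Steps 3 and 4 of the 10g procedure can be chained from a shell script, which is what the $MAX_SCN variable in step 4 suggests. The following is a minimal sketch under assumptions: the function names, the files max_scn.sql (holding the step 3 query with headings turned off) and dup.rcv, and the SCN validation are hypothetical additions for illustration, not part of the original procedure.

```shell
#!/bin/sh
# Sketch: build the RMAN duplicate script from the SCN obtained in step 3.

# Accept only a non-empty string of digits as an SCN.
is_valid_scn() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Print the RMAN commands for step 4; $1 is the SCN, $2 the clone name.
build_duplicate_cmds() {
  is_valid_scn "$1" || { echo "invalid SCN: '$1'" >&2; return 1; }
  cat <<EOF
connect auxiliary /
run {
  allocate auxiliary channel dupdb1 type disk;
  set until scn $1;
  duplicate target database to $2 NOFILENAMECHECK;
}
EOF
}

# Intended use (commented out; needs the Oracle client and both databases):
#   MAX_SCN=$(sqlplus -s "sys/clave@origen as sysdba" @max_scn.sql | tr -d '[:space:]')
#   build_duplicate_cmds "$MAX_SCN" COPIA > dup.rcv
#   rman catalog rman/clave@rman target sys/clave@origen cmdfile=dup.rcv log=dup.log
```

Validating the SCN before generating the script avoids starting a long duplication with an empty or garbled value, for example when the query returns no rows because no archivelog has been backed up yet.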
CLONING A RAC DATABASE
Cloning a database in a RAC environment is similar to the single instance procedure, with the following changes:
• the pfile on the destination database is configured as single instance (removing the RAC configuration)
• clone the same way as single instance
• pfile parameters are adjusted, adding back the RAC settings removed before, and setting the correct names for the control_files and the new database
• restart the database to use the new parameters
• configure the new database on all the remaining RAC nodes: pfile, pwfile, tnsnames.ora
• once cloned, you must register the new database in CRS
Details of this procedure can be found in My Oracle Support notes 461479.1 and 452868.1

DATA RECOVERY ADVISOR
This functionality, available since version 11.1, analyzes failures on the database and recommends actions to correct them, with the possibility of executing the recommended scripts.
In this example we simulate the loss of a datafile:

    oracle@test:> rm /u01/app/oracle/oradata/test11/users01.dbf
    oracle@test:> rman target /
    Recovery Manager: Release 11.2.0.1.0 - Production on Thu Jun 3 14:07:35 2010
    Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
    connected to target database: TEST11 (DBID=3428713062)

    RMAN> validate database;
    Starting validate at 03-JUN-10
    using target database control file instead of recovery catalog
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=68 device type=DISK
    RMAN-06169: could not read file header for datafile 4 error reason 5
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of validate command at 06/03/2010 14:07:50
    RMAN-06056: could not access datafile 4

    RMAN> list failure;
    List of Database Failures
    =========================
    Failure ID Priority Status    Time Detected Summary
    ---------- -------- --------- ------------- -------
    122        HIGH     OPEN      03-JUN-10     One or more non-system datafiles are missing

    RMAN> advise failure all;
    List of Database Failures
    =========================
    Failure ID Priority Status    Time Detected Summary
    ---------- -------- --------- ------------- -------
    122        HIGH     OPEN      03-JUN-10     One or more non-system datafiles are missing

    analyzing automatic repair options; this may take some time
    using channel ORA_DISK_1
    analyzing automatic repair options complete

    Mandatory Manual Actions
    ========================
    no manual actions available

    Optional Manual Actions
    =======================
    1. If file /u01/app/oracle/oradata/test11/users01.dbf was unintentionally renamed or moved, restore it

    Automated Repair Options
    ========================
    Option Repair Description
    ------ ------------------
    1      Restore and recover datafile 4
      Strategy: The repair includes complete media recovery with no data loss
      Repair script: /u01/app/oracle/diag/rdbms/test11/test11/hm/reco_2784754173.hm

The repair script has the following code:

    # restore and recover datafile
    sql 'alter database datafile 4 offline';
    restore datafile 4;
    recover datafile 4;
    sql 'alter database datafile 4 online';

These steps are correct because a cataloged backup of our database exists and is available.
If this were not the case, the recommendation would be different:

    analyzing automatic repair options; this may take some time
    using channel ORA_DISK_1
    analyzing automatic repair options complete

    Mandatory Manual Actions
    ========================
    1. If file /u01/app/oracle/oradata/test11/users01.dbf was unintentionally renamed or moved, restore it
    2. If you have an export of tablespace USERS, then drop and re-create the tablespace and import the data.
    3. Contact Oracle Support Services if the preceding recommendations cannot be used, or if they do not fix the failures selected for repair

    Optional Manual Actions
    =======================
    no manual actions available

    Automated Repair Options
    ========================
    no automatic repair options available

Back to the first case, where the backup is known to the catalog, we agree to execute the recommendation:

    RMAN> repair failure;
    Strategy: The repair includes complete media recovery with no data loss
    Repair script: /u01/app/oracle/diag/rdbms/test11/test11/hm/reco_3604648805.hm
    contents of repair script:
       # restore and recover datafile
       sql 'alter database datafile 4 offline';
       restore datafile 4;
       recover datafile 4;
       sql 'alter database datafile 4 online';
    Do you really want to execute the above repair (enter YES or NO)? Yes
    executing repair script
    sql statement: alter database datafile 4 offline
    Starting restore at 03-JUN-10
    using channel ORA_DISK_1
    channel ORA_DISK_1: starting datafile backup set restore
    channel ORA_DISK_1: specifying datafile(s) to restore from backup set
    channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/test11/users01.dbf
    channel ORA_DISK_1: reading from backup piece /u01/app/oracle/flash_recovery_area/TEST11/backupset/2010_06_03/o1_mf_nnndf_TAG20100603T140513_60hr69ob_.bkp
    channel ORA_DISK_1: piece handle=/u01/app/oracle/flash_recovery_area/TEST11/backupset/2010_06_03/o1_mf_nnndf_TAG20100603T140513_60hr69ob_.bkp tag=TAG20100603T140513
    channel ORA_DISK_1: restored backup piece 1
    channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
    Finished restore at 03-JUN-10
    Starting recover at 03-JUN-10
    using channel ORA_DISK_1
    starting media recovery
    media recovery complete, elapsed time: 00:00:00
    Finished recover at 03-JUN-10
    sql statement: alter database datafile 4 online
    repair failure complete

NEXT STEPS?
This paper provides a guide to start using RMAN. There are many areas to expand on. The following are some suggestions for continuing:
• Exercise recovery scenarios, separating the roles of the DBAs so that one group generates failures and the other tries to fix them.
• Use Enterprise Manager (EM). Many common RMAN operations have their own screens inside EM (e.g. cloning, backup), so it is recommended to move as much of our backup policy as possible to EM to take advantage of its reporting, alerting and centralized management.
• Learn and incorporate additional high availability functionality included in Oracle databases, such as Data Guard and Flashback.

REFERENCES
Oracle® Database Backup and Recovery Basics - 10.2
http://download.oracle.com/docs/cd/B19306_01/backup.102/b14192/toc.htm
Oracle® Database Backup and Recovery Advanced User's Guide - 10.2
http://download.oracle.com/docs/cd/B19306_01/backup.102/b14191/toc.htm
Oracle® Database Backup and Recovery User's Guide - 11.2
http://download.oracle.com/docs/cd/E11882_01/backup.112/e10642/toc.htm
What's New in Backup and Recovery?
11.1 - http://docs.oracle.com/cd/B28359_01/backup.111/b28270/wnbradv.htm
11.2 - http://docs.oracle.com/cd/E24693_01/backup.11203/e10642/wnbradv.htm
MOS note 360443.1 - RMAN Backup Performance
MOS note 740911.1 - RMAN Restore Performance
MOS note 1116484.1 - Oracle Support Master Note For Oracle Recovery Manager (RMAN)