Speaker notes
  • How many of you have mission-critical systems? How many of you have e-business applications? Did you know that industry analysts estimate that each minute of database downtime could cost your business up to $96K? That’s nearly $6M per hour. Unfortunately, unplanned database downtime is a fact of life, and 80% of unplanned downtime is caused by software or human error. McGladrey and Pullen estimates that 50% of all businesses that experience a critical system outage of 10 days or more will NEVER recover. In those 10 days, the average business loses 2-3% of its annual revenue. For every 8 hours of outage, the average business loses half a point of market share, which takes a full 3 years to recover. Database downtime can also result in lost customers: if you lose a customer, it costs 14 times your initial investment to win them back, if you CAN get them back. And industry research shows that the average DBA has less than a year of experience. So: are you ready to recover from unplanned downtime? Are you confident you can recover from any type of outage? Do you know how to minimize your time to recovery?
  • The native DB2 RECOVER utility always processes the log forward when performing a point-in-time recovery. This means that, in almost every case, either image copies or a pack backup must provide the base for the log processing. Today, indexes are generally rebuilt from the tablespace data. Any point-in-time recovery from image copies reads and writes every page of every space involved, even if the updates to be eliminated by the recovery affected only a few pages.
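    For contrast, a minimal native-utility sketch (object names and RBA are hypothetical): the IBM RECOVER utility restores the newest image copy and applies log forward to the requested point, after which the indexes must be rebuilt from the data in a second pass.

        -- Native DB2 point-in-time recovery (DSNUTILB control statements)
        RECOVER TABLESPACE PAYDB.PAYTS TORBA X'000000000900'
        -- Indexes are rebuilt from the recovered table space data
        REBUILD INDEX(ALL) TABLESPACE PAYDB.PAYTS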
  • BMC Software has a long and glorious history of providing high-value software in the area of backup and recovery. The ‘point’ products are mature, robust, rich in features and functions, well supported, and in production use at most Fortune 500 sites. As the complexity of I/T environments has grown, and staff resources have dwindled, it has become important to add more ‘smarts’ to the product offerings. BMC has gathered the functionality of our point products and integrated them into a complete DB2 Recovery solution. The integration allows us to leverage the strength of our point products while exploiting the capability of assuming certain functions are available. For instance, when Recovery Management for DB2 is asked to recover a set of tablespaces to a particular point in time, it can assume that the (incredibly fast) Physical Backout facility is available and attempt to use that process, bypassing the overhead of an image copy/log forward recovery. Your application is available in minutes instead of hours. Recovery Management for DB2 V1.1 was released in July of 2002 with all the additional features you see here.
  • Part normal copy, part Instant Snapshot copy, from one BMC COPY statement (DB2 example follows):
      • COPY TABLESPACE DB.* SHRLEVEL CHANGE STACK CABINET
      • The OUTSIZE parm drives large objects to DSSNAP, small ones to the cabinet copy
      • Generated copy for 1828 data sets (198 Instant Snapshot, 1630 regular copy)
      • 17 minutes elapsed time (NO OUTAGE due to SHRLEVEL CHANGE), very little CPU time
      • Forward recovery time for the entire application: 49 minutes
          • Without Instant Snapshot input, recovery time is 193 minutes with RECOVER PLUS for DB2
          • Without BMC, recovery time with DSNUTILB is over 360 minutes: 6 HOURS!!
      • Note: with Instant Snapshot for all 1828 objects, 57 minutes elapsed time for the copy and 39 minutes for the recovery
  • Additional benchmarks on one tablespace and 7002 indexspaces. The tablespace has 178 pages and most of the indexspaces contain 18 pages. These are FULL YES copies, RESETMOD NO.
      • NORMAL STACKED TAPE (single task): elapsed time 136.60 minutes, CPU 9.09
      • CABINET TAPE (single task): elapsed time 16.56, CPU 4.16
      • CABINET DISK (single task): elapsed time 15.98, CPU 4.14
      • CABINET DISK (multi-task 10): elapsed time 8.54, CPU 4.29
  • Encrypted copies can be created by COPY+ when used as part of the RMD solution. Support for COPY+ encryption relies on a user-created and maintained data set, called the key data set, which contains the essential encryption key information. RECOVER+ requires the key data set to recover encrypted copies, while Log Master may require it to read encrypted image copies to obtain compression dictionaries or to complete partially logged updates. Because encrypted image copies are non-standard, they are registered in the BMCXCOPY table; an STYPE value of ‘e’ indicates that the image copy is encrypted. In a recovery, you must use RECOVER+ with encrypted image copies. As with SYSCOPY registration, BMCXCOPY registration includes a timestamp specifying when the copy was registered; RECOVER+ and Log Master use this timestamp to find the correct key value in the key data set. Encrypted copies do NOT use the zAPP engine. They do use a new zSeries machine instruction called KMC, which usually has a negligible impact on elapsed time but does cause a slight increase in CPU time for COPY+ and RECOVER+. The increase depends on which encryption method you choose; it will be somewhere between 10% and 50% on COPY+. But since COPY+ is typically low on CPU and high on I/O, the CPU increase should be minimal.
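    As a hedged sketch (the ENCHIPER keyword is spelled as it appears on slide 13; the object name is hypothetical), an encrypted copy request might look like:

        -- COPY+ statement requesting an encrypted image copy
        COPY TABLESPACE PAYDB.PAYTS
             SHRLEVEL CHANGE
             ENCHIPER YES
        -- The key data set supplies the key value; only RECOVER+ can
        -- restore from the resulting copy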
  • Online Consistent Copy is a feature that allows you to make consistent copies of table spaces and index spaces without any outage. The spaces being copied remain continuously available for read and write access while the copy is made. Online Consistent Copy does this by making SHRLEVEL CHANGE Instant Snapshot copies, then using the log to externalize changes to the copies for all complete transactions and to back out changes for any in-flight transactions. Later releases of the RMD solution (V7.3 and higher) can create an Online Consistent Copy with the normal copy process, eliminating the need for hardware that can create an Instant Snapshot.
  • DB2 Recovery Manager is a simple-to-use application. It runs from TSO, where the DBA is used to working, and uses some DB2 tables as repositories. The DBA can easily identify a set of tablespaces and/or indexes for recovery and tailor the recovery to the particular event. This ‘grouping’ allows great flexibility in associating objects. A group can be saved for later use (as in the case of a Disaster Recovery group) or discarded at the end of the recovery generation (as for a volume failure). Recovery Manager will verify the recoverability of all members of a recovery group before the jobstream is built, ensuring a complete and correct recovery (some customers define large groups of objects and execute this recovery validity check periodically to ensure all objects are being copied and logged correctly). Once the members of a group are identified, the DBA can choose recovery points, utility options, and other parameters to generate exactly the recovery needed for the event. Groups can be shared, so one DBA can define a group for a particularly complex application and rest assured that the ‘backup’ DBA can execute the recovery in his or her absence.
  • RECOVER PLUS can optionally make up to 4 registered image copies or DSN1COPYs simultaneously with the recovery. The LOG INPUT phase reads the log and passes the needed records to the system sort routine. The log records are sorted into optimum sequence (i.e., tablespace, page, RID), which allows them to be merged with the image copy pages more efficiently. The MERGE phase merges the image copy data set(s) and the sorted log records. If you also specify RECOVER INDEX or RECOVER UNLOADKEYS, this phase extracts the index keys during the merge, eliminating an additional pass of the data to build the indexes. The index information can be held in storage via the NOWORKDDN keyword, further reducing I/O and speeding the recovery.
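    A hedged sketch of such a one-pass recovery of a space and its indexes (object names are hypothetical, and the exact keyword arrangement varies by RECOVER PLUS release):

        -- MERGE combines copies and sorted log in one pass and
        -- extracts the index keys as it goes
        RECOVER TABLESPACE PAYDB.PAYTS
        RECOVER INDEX(ALL) TABLESPACE PAYDB.PAYTS
        -- NOWORKDDN (not shown) keeps the extracted keys in storage
        -- instead of a work data set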
  • R+/CHANGE ACCUM allows you to extract the records required for recovery of a specific tablespace (or group of tablespaces). The utility reads the DB2 log and filters out records unrelated to tablespace recovery (about 80% of the log). Of the 20% remaining, only those UNDO/REDO records for your identified object(s) are retained. The generated change accumulation file is very small. A repository keeps track of the file name, the objects accumulated there, and the log range supported by it (kind of a combination of SYSLGRNG/SYSLGRNX and SYSCOPY).
  • To avoid reading and writing every page, RECOVER PLUS supports a BACKOUT feature, giving it the ability to do a point-in-time recovery using undamaged spaces (including type 2 indexes) and backing out the changes from the log. This technique accomplishes point-in-time recoveries without using image copies or pack restores. No log is required prior to the point in time of the recovery. The spaces are merged with the sorted log records to avoid reading or writing pages more than once and pages without changes are simply left untouched. This powerful technique can accomplish point-in-time recovery many times faster than the traditional technique. With RECOVERY MANAGER for DB2 you can dynamically define a group of tablespaces and indexes based on either usage in a DB2 plan, a referential integrity data set or names identified with wildcards. You can then generate the BACKOUT RECOVERY JCL for the group.
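    Combining the OPTION RECOVERYPOINT syntax shown on slides 25-26 with a backout request might look like this hedged sketch (the BACKOUT keyword spelling is an assumption; names are hypothetical):

        -- Back the unwanted changes out of the live spaces rather than
        -- restoring a copy and reapplying a day's worth of log
        OPTION RECOVERYPOINT TIMESTAMP 2004-04-20-09.00.00
        RECOVER TABLESPACE PAYDB.PAYTS BACKOUT YES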
  • The reason for a point-in-time recovery is to eliminate unwanted changes to an object. In traditional point-in-time recovery, there is a potential to apply many log records that had nothing at all to do with the changes to be eliminated. Image copy pages are read and log records are applied so that the vast majority of pages will look exactly as they did before the job. Only the pages updated by the log records eliminated are changed.
  • By using the BACKOUT strategy, you may have fewer log records to process. RECOVER PLUS only updates pages that require changes. Instead of mounting an image copy tape and reading through all the pages on the space, RECOVER PLUS only reads from DASD those pages that were really affected by the bad update.
  • Log Master is an ISPF-based application. Using the task-oriented panel flows, you can quickly and easily specify the criteria for a particular event. You always specify an Input, an Output, and a Filter. The Input can be either the DB2 Log (Active and/or Archive logs) or a Log Master Logical Log. The Logical Log is a previously built extract from the DB2 Log. For instance, you may set up a process to extract a broad range of information from the DB2 Log on a daily basis and save it to a Logical Log dataset; if you later want to report on some event from the ‘past’, the Logical Log can be your source. The Output can be any of several Reports, and/or DDL/DML/DCL to be used for ‘playback’, and/or a Load file for populating another DB2 subsystem. You can create several outputs from one pass of the DB2 log. The Filter is your ‘WHERE’ clause: you use it to subset or extract data from the DB2 log, much as you use SQL to create a result set from a DB2 table. The Filter has a component for Time Frame (the begin and end points of your search) and predicates based on data in the log record header area (about 14 fields in all).
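    Log Master filters are built from ISPF panels rather than typed as SQL, but as a purely hypothetical analogy (DB2_LOG is not a real table), a filter behaves like:

        -- Time Frame = the BETWEEN range; the other predicates come from
        -- log record header fields (plan, authid, and so on)
        SELECT *
          FROM DB2_LOG   -- hypothetical; the DB2 log is not SQL-addressable
         WHERE LOG_TIMESTAMP BETWEEN '2004-04-20-08.00.00'
                                 AND '2004-04-20-17.00.00'
           AND PLAN_NAME = 'PAYROLL'
           AND AUTH_ID   = 'JSMITH';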
  • Specific Apply targets we support: Oracle, DB2 UDB, DB2 OS/390 (with DBCS support). The Log Master online interface supports the high-speed Apply feature, can use Instant Snapshot for row completion, and offers better SQL tuning options and separation of UOWs if required. The Log Master Logical Log provides a performance benefit over native SQL by eliminating SQL generation and reparsing. Quiet Point Analysis finds “natural quiet points” after the fact and will insert ‘Q’ records in the DB2 catalog to provide a point of recovery; the user can specify the minimum quiet point required. Apply Plus, bundled with Log Master at no additional cost, adds to the recovery and migration message, outperforming the EXECSQL processor of Log Master with:
      • Multi-threaded Apply, with many controls over thread distribution
      • Robust conflict resolution
      • Object mapping
      • Restartability
    Apply Plus is now invoked for Log Master’s EXECSQL, but limited to a single thread; you must run Apply Plus as a separate job step to realize the full benefits. Conflict resolution using Apply Plus:
      • Conflict section: defines RETRY behavior and DEFER file contents. Parameters: DataType (what data to include in the DEFER file), RetryFail (final code action if RETRY fails), RetryLimit (Time or Count, the unit in which RetryValue is specified), RetryValue (seconds if Time, number of retries if Count).
      • ConflictFile section: defines the physical attributes of the file(s) that contain statements DEFERred by conflict ACTIONs.
  • Quiet Point Report: Log Master finds quiet points by tracking units of recovery that have updates to the filtered objects. However, a physical quiet point does not necessarily represent a logical quiet point; some logical transactions actually encompass several discrete units of recovery. The DURATION feature was added in V310 to address this issue. DURATION can be specified as a TIME value from 1 millisecond to 24 hours; when it is specified, any physical quiet points shorter than the specified time are suppressed. Open Transactions Report: reports which URIDs were still active at the TO point of the logscan range.
  • The CSV/SDF format of the SUMMARY report can optionally include intermediate and final total counts. The performance benefits of the SUMMARY ALL ACTIVITY report are realized only if it is the only output type requested in a job step. You can request multiple ALL ACTIVITY reports; but when an additional output type is requested, all the standard Log Master overhead is incurred.
  • Log Master can scan the DB2 log for changes made by a ‘bad’ transaction – which might mean everything associated with a particular PLAN, or executed by a certain USER, within a specified timeframe. The DB2 log records before and after images of all changes, and Log Master can use the ‘before’ images to generate what we call ‘UNDO’ SQL. If the bad transaction did an INSERT, we’ll do a DELETE. If the bad transaction changed ‘WEAVER’ to ‘BAKER’, we’ll change it back. The really cool part about this process is it’s all done while the table is available for updates. There’s absolutely NO OUTAGE required to remove the bad changes. An ‘online recovery’!
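    For instance (table, column, and key values are hypothetical), if the bad transaction ran:

        INSERT INTO EMP (EMPNO, LASTNAME) VALUES ('000123', 'JONES');
        UPDATE EMP SET LASTNAME = 'BAKER' WHERE EMPNO = '000456';

    Log Master generates UNDO SQL from the ‘before’ images, applied in reverse order:

        UPDATE EMP SET LASTNAME = 'WEAVER' WHERE EMPNO = '000456';
        DELETE FROM EMP WHERE EMPNO = '000123';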
  • The Drop Recovery report provides summarized information about the DB2 objects you want to recover, including the specific image copy that should be used for the recovery and the OBID translation information. The generated JCL automates the process of getting your dropped objects back, completely up to date as of the time of the drop. Dropped tables present a special problem, since the normal recovery ‘unit’ in DB2 is the table space. This feature provides the ability to use the table space recovery resources without disturbing other tables in the space. Log Master technology interprets the log produced by the drop to find image copies and object IDs and to recreate the DDL. RECOVER PLUS technology provides the ability to recover with explicitly named image copies, translating the IDs appropriately and applying log records so no transactions are lost.
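    In practice these statements are generated for you, but a hedged sketch of the idea (the data set name and keyword arrangement are illustrative, borrowing the DROPRECOVERY, INCOPY, and OBIDXLAT tokens from slide 35):

        -- Recover from an explicitly named copy of the dropped object,
        -- translating the old OBIDs to the recreated object's OBIDs
        -- and applying log up to the point of the DROP
        RECOVER TABLESPACE PAYDB.PAYTS
          DROPRECOVERY
          INCOPY 'PROD.IMAGCOPY.PAYTS.D040419'
          OBIDXLAT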
  • Data warehousing applications typically replicate huge files on a periodic basis, and often most of the file is unchanged since the last replication. Using Log Master migration support, you can discover the changes to your ‘source’ DB2 tables and capture the data in a Logical Log. The data can then be applied with either LOAD or SQL processes. If SQL is chosen, the Log Master Execute SQL engine can speed the dynamic SQL and allow for error handling. The Log Master repository keeps track of the log range processed by the ‘last’ execution of Log Master and picks up where it left off. Any open URIDs (or non-externalized pages) are captured by the subsequent run, so the target file always receives complete updates. The period between Log Master runs is set by you, so your target file can receive updates daily, hourly, or as often as you choose. The output of Log Master is a Logical Log, which has the completed row for all modified data. We complete the row information without requiring data capture or image copy mounts. The format of the Logical Log is documented, and sample routines for common 4GLs are provided.
  • Disaster Recovery support is always a balancing act between cost and complexity vs. data loss and recovery time. Some shops opt for the very simple (and efficient) approach of dumping all DASD volumes periodically (typically weekly). At the remote site, a full set of volume restores gets the entire data center ‘back’ to the same point in time. Unfortunately, executing the dumps requires an outage, and restoring at the remote site means losing data. (What if the disaster occurs on Friday, and your most recent dump was taken last Sunday?) Some companies can’t accept the outage for dumps, or the loss of data. There are techniques which allow for capturing DB2 log data and transmitting it to a remote site. The log data can even be applied at the remote site, so the business is essentially shadowed. In the event of a disaster, this method ensures very little data loss (just the inflight transactions) and minimal recovery time. However, it is incredibly complicated and expensive. The compromise in most shops is to send a copy of all application and DB2 system object copies, along with copies of the DB2 log, to a remote vault. In the event of a disaster, recovery is to the end of the ‘last’ offsite Archive Log, which typically is from ‘last night’. This technique is documented in the DB2 System Administration guide, and it requires a complex set of procedures to support it. BMC Recovery Manager has automated this procedure.
  • BMC Software developed Recovery Manager for DB2 to address complex recoveries. In particular, the process of getting DB2 up to the last Archive Log has been made as simple as possible. Several utilities are provided with DB2 Recovery Manager to handle the data gathering and processing required for DB2 remote restart. The process has been used by dozens of companies in production. (In fact, a very effective trial technique is to install Recovery Manager a few weeks before the next DR test, and generate the support utilities. Then watch us get DB2 up at the remote site in record time!).

Transcript

  • 1. DB2 Recovery Solutions & More (Bill Arledge, DB2 Data Management Analyst, BMC Software)
  • 2. When Availability is Critical, Recovery is Crucial!
    • Unplanned downtime is an unfortunate fact of life...
    • Up to 80% of all unplanned downtime is caused by software or human error*
    • Up to 70% of recovery is “think time”!
    *Source: Gartner, “Aftermath: Disaster Recovery”, Vic Wheatman, September 21, 2001
  • 3. Recovery is a Real Challenge
    • Cost of Downtime varies
        • By Industry
        • By Business Cycle
    • Staff Productivity and Expertise pressures
        • Harder to get and keep good technicians
        • Recovery is a ‘part time’ job, skills may wane
        • A lot of hours can go into DR test ‘preparations’
    • Planned downtime (backups) pressures
        • Consistent Copies/Dumps require outage
        • Even a brief outage may impact business
    • Unplanned outages happen at painful times
  • 4. Recovery Elements (Slide diagram: failure, recovery analysis, create recovery JCL, execute recovery; labeled with recovery management, fast utilities, and the application outage)
  • 5. Let’s think out of the BOX!
      • Who says the only way to recover to a PiT is forward recovery?
        • What if you could ‘avoid’ recovery for some objects in an application?
        • What if you could exploit storage technology instead of tape copy?
        • What if you could go ‘backwards’ through the log?
        • What if you want to make the bad SQL just go away by ‘undoing’ it?
      • This presentation will show how to use BMC Software to reduce or eliminate the downtime for Backup, Recovery, Replication, and Batch Restart
    (Slide diagram: an image copy is followed by 23 hours of good transactions and 1 hour of bad transactions; forward recovery starts from the image copy and applies 23 hours of log to reach the recovery point)
  • 6. Recovery Management for DB2 Solving Business Problems with Innovation and Automation
    • Solution Integration
      • Building on core components to leverage BMC technology
      • Backups – High-availability techniques for a necessary process
      • COPY PLUS Value Proposition
      • Snapshot Copy (Software, Hardware, Instant Snapshot)
      • Hybrid Copy Technique – mix and match for effective backup and recovery
      • Cabinet Copy – dramatic reduction in elapsed and CPU time
      • Encrypted Image copy – secure offsite tape storage
      • Online Consistent Copy – clean copy with NO Quiesce
    • Application Recovery – speed and automation for an infrequent event
      • Recovery Management interface
      • Innovative forward and PIT recovery techniques – Index automation, Backout, Backup & Recovery Avoidance, Timestamp Recovery, Log accumulation
      • Creative uses for the DB2 Log data – Reporting, UNDO, Migration
    • Disaster Recovery
      • Local site preparation automation
      • Remote site System and Application Recovery automation
      • DR Reporting, Estimation, Simulation
      • Remote Recovery and Replication
  • 7. Recovery Management for DB2
    • Top ROI Features
      • Large Application Group (e.g. SAP) Backup and Recovery Automation*
      • Snapshot Copy
        • Software, Hardware*, Instant Snapshot*, Instant Restore*
      • Online Consistent Copy*
      • Index Copy and Recover Automation*
      • Physical Backout*
      • Backup and Recovery Avoidance*
      • Log Manipulation
        • Extract & store by filter, Reports (including Data change Audit), Logical UNDO
      • Dropped Object Recovery*
      • Disaster Recovery and Coordinated Recovery automation*
      • Disaster Recovery Reporting*
      • Recovery Time Estimation*
      • Recovery Simulation*
      • Timestamp Recovery*
      • DB2 Version 8 Exploitation (more later!)
  • 8. BMC Recovery Management for DB2: Solving Business Problems with Innovation and Automation (Slide graphic: integrating the capabilities of mature, patented functions; Recovery Management for DB2 is intelligent, integrated, automated, optimized; includes the SNAPSHOT UPGRADE FEATURE®; the sum is greater than the parts)
    • Solution Exclusives
    • Recovery Simulation
    • Recovery Estimation
    • Disaster Recovery Tracking and Reporting
    • Backout to Forward Recovery Automation
    • DR Mirror Management
    • Online Consistent Copy
    • Inflight Recovery (Recover to ANY Timestamp)
    • Encrypted Image Copy
    • Cabinet Copy
    • DB2 9 Support
    (Component diagram: RECOVER PLUS for DB2 for fast, dependable recoveries; COPY PLUS for DB2 for high-speed backup technology; Log Master™ for DB2 for log analysis and redo/undo; RECOVERY MANAGER for DB2 for job generation and management)
  • 9. COPY PLUS for DB2 (Benchmark chart, hours:minutes:seconds) DB2 object: 6-part table space, average row length 101, 2 secondary indexes, 60 million rows, 8,644,513 pages (~33 GB). BMC options used: OPTIONS MAXTASKS 6, OUTPUT TCOPY UNIT CARTVTS, STACK YES, SHRLEVEL CHANGE, INDEXES YES, RESETMOD NO, GROUP YES. IBM options used: SHRLEVEL CHANGE, COPYDDN(Tape1), PARALLEL(6).
  • 10. 3 Types of Snapshot from BMC
      • Software snapshot
        • Brief outage required for ‘clean’ copy, known to the database
        • Makes a DBMS image copy, typically tape
        • Restore from tape or disk
        • Exploits processor cache
      • Hardware snapshot
        • Uses volume mirrors or data set snaps as source for DBMS copies
        • Brief outage required for ‘clean’ copy, known to the database
        • Restore from tape or disk
      • Instant Snapshot
        • Uses hardware data set snaps to create disk-resident backup
        • Note – NOT a DBMS-formatted image copy (but it can copy an image copy)
        • Brief outage required for ‘clean’ copy, known to the database
        • NO OUTAGE for ‘fuzzy’ copy, known to the database
        • Instant Restore from Disk – in SECONDS!!
  • 11. Hybrid Copy Illustration with STACK CABINET (Slide diagram: BMC COPY writes Instant Snapshots for a few large IMS/DB2/VSAM databases and a single-disk cabinet copy data set for the many small ones; BMC RECOVER combines them with the IMS/DB2 logs to produce the recovered databases)
  • 12. Hybrid Copy DB2 Example with STACK CABINET
    • Part cabinet copy, part Instant Snapshot Copy
    • One BMC Copy statement -
    • COPY TABLESPACE DB.* SHRLEVEL CHANGE STACK CABINET
      • OUTSIZE parm drives large objects to DSSNAP, smaller ones to the single cabinet copy dataset (full statement sketched after this list)
        • Generated copy for 1828 data sets
          • 198 Instant Snapshots, 1630 regular copies to cabinet copy dataset
        • Less than 17 Minutes elapsed time (NO OUTAGE)
        • Very little CPU time
        • Recovery time for entire application – less than 1 hour
          • Without BMC, recover time with DSNUTILB over 360 minutes – 6 HOURS !!
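    A hedged version of the full statement (the OUTSIZE threshold value and its units are hypothetical):

        COPY TABLESPACE DB.*
             SHRLEVEL CHANGE
             STACK CABINET
             OUTSIZE 100000
        -- objects above the OUTSIZE threshold go to DSSNAP (Instant
        -- Snapshot); smaller ones land in the single cabinet copy data set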
  • 13. Encrypted Image Copies
    • Satisfies need for SOX compliance to protect financial and customer information
    • Encrypted Copies using DES (64bit) or AES (128bit) algorithms
    • KEYDSNAME is created at installation with restricted access
      • Holds key, timestamp, optional algorithm identifier, optional comment
    • Requires BMC® Recovery Management for DB2 Solution
    (Slide graphic: COPY DB.TS ENCHIPER YES turns clear data such as ‘Joe Blogs 123 45 6789’ into encrypted output copies such as ‘$je Lb*(1 C18 bo 3(7V’; RECOVER DB.TS uses the key data set to restore the clear data)
  • 14. Online Consistent Copy
      • Used for migration of a consistent set of data
        • Test Database creation
        • Data Warehouse population
      • No outage required
      • Very fast, exploits intelligent storage technology
      • Copy contains only committed data – uncommitted data excluded
      • Supports copying a group of spaces at the same point of consistency
      • Supports multi-dataset non-partitioned spaces
      • Note: CAN be used as input to recovery requiring log apply
  • 15. Online Consistent Copy (Slide diagram: BMC Copy issues a snap request; the storage device snaps the data sets; log records are applied to bring the copy to a consistent point, and the result is registered as an ‘Online Consistent Copy’ on DB2 ‘A’; the copy can then migrate data to another DB2 (‘B’) with RECOVER TOCOPY WITH OBIDXLAT and NO LOGAPPLY, as sketched below)
    • OCC may be input to UNLOAD PLUS
    • OCC can be created with normal copy process
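    A hedged sketch of the migration path in the diagram (the copy data set and object names are hypothetical; the TOCOPY, OBIDXLAT, and NO LOGAPPLY tokens come from the slide):

        -- Populate DB2 'B' from an Online Consistent Copy taken on DB2 'A';
        -- the copy is already transaction-consistent, so no log apply is needed
        RECOVER TABLESPACE TESTDB.PAYTS
          TOCOPY 'BMC.OCC.PAYDB.PAYTS.D040420'
          OBIDXLAT
          NO LOGAPPLY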
  • 16. DB2 Recovery Manager - Overview
      • ISPF application with DB2 repository tables
        • Access DB2 Recovery Resources and …
          • Group objects for recovery
          • Validate recoverability of objects
          • Specify/Generate recovery jobs
        • Most processes available in Batch
    (Slide diagram: the Recovery Manager repository connects the DB2 recovery resources with generated backup jobs, recovery jobs, and disaster recovery support; the resources are listed below)
    • ICF Catalog
    • SYSLGRNG/X
    • SYSCOPY / Image Copies
    • Active Logs
    • Archive Logs
    • BSDS
    • DB2 Catalog
      • Tablespace
      • Indexes
      • RI structures
  • 17. Application Recovery - 101
      • Without Recovery Manager
        • Determine what objects need to be recovered.
          • By Plan, Volume, Database
          • What objects are included in the above
        • Where do I need to recover to?
          • Image Copy (which one)
          • Quiesce Point
          • Current
        • Build the process …
          • One recovery job or multiple
          • Log being used in recovery
          • Recover Indexes or Rebuild
          • Recover Plus Backout an option?
          • Do all spaces actually need to be recovered
        • Run the jobs
          • Sit and wait … hopefully no abends.
  • 18. Application Backup and Recovery (Slide diagram: a complete subsystem-wide backup; ARMBGPS generates balanced jobs across the initiators and hands scheduled jobs to the scheduler, producing the copies)
  • 19. (Slide diagram: RECOVER groups TO current, timestamp, or RBA; the generated jobs are submitted to initiators, which run the recoveries against the DB2 subsystem)
  • 20. RECOVER PLUS – Fast Forward Recovery
      • Page built in memory, only written once
      • Simultaneous COPY
      • Simultaneous key extract
    (Slide diagram: LOG INPUT reads the active and archive logs; LOG SORT orders the records via a work dataset; MERGE combines the image copies (full and incremental) with the sorted log to rebuild the table space while extracting index keys; KEY SORT and INDEX BUILD rebuild the index space; the INDEX work area can use memory to reduce I/O)
  • 21. Another Recovery Resource
      • Extract tablespace UNDO/REDO records for specified objects
      • Sort into merge sequence (same sequence as copy output)
      • R+/CHANGE ACCUM :
        • Preprocesses/sorts log data to optimize log apply
        • Allows all other RECOVER PLUS options
        • No availability outage
        • No Tablespace STOPS, Locks, or Drains
        • Can be used in lieu of frequent copies
    (Slide diagram: the R+/CHANGE ACCUM job produces a change accumulation file)
  • 22. Index Copy and Recovery Automation
    • Some Indexes are better Recovered than Rebuilt
        • Non-partitioned Indexes can have LARGE record counts
        • Rebuild requires scan of all PARTs
        • INDEXES can be COPIED and RECOVER can apply Log Records
    • BMC can help
        • Automatically COPY Indexes based on size-threshold
        • Index copies can be Incremental Copies
        • Automatically RECOVER copied Indexes, REBUILD uncopied indexes
          • User does not have to specify recovery type – we decide
    • This can DRAMATICALLY REDUCE recovery time!!
  • 23. Point in Time Recovery – Physical Backout
    • The fastest way to get the database to the point prior to the application error is to remove one hour of records, rather than restoring 23 hours of records.
    • Should BACKOUT fail, automatically do normal Forward Recovery
    (Slide diagram: an image copy, 23 hours of good transactions, then 1 hour of bad transactions; instead of forward recovery from the image copy, backout processes 1 hour of log to reach the recovery point)
  • 24. Doing Nothing Smarter With XUNCHANGED
    • The Fastest Recovery is the one that can be AVOIDED.
    • How do We Know?
      • SYSIBM.SYSLGRNX tracks Open for Update ranges for all objects
      • Recovery to a Point in Time is usually an ‘application’ event
        • But not all objects in an application get updated every transaction
      • BMC Recovery Management for DB2 solution can…
        • Read SYSIBM.SYSLGRNX to figure out “what has changed”
        • Issue GENJCL BACKUP XUNCHANGED syntax
        • Issue GENJCL RECOVER XUNCHANGED syntax
      • Only application objects that have changed since the designated Point in Time will be recovered – a sometimes dramatic impact
  • 25. TIMESTAMP RECOVERY: No QUIESCE Required (Forward Flavor) (Slide diagram: OPTION RECOVERYPOINT TIMESTAMP 2004-04-20-09.00.00 with RECOVER TABLESPACE EMP.PAYROLL; the timeline shows an image copy at RBA 000000000100, a quiesce at 000000000900, a bad update at 000000001000, units of recovery UR1 and UR2 spanning the PIT range, and RBA 000000001200 marked as PIT_RBA/START_RBA)
  • 26. TIMESTAMP RECOVERY: No QUIESCE Required (Backout Flavor) (Slide diagram: OPTION RECOVERYPOINT TIMESTAMP 2004-04-20-09.00.00 with RECOVER TABLESPACE EMP.PAYROLL; the timeline shows a quiesce at 000000000900, a bad update at 000000001000, UR1 and UR2 spanning the PIT range, and RBA 000000001200 marked as PIT_RBA/START_RBA; no image copy appears in this flavor)
  • 27. Mining the DB2 Log Data with Log Master (Slide diagram: a batch log scan reads the active logs, archive logs, and member BSDSs; results feed a logical log, a repository, and reports; the report writer, SQL generator, DDL generator, and load generator produce reports, DML, DDL, and a load file, which drive the SQL processor, the load utility, or High Speed Apply against the target table; the whole flow is controlled from the online interface)
  • 28. Log Master
      • Allows logical ‘UNDO’ of application transactions via SQL
      • Prevents potential data integrity problems by identifying when UNDO processing will affect updates that were performed later.
      • Provides Data Migration to DB2 or DS databases
      • Displays statistics on log activity, including analysis of the impact of data capture changes
      • Comprehensive reporting
        • Log Information – Audit, Summary, Detail
        • Performance – Commit, Rollback, Image Copy, Data Capture
        • Backout Integrity
        • Miscellaneous – Open Transaction, Quiet Point
      • Support for recovery of Dropped Objects
      • High speed SQL apply feature
        • Conflict Resolution
  • 29. Log Master Output – Reports
    • Miscellaneous Reports
      • Quiet Point
        • Find Physical Quiet Points for filtered objects
        • Optionally insert a QUIESCE into SYSIBM.SYSCOPY for RECOVER purposes (see the catalog query sketched after this list)
        • DURATION – Added in V310… only report on quiet points greater than or equal to the specified duration
      • Open Transaction
        • What URIDs were still active at the TO point
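    SYSIBM.SYSCOPY is the real DB2 catalog table that records copy and quiesce events, so a quick way to review the quiet points Log Master registered is (database name hypothetical):

        SELECT DBNAME, TSNAME, ICTYPE, START_RBA, TIMESTAMP
          FROM SYSIBM.SYSCOPY
         WHERE ICTYPE = 'Q'      -- 'Q' rows are quiesce points
           AND DBNAME = 'PAYDB'
         ORDER BY TIMESTAMP DESC;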
  • 30. Log Master Output – Reports
    • Information Reports
      • DETAIL
        • All Column data presented
      • AUDIT
        • Index Key Value presented
        • Only reports changed columns for updates
      • SUMMARY
        • No Data, Just INSERT, UPDATE, DELETE counts
        • CSV or SDF format for spreadsheet loading/analysis
      • SUMMARY ALL ACTIVITY
        • Avoids much of Log Master overhead
        • Not URID boundary aware
        • Includes Compensation Record counts as well
        • CSV or SDF Format also
  • 31. UNDO - take away only the bad data
      • Log Master for DB2 can apply UNDO SQL to get rid of bad transactions. The database remains online for optimal e-vailability.
    (Slide diagram: good transaction 1, a bad transaction, then good transaction 2; generate UNDO SQL for the bad transaction and apply it, undoing the bad changes while the good transactions remain)
  • 32. REDO - re-apply ONLY the good data
      • Customers can perform a point in time recovery and then re-apply good transactions using REDO SQL. Database is briefly offline to recover to consistency, then back online.
    (Slide diagram: good transaction 1, a bad transaction, then good transaction 2; step 1: generate REDO SQL; step 2: point-in-time recovery to a quiet point prior to the bad transaction; step 3: apply the REDO SQL to re-apply good transaction 2. Be sure to generate the REDO SQL BEFORE the RECOVER TO PIT!!)
  • 33. Automated Drop Recovery (Generates JCL and outputs to automate drop recovery against the DB2 subsystem)
    • UNDO DDL to recreate the dropped object
    • Syntax for recovery and object ID translation
    • DB2 commands to rebind application plans that were invalidated when the object was dropped
    • Drop Recovery Report
    (Slide diagram: the process is initiated from the online interface; Log Master technology scans the DB2 log records, recreates the dropped objects, and generates post-recovery SQL and rebinds; RECOVER PLUS technology drives the recovery using the dropped object’s copy and log, translating OBIDs and applying log to the point of the DROP)
  • 34. DB2 Data Migration
      • Don’t replicate entire files, just migrate the changes!!!
    (Slide diagram: the repository records the migrated RBA range and in-flight URIDs; successive Log Master batch runs migrate RBA 1000-2000 less in-flight URID 1988, then 2000-3000 plus URID 1988, then 3000-4000; each run writes a Logical Log(+1) that becomes input to the LOAD utility or an Apply SQL process against the target tables)
  • 35. RECOVER PLUS Output Options: More Than Just Recovery (Slide diagram: with image copies and logs as input, OPTION RECOVERYPOINT TIMESTAMP 2004-04-20-09.00.00 and RECOVER TABLESPACE EMP.PAYROLL can drive OUTCOPY ONLY to produce an output image copy, INCOPY OBIDXLAT and INDEP OUTSPACE OBIDXLAT to populate a TEST subsystem (DB22) from PROD (DB21), and DROPRECOVERY INCOPY OBIDXLAT for dropped objects)
  • 36. Disaster Recovery
      • Options from weekly dumps to offsite logging
        • Dumps - Simple, cheap, maximum data loss
          • Weekly dumps means several days data loss
        • Remote Mirror - Complex, expensive, no data loss
          • Disk, Network, Software, Facilities, Operations
        • Compromise - Periodic vaulting of Copies & Logs
          • Daily or hourly log shipment will minimize data loss
    (Slide diagram: the options trade off cost, complexity, data loss, and outage time)
  • 37. Disaster Recovery
    • Preparation
    • Generation of recovery JCL for DB2 system tables and BMC tables.
    • Grouping and generation of recovery JCL based on application or other criteria.
    • Recovery Simulation
      • Test and validate your recovery locally.
      • Can provide input into DR planning.
    • DR Mirroring
      • Monitors and reports existence/persistence of remote mirror volumes
      • Mirroring support reflected in JCL Generation.
    • Pick tape lists for recovery copies and logs
  • 38. Recovery Management for DB2 DR support (Sysprog and DBA)
      • Offsite log recovery without complexity
        • Less data loss than weekly dumps
        • Automated process, easy to implement
          • Dialog driven generation of ARM Utilities
    (Slide diagram of local-site preparation: TMS pull list and ICF dump; application remote copies (full/incremental, change/reference); ARMBLOG switches the active log; ARMBARC copies the log; ARMBSRR generates the system recovery; ARMBGEN generates the application JCL; remote full copies (change or reference) of the DB2 catalog/directory and BMC RM repository; everything travels offsite by truck)
  • 39. Disaster Recovery
    • At the recovery site.
    • Generates necessary JCL to restart DB2 subsystem.
    • Supports mirrored subsystems based on installation mirroring configuration.
    • Recovers system resources and critical BMC repositories (if not mirrored)
    • Generates JCL for recovery of application data based on load balancing and job synchronization
    • Automatically collects statistical data on actual recoveries.
    • Archives recovery statistics for updating history repositories at local site.
  • 40. Remote Site Execution
      • Remote site DB2 startup is easy
        • Assuming all the tapes made it!!
    (Slide diagram: the tapes arrive by truck; TMS restore and ICF restore; ARMBSRR Job 1 performs CLI and VSAM allocates and initializes the active logs; ARMBSRR Job 2 recovers the catalog/directory and RM repository; application data set restores and application database recoveries follow; business resumption is as of ‘last night’s’ ARCHIVE LOG point)
  • 41. Recovery Estimation / Simulation
    • Estimation
    • Predict Recovery Times
    • Based on history maintained for largest tablespaces
    • Current tablespace attributes and artifacts
    • Simulation
    • Validates recovery artifacts through actual execution of the recovery at local site without disruption to ‘real’ data.
    • Provides input to decisions on DR planning.
    • Reporting
    • Online and batch reporting available to view recovery statistics of actual, estimated and simulation runs.
  • 42. BMC Software Recovery Value (Slide graphic of ROI themes: physical backout, timestamp recovery, transaction recovery, automated disaster recovery, E-Net remote replication, migration, replication, audit reports, encrypted copies, snapshot copy, hybrid copy, drop recovery, recovery groups, recovery avoidance, index B&R intelligence, B&R options, multi-vendor hardware support, rapid recovery, assess and improve recoverability of data, business continuity, additional daily benefits, innovation, return on investment)
  • 43. BMC Software Recovery Management Value Proposition
    • Reduce or eliminate planned and unplanned outages
      • Improve Application Availability
        • Perform backup while applications are online
        • Perform ‘logical’ recoveries while applications are online
        • Perform ‘physical’ recovery with only log input
    • Manage Complexity
      • Automate complicated backup and recovery processes
        • Leverage investment in intelligent storage devices
        • Prepare for local and disaster recovery scenarios
        • Validate recoverability and recover assets
    • Increase staff and resource efficiencies
        • Simplify the recovery process for the DBAs
        • Assure successful and consistent recoveries
        • Utilize computing resources efficiently