Exploiting the log's many possibilities. Steen Rasmussen, Senior Consultant, CA Inc.
  • The initial part of this presentation will cover some of the logging basics. Understanding the basics and what kind of information DB2 registers in the log is important in order to understand the capabilities. In fact – this presentation should have been done by a number of people around the world, because it covers how different DB2 customers exploit the log's content in a number of ways in order to improve both the quality of daily work and solve certain business issues. This presentation does not cover the details of how logging can be monitored and tuned for optimal performance. No programs are provided in this presentation to achieve the solutions for the scenarios described. Programs can be written exploiting the different DB2 components provided by IBM in order to read and format the log content. For simple issues like auditing this can be accomplished pretty easily – for some of the more sophisticated scenarios provided in this presentation, it might not be so easy.
  • First of all we will describe what kind of information DB2 registers in the log, but more importantly what kind of information isn't stored in the DB2 log. We will cover the pros and cons of having Data Capture Changes enabled on tables. Many myths exist around this feature – and most DB2 users seem to be scared of using DCC. If TRIGGERS exist in the environment, the task of using DB2 log records for the purposes described throughout this presentation can be quite difficult and cumbersome. We will discuss the challenges as well as potential solutions to get around them. Once the logging basics are known, several scenarios will be covered in detail illustrating how DB2 customers around the world solve a number of business issues.
  • So far it has not really been an option whether to log or not – but a future DB2 version might change this. Why does DB2 spend CPU and DASD to do logging – not to mention the administrative tasks connected to DB2 logging? Well – just like we're all spending resources taking image copies, DB2 logging is an insurance. 1) Most users do not think of ROLLBACK using the log, but it is necessary for DB2 to read the UNDO records if an updating transaction abends or the application program issues a ROLLBACK (or the corresponding IMS / CICS command). 2) Any RECOVER utility not specifying TOCOPY will need the log. 3) Logical errors from application programs will have to be backed out. Unless another program can "solve" this situation, a recover will be necessary to get back to the point-in-time prior to the program start. 4) DASD failures hardly happen anymore – but if they do, it will be necessary to recover data. 5) A power failure isn't very frequent anymore and emergency power hardware is installed at most sites. 6) A new threat seems to have become very important to consider: terror attacks, fraud etc. seem to be the new "buzz words".
  • Understanding DB2 logging fundamentals is (in my opinion) a prerequisite in order to exploit the data in the log. Also – not knowing the logging details when exploiting the log content might cause data integrity issues (more about this in the section dealing with triggers). DB2 will log every INSERT, DELETE and UPDATE as well as some utilities. DB2 will also log any DDL statements (drop, create, alter) – these can be viewed as DML statements too, since DB2 will have to insert, delete or update the catalog tables. The same is valid for BIND, FREE etc. Any log entry will be placed in a log-buffer which then is written to the ACTIVE log datasets and later archived to the ARCHIVE log. The BSDS (Boot Strap Dataset) holds information about the existing logs, which is also used when a recover is executed. SYSLGRNX holds information about the log-ranges where pagesets (tablespaces and indexes) have been opened for update. This information is also used by recovery in order to speed things up by only scanning the logs where the pageset has been opened for update (meaning log-records exist for the tablespace/index). So – aside from image copies – all the listed pieces might be needed when recovery is executed.
  • In general ANY change happening somewhere in DB2 is registered in the log. Even though only one byte is changed in a column, the logging associated with this change might be much MUCH more. If DB2 has to update the index, expand the row, allocate a new page, etc., this is logged too. A change resulting in a log-record will create two different log-records: UNDO records are used to ROLLBACK the change, and the REDO log-record is used when forward recovery is executed. We might say the UNDO information is unnecessary once the transaction is committed. Even though indexes are NOT defined with COPY YES, DB2 is still logging index updates.
  • Let's have a look at what DB2 stores in a log-record. This will provide us with information about what can be retrieved and which business issues we potentially can solve. Details about the "updater" are registered, as well as internal identifiers like the RBA (Relative Byte Address) and LRSN (Log Record Sequence Number), which are unique identifiers (like a timestamp) identifying when the update happened. DB2 also stores information about the actual object being updated. Note that DB2 doesn't store the table name, index name etc. Instead the internal identifiers are used, so it is necessary to map these identifiers to the catalog once log-reporting is done – which again can be complicated when a table is dropped – or one table's IDs are reused for another table. Finally the type of operation is stored, and the data itself. The data changed (before and after image) will be covered in greater detail later.
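  • As an illustrative sketch of the mapping just described (SYSIBM.SYSTABLES and SYSIBM.SYSTABLESPACE are the real catalog tables, but the exact query and the host-variable names are assumptions), the DBID/PSID/OBID found in a log-record can be translated back to a table name with a catalog lookup like the following. If the table has since been dropped, the lookup comes back empty – which is exactly the complication mentioned above.

        SELECT TS.DBNAME, TS.NAME AS TSNAME, TB.CREATOR, TB.NAME AS TBNAME
          FROM SYSIBM.SYSTABLESPACE TS
             , SYSIBM.SYSTABLES     TB
         WHERE TS.DBID   = :LOGDBID          -- DBID from the log-record
           AND TS.PSID   = :LOGPSID          -- PSID from the log-record
           AND TB.DBNAME = TS.DBNAME
           AND TB.TSNAME = TS.NAME
           AND TB.OBID   = :LOGOBID ;        -- OBID from the log-record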
  • It is important to understand what information DB2 stores in the log-record, but it might be even more important to understand what DB2 does NOT store. SELECT statements are not stored. This can be a major challenge, which will be covered in much more detail later. A user switching auth-id (like SET CURRENT SQLID) is not stored. This is a challenge if the log is used to do AUDIT reporting. DB2 commands like STOP, START, DISPLAY are not stored. Is that a problem? Think about auditing again. SMF reports can be created to solve these issues. The PLAN name is stored – unfortunately the package name is not. This can be a huge challenge when TRIGGERs are involved. If the package name were stored as part of the log-record, some of the challenges dealing with triggers could be solved by looking in the catalog to verify whether the package is a trigger package (SYSIBM.SYSPACKAGE.TYPE='T'). DB2 V8 provides a little help, since the log now indicates whether the log-record was created by a triggered SQL statement rather than a directly executed statement (much more about this issue later).
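  • Had the package name been logged, a lookup like this sketch (the TYPE = 'T' test is the one mentioned above; the host variable is hypothetical) would tell whether a given package is a trigger package:

        SELECT COLLID, NAME, TYPE
          FROM SYSIBM.SYSPACKAGE
         WHERE NAME = :PKGNAME
           AND TYPE = 'T' ;                  -- 'T' = trigger package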
  • A lot of people in the "DB2 World" seem to be afraid of using Data Capture Changes, so let's go over why people are afraid and what might "convert" them – DCC might be just another insurance you NEED to pay for, so let's cover the facts. For INSERT and DELETE statements, the entire row is always logged. For UPDATE statements, DB2 in general logs from the first changed byte to the last changed byte. Exceptions might be when VARCHAR columns are involved or compression is used, where more data than the changed part is logged. UPDATE statements are where increased logging can be observed. One issue to consider is MASS DELETES. Without DCC, the only logging is the changes to the spacemap pages (a bit is flipped). With DCC enabled, every row is logged and the delete might take considerably longer depending on the number of rows, so it might be a good idea to switch the DCC flag before mass deletes (unless you want to be able to "recover" the table using the DB2 log). If the tablespace has the COMPRESSION feature turned on, then the log-records are compressed too, which will complicate the log-reading since the compression dictionary will have to be used. This might NOT be the correct one if REORG / LOAD REPLACE has been executed, since KEEPDICTIONARY isn't stored in SYSCOPY.
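  • A small sketch of the mass-delete consideration above, using the ALTER statements shown on the DCC slide (the table name is the same placeholder used there) – note that while DCC is switched off, the deleted rows cannot be "recovered" from the log:

        ALTER TABLE tbcr.tbnm DATA CAPTURE NONE ;      -- before the mass delete: only spacemap changes are logged
        DELETE FROM tbcr.tbnm ;                        -- unqualified mass delete
        ALTER TABLE tbcr.tbnm DATA CAPTURE CHANGES ;   -- re-enable full before/after images afterwards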
  • The two major concerns from the DB2 people who are afraid of enabling DCC are the additional CPU overhead for the transactions as the log-records grow in size, and that the log will become a bottleneck and slow down performance – and as always in DB2 – "IT DEPENDS". If UPDATE statements represent 15% of the total DML statements, the overhead might be almost zero. Consider whether "best practices" really are used, meaning all updated columns are adjacent. What about VARCHAR rows? However – most of the DB2 log tends to be written from DB2's internal activity like page maintenance, checkpoints, being able to back out etc. etc. Also – there probably are a lot of tables where DCC isn't necessary, so consider only enabling DCC where the need is there – or the potential savings outweigh the insurance penalty. If the benefits of enabling DCC are present – but the penalty is a scary factor – consider the following analysis: pick a typical application, monitor the CPU and LOGGING resources for a week, enable DCC and monitor another week, then compare the results and judge (then management will not kill you).
  • Enough theory – time to illustrate how the DB2 log can be exploited, based on what DB2 sites around the world are doing. Once you know all the details about DB2 logging – it really pays off to think alternatively when business issues arise.
  • From day one of DB2, the log has primarily been used for recovery purposes, and this is probably the answer most people will give when asked what the log is used for. For many years, the log has also been used to perform audit control – requested either by internal or external auditors. In most cases the SYSADM users are being traced, but also tables holding sensitive data. Lately – SOX and other regulatory issues are the "new kids on the block", where it's necessary to perform much deeper analysis of the log to see if anything "unusual" is happening. The future (which is right now) will be more demanding in terms of what the log can be used for. Security management is becoming hot, and it will be necessary to do new types of analysis to look for patterns of "strange behavior" in order to detect, or even better be proactive in finding, anomalies related to the new terms like "identity theft", "intelligent accounting" etc.
  • Sarbanes-Oxley introduces a new set of challenges. Besides reporting who did what, where and when in terms of table data updates, it is becoming a requirement to also report changes to the physical environment. The DB2 catalog has columns describing when the last change happened. The problem is it's not possible to see what changed and what the old value was. Another issue – it is possible to ALTER a parameter/attribute and then reverse it later so there doesn't seem to be a difference. Also – objects dropped are not kept in the catalog. The DB2 log has everything needed to report, since every CREATE results in INSERTs into the catalog, DROP results in DELETEs etc. So by scanning the log for OBIDs where the creator = SYSIBM – it is possible to create an audit trail of every change to any object – including grants, revokes, rebind, free etc.
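  • A sketch of how such a scan could be driven (the query is an assumption about how one might build the list): pull the identifiers of the catalog tables themselves from the catalog and feed the OBIDs to the log-reading program – every CREATE/DROP/ALTER then shows up as logged DML against these tables:

        SELECT NAME, DBID, OBID
          FROM SYSIBM.SYSTABLES
         WHERE CREATOR = 'SYSIBM'
           AND DBNAME  = 'DSNDB06' ;         -- DSNDB06 = the DB2 catalog database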
  • Change propagation is not an uncommon event in DB2 shops, and quite a few software solutions are available. Most of the solutions require DB2's Log Capture Exit to be active, which quite a few sites are reluctant to use due to the additional overhead. Depending on the requirements for change propagation, the DB2 log itself can be used as a cheaper alternative. If there is a need to massage data, create summarizations, or merge data from more than one table / source – this might not be a viable solution. If it's a matter of capturing certain transactions for certain tables and propagating these to a data warehouse or another table (which could reside on another platform) – retrieving the log-records for these tables could be a great alternative. True scenario: a DB2 customer is capturing every INSERT, DELETE and UPDATE for +200 tables. These transactions are all changed into INSERT statements and "loaded" into a data warehouse every night to maintain the history of what has happened over time. TRIGGERS could be an alternative, but performance considerations made them choose this method – especially since they can extract the log-records from the archive logs outside peak hours.
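  • To make the warehouse scenario concrete – a purely hypothetical example (table, columns and values are invented) of how a logged DELETE could be turned into a history INSERT carrying the operation type and the point of the change:

        INSERT INTO DWH.ORDERS_HIST
               (OPERATION, CHANGE_RBA, ORDER_NO, STATUS)
        VALUES ('D', X'00A1B2C3D4E5', 1001, 'SHIPPED') ;   -- 'D' = the original statement was a DELETE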
  • If using this "poor man's method", there are a couple of things to keep in mind. The log doesn't store the table name, so the corresponding OBIDs will have to be retrieved from the catalog, and don't forget a table's OBID might be reused if the table is dropped, so processes need to be in place in order to maintain this list. Remember complete images for inserts and deletes are stored. The challenge is updates if DCC isn't used. To generate the complete before and after image (UPDATE SET xxxxx WHERE xxxx), two different approaches can be used. The RID (record identifier) exists in the log-record, so by reading the log-range from the log-RBA to the current point-in-time and then reading the active pageset (VSAM dataset) – the complete before and after image can be generated. The other alternative is to read a prior full image copy and then merge the log-records from the IC-RBA to the RBA where the log-record was found in order to generate the complete WHERE clause for the update. If DCC is enabled, the entire before and after image for the update is present in the log (makes life easier).
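  • To illustrate the end result (table, columns and values are invented): once the complete before and after images are available, the generated statement takes the form where the after image feeds the SET clause and the before image feeds the WHERE clause:

        UPDATE STEEN.ACCOUNT
           SET BALANCE  = 1250.00            -- after image
         WHERE ACCT_NO  = 4711
           AND BALANCE  = 1175.00 ;          -- before image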
  • This is another scenario being used "in the real world". QA always has been and always will be important. When application developers code new programs or change existing ones, a certain sequence of DB2 statements against certain tables must be followed in order not to have too many deadlocks, and also to ensure tables are handled in the predefined sequence due to RI in place either explicitly or implicitly. The DB2 log-records do store the plan name, so the idea is to have the developer type in the PLAN name on a panel (the user-id can be retrieved from the ISPF environment). The "log program" will then read the DB2 log and identify log-records having the plan name and user-id. By reading every log-record having this plan name and generating a report with table names (translated via OBID) and operation (preferably also data), it's easy to verify the sequence of DB2 operations and which tables are accessed.
  • Another completely different application QA scenario. When application programs are tested – do they always work 100% correctly the first time? Probably not! In order to do iterative tests, it might be necessary to have the same data foundation in order to conduct the next test case after having corrected the application program. One way to accomplish this is to RELOAD the data – or RECOVER the involved objects to an RBA prior to the start of the application program. This works great if it's a single-user environment – but the implications for other users of using the conventional method might not be optimal. Very often many users share the same tables but different data – so exploiting the DB2 log in an alternate way might lead to better quality and more throughput. Like the previous example – a panel can be created allowing the user to type in PLAN and/or USER. The "log program" should read the UNDO records (created by DB2 in case of an abort/rollback) and create SQL statements. This means the INSERT statements done by the program should be transformed to DELETE and vice versa, and updates should be done with reversed SET and WHERE. Once these SQL statements are generated they can be executed where the application program was executed. The result is no outage for other users and probably a lot quicker compared to LOAD REPLACE or RECOVER.
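  • A minimal sketch of the reversal just described (hypothetical table and values): each UNDO record is turned into the opposite statement before it is executed.

        -- the program INSERTed this row, so the generated backout statement is a DELETE:
        DELETE FROM STEEN.ORDERS WHERE ORDER_NO = 1001 AND STATUS = 'NEW' ;

        -- the program DELETEd a row, so the backout re-inserts the logged row image:
        INSERT INTO STEEN.ORDERS (ORDER_NO, STATUS) VALUES (999, 'SHIPPED') ;

        -- the program UPDATEd a row, so SET and WHERE are reversed (before image goes into SET):
        UPDATE STEEN.ORDERS SET STATUS = 'NEW' WHERE ORDER_NO = 1001 AND STATUS = 'CANCELLED' ;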
  • This is another true-life story which has similarities with the previous example. A batch job was executed over night. By accident 80K rows were deleted from a production table. No system RI was defined, but application RI did exist to other DB2 tables as well as to IMS DB. The job executed with RC=0. Later in the morning, end users started to get "funny screens" and application errors. It was quickly verified that rows in a table were missing. A PIT recovery to an RBA prior to the batch job was out of the question due to online processing – the updates made after the batch job would be lost too. The log program was used to read all the log-records for rows deleted by the specific PLAN, and the data was re-inserted while online users were still working. This is a very good approach and a quick alternative to a disaster recovery project – as long as you know the consequences: do any applications "make decisions" based on the occurrence of a row in the table? This case was DELETEs – for updates and inserts you need to check for updates to the rows AFTER the point-in-time you are "recovering" to.
  • So far we have focused on certain PLAN names, TABLE names and USER-ids. Before you decide what to extract from the log – you might need to consult the catalog for tables related to the one of interest. Let's illustrate this with another example: another batch program deleted a couple of rows in a table. No big deal. This table was involved in an RI spider-web of more than 100 tables. No big deal. The REAL problem was, it was THE parent table, so more than a million rows were deleted by DB2 due to the RI DELETE CASCADE option. The process of restoring data is identical to the previous example, except that you will need to look into the DB2 catalog to find ALL the tables which potentially could have been impacted, and then retrieve the OBIDs for these as well prior to accessing the log to read the log-records needed. Again – you will need to consider the application logic before using this approach.
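  • Before extracting log-records in a case like this, the RI structure can be mapped from the catalog. A sketch (SYSIBM.SYSRELS describes referential constraints; the parent table name and creator are placeholders) listing the direct dependents of the parent – repeated level by level it walks the whole structure and yields the tables whose OBIDs must be looked for:

        SELECT CREATOR, TBNAME, RELNAME, DELETERULE     -- DELETERULE shows whether DELETE CASCADE is in effect
          FROM SYSIBM.SYSRELS
         WHERE REFTBCREATOR = 'STEEN'                   -- the parent table's creator
           AND REFTBNAME    = 'PARENT_TB' ;             -- the parent table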
  • This is a bullet-proof alternative to recovery when the circumstances are right. A new application was supposed to go live Monday morning. New tables, new programs, changed programs, existing tables, a mix of DB2 and IMS – and a load of data. Management decided a group of end-users should test the application Sunday to make sure everything was working. In case the end-users were going to give the project a "NO-GO" – everything updated should be removed. Well – only a handful of users doing a few thousand online transactions – can't be that big a deal?! Well – both IMS DB and more than 150 DB2 tables were involved. Recovering these objects would make the application unavailable for many hours. A decision was made to isolate access to the application environment prior to the test.
  • A "normal" recovery scenario and preparation would have been to execute imagecopies and QUIESCE all involved objects prior to the test (which was done as well – always prepare for the worst). The estimated recovery time was several hours (and what if some recover jobs failed?), but management didn't like the outage due to online applications and the "lost work" from end-users. A decision was made to use an alternate recovery method but ALSO to prepare for the "normal" recovery. The environment was isolated so only certain users were allowed to log in, and all databases which were not supposed to be touched were started in read-only mode. The application was tested and it did become a "GO". For future purposes, it was decided to try the alternate recovery method anyway, so all transactions were read from the log (UNDO) to see if the method would have worked. Only a few thousand updates, deletes and inserts had been made – generating the reversed statements by reading the log took only a few minutes. It would have taken a few minutes to execute these transactions – so this "backout approach" could have done the recovery in a matter of minutes compared to hours.
  • The past slides have illustrated how the log can be used as an alternative to recovery, for change propagation – and for audit control. Before we all get too excited – a few issues need to be addressed. Depending on the environment, some of the scenarios covered might be way more complicated than they appear at first glance.
  • A few releases back in DB2, a new long-awaited feature became available – TRIGGERS. Triggers are used at many DB2 sites, and it's a very nice feature – but it really makes the scenarios covered more complicated. Until V8 was introduced, it wasn't possible to identify whether a DML statement in the log came from a trigger or not. V8 sets a flag in the log-record if the executed statement came from a trigger. This helps in some cases, but not every situation can be satisfied. What we really need is an extension to DB2 to disable triggers (for example a new BIND parameter). Except for Online Load – utilities do not activate triggers, so it is extremely important to understand the environment when the log scenarios described in this presentation are used. Let's have a closer look at what is happening and what can be done to avoid introducing disasters.
  • When log-records are extracted from the log (the source environment has triggers defined) and need to be applied to a target environment, two different scenarios exist depending on whether the target environment has triggers defined or not. If the target environment has no triggers defined, the only issue to consider is whether the log-records created in the source environment resulting from triggers should be executed (and identifying these is only possible in DB2 V8). If the target environment has triggers defined, or the target environment is not V8 – some decisions have to be made, since the log-records retrieved already include the triggered SQL. If the retrieved log-records are executed without doing anything – and the target environment has the same triggers defined – the triggers will be fired again, resulting in SQL statements executed twice! And what if the target environment has DIFFERENT triggers? One way to deal with this challenge of duplicating triggered SQL is to save the DDL for the target environment triggers, drop the triggers, execute the SQL retrieved from the log and then create the triggers again. This approach might not be applicable due to the outage, since the environment must be "frozen" while the process is ongoing in order not to compromise integrity.
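  • A sketch of the "save the triggers first" step, assuming the trigger source can be pulled from SYSIBM.SYSTRIGGERS (the TEXT column, split over SEQNO rows; the host variables are hypothetical) – the retrieved text is the DDL used to re-create the triggers afterwards:

        SELECT SCHEMA, NAME, SEQNO, TEXT
          FROM SYSIBM.SYSTRIGGERS
         WHERE TBOWNER = :TBCREATOR                     -- triggers defined on the table in question
           AND TBNAME  = :TBNAME
         ORDER BY CREATEDTS, SEQNO ;                    -- preserve the original creation sequence

        -- then: DROP TRIGGER schema.name ; run the log-generated SQL ; re-execute the saved CREATE TRIGGER text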
  • Another issue to consider is when IDENTITY columns exist (and SEQUENCES in DB2 V8). Delete and Update statements might not be a problem; Insert statements seem to be the biggest challenge. If GENERATED ALWAYS is used, duplicate numbers are a big challenge. Also – when IDENTITY or SEQUENCE numbers are used to drive RI, the SQL executed does not have the application logic in place and integrity is gone. If GENERATED BY DEFAULT is used, the log-generated SQL might be usable, unless the target environment has a next available number greater than what was executed in the source environment. I really don't have a good solution or answer.
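  • For reference, a sketch of the two identity flavours discussed (hypothetical tables): with GENERATED ALWAYS the log-generated INSERTs cannot simply re-supply the original values, while GENERATED BY DEFAULT at least accepts them – as long as the identity/sequence in the target has not already passed those numbers:

        CREATE TABLE STEEN.CUST_A
          ( CUST_NO INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY      -- DB2 always assigns the value
          , NAME    VARCHAR(40) ) ;

        CREATE TABLE STEEN.CUST_B
          ( CUST_NO INTEGER NOT NULL GENERATED BY DEFAULT AS IDENTITY  -- an INSERT may supply its own value
          , NAME    VARCHAR(40) ) ;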
  • Most DB2 sites have rules-of-thumb defining how often an application program must commit and how often image copies must be taken, so the log can be used to report on violations. The idea of finding these has two main points of interest: to minimize backout time if the application abends, and to minimize recovery time. In order to find how many updates/deletes/inserts an application is doing between commits, traverse through the log and count the log-records per PLAN / user-id combination. By also adding a counter for every table accessed (OBID) – this report can be even more granular. Another idea is to count log-records between image copies in order to reduce recovery time, where the log apply phase plays an important role.
    1. Exploiting the log's many possibilities. Steen Rasmussen, Senior Consultant, CA Inc.
    2. Abstract
       • The log is a vital part of DB2 and DB2 recovery.
       • However – once you start to exploit the content and what the log holds, many day-to-day tasks can be changed for the better for everyone. This presentation will look into real-life scenarios of how DB2 sites are squeezing the most out of the log to manage audit control, application recovery, change propagation, application QA and assisting in disaster scenarios.
    3. Disclaimer
       • This presentation explains the DB2 logging basics and how DB2 users around the world are using the log content to perform various tasks – opinions are strictly my own.
       • This presentation does not cover log monitoring – nor does it provide any guidelines for how to tune DB2 logging.
       • No programs/samples are provided in order to accomplish what is listed in this presentation. A number of ISVs and IBM provide software to accomplish what is outlined – and the patient user can of course write these programs too. No recommendations are made to choose any potential solution.
    4. Agenda
       • DB2 Logging fundamentals
         - What is logged and not logged
         - Data Capture Changes (DCC) truths and myths
         - Trigger challenges
       • How can DB2 Log records be used to optimize the daily job
         - Audit Control
         - Change Propagation
         - Application Recovery
         - Application Quality Assurance
         - Disaster Recovery Scenarios
         - Violations to your standards (IC, commit, rollback)
    5. DB2 Logging fundamentals
       • Why use LOGGING – is it necessary ?
         - It’s overhead
         - It costs in terms of performance, DASD, administration, clean-up …
         - It’s an insurance – just in case we get an accident
       • In a perfect world
         - No need to ROLLBACK
         - No need to RECOVER
         - No program errors
         - No hardware errors
         - No power failures
         - No hurricanes, terror attacks, fraud, …
       • Let’s get the MOST out of the LOG since it’s here !
       (Used to be the threat)
    6. DB2 Logging Fundamentals
       (Diagram: DB2 activity – updates, inserts, drops – flows from the DB2 resource managers through the log buffer to the active log datasets and on to the archive log datasets; together with the BSDS and SYSLGRNX these are some of the ingredients needed for recovery, utilities and more.)
    7. DB2 Logging Fundamentals
       • Activity involving changes to DB2 is documented in the LOG
         - Any change to DATA pages (REDO and UNDO information)
           - Insert, Delete, Update
           - Any activity performed by DB2
         - Any change to INDEX pages (REDO and UNDO information)
           - ICOPY doesn’t matter
           - Any activity performed by DB2
         - UNDO information
           - If ROLLBACK issued
           - If abend, SQL error etc.
           - Once COMMIT executed – this information is “no longer needed”
         - GRANT, REVOKE, BIND, FREE, DROP, CREATE, ALTER, …
           - Any catalog change is logged as well
           - In reality these commands are INSERTS, DELETES and UPDATES
         - Some utility records, imagecopy info for system objects, …
    8. DB2 Logging Fundamentals
       • What is logged when logging occurs ?
         - Connection-type : TSO, BATCH, UTILITY, CICS, IMS
         - Connection-id : established by attachment facility (CAF : TSO/DB2CALL, TSO : TSO/BATCH)
         - Correlation-id : associated with DB2 thread
         - Auth-id : who is executing this transaction
         - Plan name : the PLAN under which this operation executes
         - RBA / LRSN : timestamp of operation
         - URID : Unit-of-Recovery id
         - DBID, PSID and OBID : internal identifiers for the object updated
         - Operation type : Insert, Update, Delete, Utility type, …
         - Data changed : (much more about this later)
    9. DB2 Logging Fundamentals
       • What is NOT logged ?
         - SELECT statements
         - Auth-id switching
         - SQLID assignments
         - Access denied
         - DB2 Commands
         - STOP and START operations
         - PACKAGE name (also TRIGGER package)
         - If Update, Delete, Insert derived from a TRIGGER (V8 solves this). This can be a huge challenge – we will discuss TRIGGER issues later
       • SMF reporting can be used to track these events.
    10. Data Capture Changes – Truths and Myths
       • Truths
         - INSERT - entire row is always logged
         - UPDATE in general - from first changed byte to last changed byte
           - If VARCHAR present - from first changed byte to end of row
           - V7 changed this : if LENGTH not changed, logging as if no VARCHAR (unless compression or EDITPROC)
           - If DCC enabled – entire before and after image is logged
         - DELETE
           - Qualified (DELETE FROM tb WHERE …)
             - Every row deleted is always logged in its entirety
           - Unqualified (aka MASS delete : DELETE FROM tb) and NO RI
             - Without DCC - only logging of changes to spacemap pages.
             - With DCC – every row is deleted and logged individually. Might consider disabling/enabling DCC in programs doing mass deletes.
         - If tablespace compressed – log-records are compressed
         - The benefit of having Data Capture Changes enabled on tables will be covered later
       • ALTER TABLE tbcr.tbnm DATA CAPTURE CHANGES ;
       • ALTER TABLE tbcr.tbnm DATA CAPTURE NONE ;
    11. Data Capture Changes – Truths and Myths
       • Myths
         - Logging will increase too much
         - CPU overhead will increase too much
         (Maybe NOT – it DEPENDS)
       • Consider the following
         - Ratio between Inserts, Deletes and Updates ?
         - Are updated columns placed next to each other ?
         - Is compression used ?
         - Any varchar columns updated ?
       • Also – only a smaller portion of the log comes from DML
         - Checkpoint records
         - Pageset allocation and summary records
         - Exception statuses
         - Backout information
         - Index updates
         - Etc. etc.
         (My benchmarks show DML log records take up about 25-30%)
       • Having the entire BEFORE and AFTER image (Data Capture Changes turned on) can be a good insurance – just like imagecopy is (measure one week without and one week with DCC enabled)
    12. Exploiting the DB2-LOG – scenarios
       • Logging fundamentals covered
         - What is stored in the log
         - What is not stored in the log
         - When is logging done
         - What is Data Capture Changes
         - Data Capture Changes impact on DELETE and UPDATE
         - Logging impact based on physical attributes (compression)
       • Real life scenarios
         - What is the “DB2 World” doing out there …
         - What can the DB2 log do for you
         - Time for alternative ways of thinking
    13. Audit Control
       • Besides DB2 recovery – audit control is probably the theme using the DB2 log the most.
       • Sarbanes-Oxley and other regulations will not make this issue less important and the log less used
         - Which user-id changed WHAT and WHEN
         - Tracking of any changes to sensitive data
         - SYSADM is very powerful – many auditors get less paranoid when reporting of any activity is implemented
         - “Steen’s crystal ball” :
           - Pattern analysis, excessive logging against certain objects, changed behavior, anomalies etc.
           - Fraud and “identity theft” will be a key driver of log analysis
    14. Audit Control
       • Another issue related to Sarbanes-Oxley – which objects are created, altered, dropped ?
         - DB2 catalog shows when object was created - LAST TIME
         - DB2 catalog shows when object was altered - LAST TIME
         - DB2 catalog does NOT show which objects have been dropped
       • Everything is in the log in order to report
       • Traverse through the log and find changes to the DB2 catalog in order to have a complete log for Object Change Management
    15. Change Propagation
       • Many sophisticated solutions available – many require the Log Capture Exit to be active
       • The log itself can be used as a “poor man’s change propagator”
         - When no massaging of data
         - When no synchronization with IMS, VSAM and other files
         - Do not forget :
           - Check if the URID looked at has been committed (CLR records exist for rollback)
           - Even though a table has been dropped – the log-records are still in the log
           - MASS deletes might be a challenge if DCC isn’t enabled.
    16. Change Propagation
       • Get a list of OBIDs from the catalog for the tables of interest (remember a table might have been dropped and the OBID re-used)
       • Traverse through the log starting where you last stopped
       • Get the REDO log-records
         - INSERTs and DELETEs : the entire row image is ready
         - UPDATEs with DCC : BEFORE image used for the WHERE clause and AFTER image used for the UPDATE SET clause.
         - UPDATEs without DCC : use the RID from the log-record and look at the active VSAM dataset for the table. The log must be read from the current point-in-time back to the URID to check if the RID has been updated after the current log-record. The alternative is to read the imagecopy and then read the log from the IC-RBA forward to the URID currently under investigation (see why Data Capture Changes can be a good choice ?)
         - Map column names to log-record content.
         - Create the format you like – if SQL DML, these can be applied on any RDBMS
    17. Application Quality Assurance
       • Two different scenarios observed where log processing can optimize daily tasks when developing programs.
       • Ensure SQL statements are executed in the correct sequence and the proper tables and columns are manipulated.
         - Develop a panel where the developer can type the PLAN name and an optional date/time
         - Locate the PLAN name and the USER-ID
         - Retrieve the REDO log-records matching the criteria and produce a report and/or SQL statements to be verified by the developer or project team
         - Table names as input (besides PLAN and USER-id) are not recommended in order NOT to forget any table updates due to Referential Integrity
    18. Application Quality Assurance
       • For iterative test cases – the tables updated must be backed out in order to conduct the same test after program changes are completed:
         - LOAD REPLACE or RECOVER TORBA is a solution ?
         - If many tables or loads of data – this approach is time consuming AND other applications cannot access the tables
         - Develop a panel where the developer can type the PLAN name and date/time
         - Locate the PLAN name and the USER-ID.
         - Retrieve the log-records matching the criteria and produce the SQL statements to be executed in order to re-establish the data to the PIT immediately prior to the start of the PLAN specified.
         - Since SQL statements are created to “backout” the PLAN, these can be executed concurrently with other activity.
         - Table names as input (besides PLAN and USER-id) are not recommended in order NOT to forget any table updates due to Referential Integrity
    19. Time to look at some HORROR stories
    20. Application Recovery – Intelligent Recover ?
       • Have you ever faced a situation like this:
         - A batch job is executed with “wrong” SQL statements. Many hours later someone finds out data is missing in a table. It appears that the job by accident deleted 80K rows. Only application RI involved. Recovery not feasible due to online activity (no desire to re-enter transactions added after “the mess”).
         - All the DELETEs are on the log. We know the name of the plan used in the batch job, and we even know the table name where the deletes happened.
         - Like the previous example – we can traverse through the log and create INSERT statements based on the log-records which DB2 produced when these famous deletes were executed. This will allow us to do an “Online Recovery” with no impact.
       • WARNING – ACHTUNG :
       • Application logic based on existence of a row ?
    21. Application Recovery – Intelligent Recover
       • An even worse scenario:
         - A logic error caused a few rows to be deleted from one table.
         - Not a big deal … well
         - This table was involved in a +100 table RI structure
         - Not a big deal … well
         - This table was THE parent table with DELETE CASCADE
         - Again – this was observed hours after the “disaster” and subsequent batch jobs and online applications had done 1000’s of updates.
         - This scenario was evaluated and a choice between two evils had to be made. The lesser evil was picked: the deleted rows were retrieved from the log and re-applied
         - Necessary to investigate application logic before this attempt is made – but a complete outage for hours was avoided
    22. Disaster Recovery Scenarios
       • Prepare new application
         - About 20 new tables and 100MB table data
         - Application integrated with existing applications
         - A total of 150 tables and 1 TB data
         - IMS DB involved
       • Sunday chosen to test new application
         - 20 selected people to test the online system with real transactions and real data for a couple of hours
         - Only a few thousand DML statements estimated to be executed – but 150 tables and IMS DB involved …
         - Sunday afternoon GO or NOGO based on the test
         - If NOGO – make the “test” invisible
    23. Disaster Recovery Scenarios
       • Recover TORBA – how many hours ?
       • Management said unacceptable due to business and SLA !
       • Alternate strategy:
         - Very few updates, deletes and inserts
         - Environment was “isolated” to only allow testers
         - Read the log like the previous example (log read in RBA sequence):
           - INSERTS converted to DELETES
           - DELETES converted to INSERTS
           - UPDATES reversed to before image
       • Almost like the good old IMS Batch Backout
    24. Issues of concern
       • We have covered some isolated scenarios where alternate methods of recovery have been selected
       • We have seen how the DB2 log can be used for many other purposes than just recovery.
       • Now it’s time to sit down and think carefully about the consequences
    25. Trigger Issues
       • Many issues arise when triggers exist in the environment where the described procedures are used to either re-execute or back out SQL statements using SQL statements.
       • DB2 V7 does NOT log if DML is from a trigger – DB2 V8 does set a flag, but this only solves SOME issues.
       • I (you too ?) need a parameter to specify IF triggers should be executed or not.
       • The utility approach is different – except for ONLINE LOAD, all utilities skip trigger processing.
       • Important to understand what is happening when triggers are involved and the described processes are used
       • Let us look at the challenges and how these can be addressed.
    26. Trigger Issues when triggers exist
       • Procedure when executing log-generated SQL:
         - If the target environment has NO triggers, the only issue to consider is IF the SQL extracted from the log should hold SQL residing from TRIGGERED statements or only the REAL EXECUTED statements (remember this is a V8-only possibility). When reading the log, the statements originating from triggers can be bypassed.
         - If the target environment has triggers OR it’s a pre-V8 environment:
           - The challenge is – the statements generated from the log hold both the original executed SQL statements AND the triggered SQL statements
           - If nothing is done – the triggers will be fired again when executing the SQL generated from the log (double transactions)
           - Save the target environment triggers (in the sequence they were created), and drop them prior to executing the log-generated statements.
           - Upon completion : create the triggers again
           - Outage might NOT be acceptable
    27. Identity Columns and Sequences
       • Like triggers – Identity columns and usage of Sequences can complicate the scenarios of using the DB2 log as an alternate recovery method
         - DELETE statements might not be a problem.
         - UPDATE statements neither
         - INSERT statements probably the biggest challenge
           - When the target environment is not 100% identical in terms of data
           - Especially if GENERATED ALWAYS is used
           - When the target is ahead of the “next available” number
           - When RI is involved using Identity columns / Sequences
    28. Some Miscellaneous scenarios
    29. Track Violations to YOUR Standards
       • Most DB2 sites have either standards, guidelines and/or ROT like :
         - Application commit frequency (plan level)
         - Updates per table between commit points (plan level and lock escalation)
         - Number of updates per tablespace between imagecopies (extended recovery time)
       • The log holds:
         - Every DML statement
         - Commit points
         - PLAN name
         - OBID for tables
       • Traversing the log can simply maintain these counters in order to find the offenders, so outage can be minimized for:
         - Application abends / rollbacks
         - Recovery time “when” needed
    30. How to Read Log Records
       • Instrumentation facility (performance trace)
         - Requires DB2 to be up
         - Performance traces introduce overhead (for some unacceptable)
         - Volume of output can be huge
       • IBM-supplied macro DSNJSLR
         - Can be used while DB2 is inactive
       • DB2 Log Capture Exit
         - Executes in real time while DB2 is up
         - Very critical for performance
       • Highlvl.SDSNSAMP(DSNDQJ00)
         - Describes log records
       • The lazy method
         - Vendor products
    31. Wrap up
       • This presentation should have been done by a number of DB2 users around the world – I have stolen their “true life horror stories” as well as some ideas which you – the audience – hopefully can benefit from, to become even more efficient, “save the day” for your employer, and clean up the mess when it’s there
       • Enjoy the conference and have a great day
    32. Exploiting the log's many possibilities. Steen Rasmussen, CA Inc. [email_address]