
Transaction unit 1 topic 4


Published in: Technology

  1. 1. Crash Recovery <ul><li>The recovery manager is responsible for ensuring two properties: atomicity and durability </li></ul><ul><li>Atomicity is ensured by undoing the actions of transactions that do not commit </li></ul><ul><li>Durability is ensured by making sure that all actions of committed transactions survive system crashes and media failures </li></ul><ul><li>A system crash may be caused by, for example, a bus error or an OS failure </li></ul><ul><li>A media failure occurs when, for example, a disk is corrupted </li></ul>
  2. 2. Logging <ul><li>Basic Idea: Logging </li></ul><ul><li>Log: An ordered list of REDO/UNDO actions </li></ul><ul><li>Record REDO and UNDO information, for every update, in a log. </li></ul><ul><li>Writes to the log are sequential (put the log on a separate disk). Only minimal info (the diff) is written to the log, so multiple updates fit in a single log page. </li></ul><ul><li>Log record contains: <, pageID, offset, length, old data, new data>  and additional control info </li></ul>
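The log record layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `UpdateLogRecord`, the helper names `redo` and `undo`, and the `lsn` and `trans_id` fields are hypothetical (`trans_id` borrows the transID field mentioned on a later slide), and a page is modelled as a plain `bytearray`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateLogRecord:
    """Hypothetical in-memory form of one update log record."""
    lsn: int          # assumed control info: position of the record in the log
    trans_id: str     # assumed: id of the transaction that made the update
    page_id: int
    offset: int       # where in the page the change starts
    length: int       # number of bytes changed
    old_data: bytes   # before-image, used for UNDO
    new_data: bytes   # after-image, used for REDO


def redo(page: bytearray, rec: UpdateLogRecord) -> None:
    """Reapply the change by writing the after-image into the page."""
    page[rec.offset:rec.offset + rec.length] = rec.new_data


def undo(page: bytearray, rec: UpdateLogRecord) -> None:
    """Reverse the change by writing the before-image into the page."""
    page[rec.offset:rec.offset + rec.length] = rec.old_data


page = bytearray(b"hello world")
rec = UpdateLogRecord(lsn=10, trans_id="T1", page_id=5, offset=6, length=5,
                      old_data=b"world", new_data=b"ARIES")
redo(page, rec)
after_redo = bytes(page)
undo(page, rec)
after_undo = bytes(page)
```

Only the changed bytes are stored, which is why several such records can fit in one log page.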
  3. 3. Introduction to ARIES <ul><li>ARIES is a recovery algorithm designed to work with a steal, no-force approach. Steal means that a page modified by an uncommitted transaction may be written to disk; no-force means that the pages modified by a transaction are not forced to disk when it commits, so some of these changes may not have been written to disk at the time of a subsequent crash. When the recovery manager is invoked after a crash, restart proceeds in 3 phases: </li></ul><ul><ul><li>Analysis: identifies dirty pages in the buffer pool and transactions that were active at the time of the crash </li></ul></ul><ul><ul><li>Redo: repeats all actions, starting from an appropriate point in the log, and restores the database state to what it was at the time of the crash </li></ul></ul><ul><ul><li>Undo: undoes the actions of transactions that did not commit, so that the database reflects only the actions of committed transactions </li></ul></ul>
  4. 4. No force and Steal approach
  5. 5. ARIES contd… <ul><li>ARIES is a state-of-the-art recovery method </li></ul><ul><ul><li>Incorporates numerous optimizations to reduce overheads during normal processing and to speed up recovery </li></ul></ul><ul><li>ARIES uses: </li></ul><ul><ul><li>Log sequence numbers (LSNs) to identify log records </li></ul></ul><ul><ul><ul><li>Stores LSNs in pages to identify what updates have already been applied to a database page </li></ul></ul></ul><ul><ul><li>Physiological redo </li></ul></ul><ul><ul><li>A dirty page table to avoid unnecessary redos during recovery </li></ul></ul><ul><ul><li>Fuzzy checkpointing that only records information about dirty pages, and does not require dirty pages to be written out at checkpoint time </li></ul></ul>
  6. 6. ARIES Recovery Algorithm <ul><li>Database transaction recovery focuses on the different methods used to recover a database from an inconsistent state to a consistent state by using the data in the transaction log </li></ul><ul><li>Four important concepts affect the recovery process: </li></ul><ul><ul><ul><li>Write-ahead log protocol: transaction log records are written before any database data are actually updated. This protocol ensures that, in case of a failure, the database can later be recovered to a consistent state using the data in the transaction log </li></ul></ul></ul><ul><ul><ul><li>Redundant transaction logs: most DBMSs keep several copies of the transaction log to ensure that a physical disk failure will not impair the DBMS’s ability to recover data </li></ul></ul></ul><ul><ul><ul><li>Database buffers: a buffer is an area in primary memory used to speed up disk operations. To improve processing time, the DBMS software reads the data from the physical disk and stores a copy of it in a buffer in primary memory. </li></ul></ul></ul><ul><ul><ul><ul><li>When a transaction updates data, it updates the copy held in the buffer </li></ul></ul></ul></ul><ul><ul><ul><li>Database checkpoints: a checkpoint is an operation in which the DBMS writes all of its updated buffers to disk. Checkpoints are automatically scheduled by the DBMS several times per hour. </li></ul></ul></ul><ul><ul><ul><ul><li>Checkpoints play an important role in transaction recovery </li></ul></ul></ul></ul><ul><ul><ul><li>The database recovery process involves bringing the database to a consistent state after a failure </li></ul></ul></ul>
  7. 7. Write-Ahead Logging (WAL) <ul><li>The Write-Ahead Logging Protocol: </li></ul><ul><ul><li>Must force the log record for an update before the corresponding data page gets to disk. </li></ul></ul><ul><ul><li>Must write all log records before commit. </li></ul></ul><ul><li>#1 guarantees Atomicity </li></ul><ul><li>#2 guarantees Durability. </li></ul>
  8. 8. WAL & the Log <ul><li>Each log record has a unique Log Sequence Number (LSN). </li></ul><ul><li>LSNs always increasing. </li></ul><ul><li>Each data page contains a pageLSN. </li></ul><ul><li>The LSN of the most recent log record for an update to that page. </li></ul><ul><li>System keeps track of flushedLSN. </li></ul><ul><li>The max LSN flushed so far. </li></ul>
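The two WAL rules can be expressed in terms of pageLSN and flushedLSN: a page may go to disk only once pageLSN is at most flushedLSN, and a commit returns only after the commit record itself is flushed. The sketch below is a toy model under stated assumptions: the `Log` class, `write_page_to_disk`, `commit`, and the dictionary used as a page are all hypothetical names, and "flushing" just advances a counter instead of doing disk I/O.

```python
class Log:
    """Toy log that only tracks LSN counters (no real I/O)."""

    def __init__(self):
        self.next_lsn = 1      # LSNs are handed out in increasing order
        self.flushed_lsn = 0   # max LSN assumed to be on stable storage

    def append(self):
        lsn = self.next_lsn
        self.next_lsn += 1
        return lsn

    def flush(self, up_to_lsn):
        # Stand-in for forcing the log tail up to and including up_to_lsn.
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)


def write_page_to_disk(page, log):
    # WAL rule 1: the log record for the latest update to this page
    # must be on stable storage before the page itself is written.
    if page["pageLSN"] > log.flushed_lsn:
        log.flush(page["pageLSN"])
    page["on_disk"] = True


def commit(log):
    # WAL rule 2: all log records of the transaction, including the
    # commit record, are forced before commit is acknowledged.
    commit_lsn = log.append()
    log.flush(commit_lsn)
    return commit_lsn


log = Log()
page = {"pageLSN": log.append(), "on_disk": False}  # update logged at LSN 1
log.append()                                        # unrelated record, LSN 2
write_page_to_disk(page, log)
flushed_after_page_write = log.flushed_lsn
commit_lsn = commit(log)
```

Note that writing the page only forces the log up to that page's pageLSN; later records can stay in the log tail until a commit or another page write needs them.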
  9. 9. The log <ul><li>The log is also known as the trail or journal. It is a history of actions executed by the DBMS. The log is a file of records kept in stable storage, which is assumed to survive crashes </li></ul><ul><li>This durability is achieved by maintaining two or more copies of the log on different disks, so that the chance of all copies being lost at once is negligible </li></ul><ul><li>The most recent portion of the log, called the log tail, is kept in main memory </li></ul><ul><li>Every log record is given a unique id called the log sequence number (LSN). As with any record id, we can fetch a log record with one disk access given the LSN </li></ul><ul><li>Further, LSNs are assigned in monotonically increasing order; this is required for the ARIES recovery algorithm </li></ul><ul><li>If the log is a sequential file, in principle growing indefinitely, the LSN can simply be the address of the first byte of the log record </li></ul><ul><li>Various techniques are used to identify portions of the log that are ‘too old’ to be needed again, which bounds the amount of stable storage used for the log </li></ul><ul><li>For recovery purposes, every page in the database contains the LSN of the most recent log record that describes a change to that page; this LSN is called the pageLSN </li></ul><ul><li>Every log record has the fields prevLSN, transID (the ID of the transaction generating the log record) and type </li></ul><ul><li>The log records of a given transaction are maintained as a linked list, traversed backwards through the prevLSN field; this list is updated whenever a log record is added </li></ul>
  10. 10. Log record with actions <ul><li>Updating a page: after the page is modified, an update type record is appended to the log tail </li></ul><ul><li>Commit: when a transaction decides to commit, it force-writes a commit type log record containing the transaction id; that is, the record is appended to the log and the log tail is written to stable storage, up to and including the commit record </li></ul><ul><li>Abort: when a transaction is aborted, an abort type log record containing the transaction id is appended to the log </li></ul><ul><li>End: once all the work needed to finish a committed or aborted transaction is complete, an end type log record containing the transaction id is appended to the log </li></ul><ul><li>Undoing an update: when a transaction is rolled back, its updates are undone. Each time the action described by an update log record is undone, a compensation log record (CLR) is written </li></ul>
  11. 11. Update log Record & Compensation Log Record(CLR) <ul><li>Update log record: the pageID field is the page id of the modified page; the length in bytes and the offset of the change are also included. </li></ul><ul><li>The before-image is the value of the changed bytes before the change </li></ul><ul><li>The after-image is the value after the change </li></ul><ul><li>An update log record that contains both the before-image and the after-image can be used to redo the change and to undo it </li></ul><ul><li>A redo-only update log record contains just the after-image; an undo-only update record contains just the before-image </li></ul><ul><li>CLR: a compensation log record C is written just before the change recorded in an update log record U is undone </li></ul><ul><li>Such an undo can happen during normal system execution when a transaction is aborted, or during recovery from a crash </li></ul><ul><li>C describes the action taken to undo the action recorded in the corresponding update log record and is appended to the log tail just like any other log record </li></ul><ul><li>C contains a field called undoNextLSN, which is the LSN of the next log record that is to be undone for the transaction that wrote update record U; this field in C is set to the value of prevLSN in U. </li></ul>
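The relationship between an update record U and its CLR C can be sketched as follows. This is a minimal illustration, not a full log manager: `LogRecord`, `make_clr`, and the field names are hypothetical, and the CLR is modelled as a redo-only record whose after-image is U's before-image.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int
    prev_lsn: Optional[int]   # previous record of the same transaction
    trans_id: str
    kind: str                 # "update" or "CLR" in this sketch
    page_id: int
    offset: int
    before: bytes = b""       # before-image (empty for a CLR)
    after: bytes = b""        # after-image
    undo_next_lsn: Optional[int] = None  # only meaningful for a CLR


def make_clr(update: LogRecord, new_lsn: int, last_lsn: int) -> LogRecord:
    """Build the CLR that describes undoing `update`.

    `last_lsn` is the transaction's most recent log record, so the CLR
    is chained into the same prevLSN list as every other record.
    """
    return LogRecord(
        lsn=new_lsn,
        prev_lsn=last_lsn,
        trans_id=update.trans_id,
        kind="CLR",
        page_id=update.page_id,
        offset=update.offset,
        after=update.before,            # undoing restores the old bytes
        undo_next_lsn=update.prev_lsn,  # next record to undo for this transaction
    )


u = LogRecord(lsn=20, prev_lsn=10, trans_id="T1", kind="update",
              page_id=5, offset=0, before=b"old", after=b"new")
clr = make_clr(u, new_lsn=30, last_lsn=20)
```

Because undoNextLSN skips past U, a crash during rollback never causes U to be undone a second time.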
  12. 12. Other Recovery Related Structures <ul><li>In addition to the log, the following two tables contain important recovery-related information: </li></ul><ul><ul><li>Transaction table: it contains one entry for each active transaction. The entry contains the transaction id, the status, and a field called lastLSN, which is the LSN of the most recent log record for this transaction. The status records whether the transaction is in progress, committed, or aborted </li></ul></ul><ul><ul><li>Dirty page table: it contains one entry for each dirty page in the buffer pool. The entry contains a field recLSN, which is the LSN of the first log record that caused the page to become dirty. This LSN identifies the earliest log record that might have to be redone for this page during restart from a crash </li></ul></ul><ul><ul><li>During normal operation, these tables are maintained by the transaction manager and the buffer manager respectively. </li></ul></ul><ul><ul><li>During restart after a crash, these tables are reconstructed in the Analysis phase of restart </li></ul></ul>
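A rough sketch of how the two tables evolve during normal operation is given below. The function names (`on_update`, `on_page_flushed`, `on_end`) and the dictionary layout are assumptions made for illustration; dropping a page's entry once the page is written to disk is likewise an assumption of this sketch.

```python
transaction_table = {}   # trans_id -> {"status": ..., "lastLSN": ...}
dirty_page_table = {}    # page_id  -> recLSN


def on_update(lsn, trans_id, page_id):
    """Bookkeeping after an update log record with this LSN is appended."""
    transaction_table[trans_id] = {"status": "in progress", "lastLSN": lsn}
    # recLSN is set only when the page first becomes dirty; later
    # updates to an already dirty page leave it unchanged.
    dirty_page_table.setdefault(page_id, lsn)


def on_page_flushed(page_id):
    """Assumed behaviour: a page written to disk is no longer dirty."""
    dirty_page_table.pop(page_id, None)


def on_end(trans_id):
    """An end record means the transaction is no longer active."""
    transaction_table.pop(trans_id, None)


on_update(10, "T1", page_id=5)
on_update(20, "T2", page_id=5)   # page 5 already dirty: recLSN stays 10
on_update(30, "T1", page_id=7)
snapshot_dpt = dict(dirty_page_table)
on_page_flushed(5)
on_end("T2")
```

The key asymmetry is that lastLSN always moves forward with each new record, while recLSN is pinned to the first record that dirtied the page.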
  13. 13. Write ahead log protocol <ul><li>WAL is the fundamental rule that ensures a record of every change to the database is available when attempting to recover from a crash </li></ul><ul><li>When a transaction commits, the log tail is forced to stable storage, even if a no-force approach is being used (no-force means that some of the transaction's changes may not have been written to disk at the time of a subsequent crash) </li></ul><ul><li>If a force approach is used, all the pages modified by the transaction, rather than a portion of the log that includes all its records, must be forced to disk when the transaction commits. </li></ul><ul><li>The set of all changed pages is typically much larger than the log tail, because the size of an update log record is close to (twice) the size of the changed bytes, which is usually much smaller than the page size </li></ul><ul><li>The log is maintained as a sequential file, hence all writes to it are sequential </li></ul><ul><li>Consequently, the cost of forcing the log tail is much smaller than the cost of writing all changed pages to disk </li></ul>
  14. 14. Checkpointing <ul><li>A checkpoint is like a snapshot of the DBMS state. By taking checkpoints periodically, the DBMS can reduce the work to be done during restart in the event of a subsequent crash </li></ul><ul><li>Checkpointing in ARIES has 3 steps: writing a begin_checkpoint record, writing an end_checkpoint record, and writing a master record </li></ul><ul><li>A begin_checkpoint record is written to indicate when the checkpoint starts </li></ul><ul><li>An end_checkpoint record is constructed, including in it the current contents of the transaction table and the dirty page table, and appended to the log </li></ul><ul><li>The third step is executed after the end_checkpoint record is written to stable storage </li></ul><ul><li>A special master record containing the LSN of the begin_checkpoint log record is written to a known place on stable storage. </li></ul><ul><li>While the end_checkpoint record is being constructed, the DBMS continues executing transactions and writing other log records, which is why this kind of checkpoint is called a fuzzy checkpoint </li></ul>
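The three checkpoint steps can be sketched as below. This is a simplified model under several assumptions: the log is a Python list, a record's LSN is its 1-based position in that list, `take_checkpoint` and the `master` dictionary are hypothetical names, and no other log records are interleaved between the two checkpoint records (a real system would allow that).

```python
import copy


def take_checkpoint(log, transaction_table, dirty_page_table, master):
    """Append begin/end checkpoint records and update the master record."""
    # Step 1: mark where the checkpoint starts.
    begin_lsn = len(log) + 1
    log.append({"lsn": begin_lsn, "type": "begin_checkpoint"})

    # Step 2: record copies of both tables in the end_checkpoint record.
    # Deep copies keep the logged snapshot independent of later updates.
    log.append({
        "lsn": len(log) + 1,
        "type": "end_checkpoint",
        "transaction_table": copy.deepcopy(transaction_table),
        "dirty_page_table": copy.deepcopy(dirty_page_table),
    })

    # Step 3: only after the end_checkpoint record is (assumed) stable,
    # remember the LSN of the begin_checkpoint record in a known place.
    master["checkpoint_lsn"] = begin_lsn
    return begin_lsn


log = [{"lsn": 1, "type": "update"},
       {"lsn": 2, "type": "update"},
       {"lsn": 3, "type": "update"}]
tt = {"T1": {"status": "in progress", "lastLSN": 3}}
dpt = {5: 1}
master = {}
begin_lsn = take_checkpoint(log, tt, dpt, master)
tt["T1"]["lastLSN"] = 99   # later activity must not alter the logged snapshot
```

No dirty page is written out here, matching the point that a fuzzy checkpoint records information about dirty pages rather than flushing them.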
  15. 15. Checkpointing <ul><li>Checkpointing is done as follows: </li></ul><ul><ul><li>Output all log records in memory to stable storage </li></ul></ul><ul><ul><li>Output to disk all modified buffer blocks </li></ul></ul><ul><ul><li>Output to log on stable storage a < checkpoint L > record. </li></ul></ul><ul><li>Transactions are not allowed to perform any actions while checkpointing is in progress. </li></ul><ul><li>Fuzzy checkpointing allows transactions to progress while the most time consuming parts of checkpointing are in progress </li></ul><ul><ul><li>Performed as described on next slide </li></ul></ul>
  16. 16. Fuzzy checkpointing… <ul><li>Fuzzy checkpointing is done as follows: </li></ul><ul><ul><li>Temporarily stop all updates by transactions </li></ul></ul><ul><ul><li>Write a < checkpoint L > log record and force log to stable storage </li></ul></ul><ul><ul><li>Note list M of modified buffer blocks </li></ul></ul><ul><ul><li>Now permit transactions to proceed with their actions </li></ul></ul><ul><ul><li>Output to disk all modified buffer blocks in list M </li></ul></ul><ul><ul><ul><li>Blocks should not be updated while being output </li></ul></ul></ul><ul><ul><ul><li>Follow WAL: all log records pertaining to a block must be output before the block is output </li></ul></ul></ul><ul><ul><li>Store a pointer to the checkpoint record in a fixed position last_checkpoint on disk </li></ul></ul><ul><li>When recovering using a fuzzy checkpoint, start scan from the checkpoint record pointed to by last_checkpoint </li></ul><ul><ul><li>Log records before last_checkpoint have their updates reflected in database on disk, and need not be redone. </li></ul></ul><ul><ul><li>Incomplete checkpoints, where system had crashed while performing checkpoint, are handled safely </li></ul></ul>
  17. 17. Some more notes on checkpointing.. <ul><li>Periodically, the DBMS creates a checkpoint in order to minimize the time taken to recover in the event of a system crash. It writes to the log: </li></ul><ul><li>begin_checkpoint record: indicates when the checkpoint began. </li></ul><ul><li>end_checkpoint record: includes the current contents of the transaction table and the dirty page table </li></ul><ul><li>This is a ‘fuzzy checkpoint’: transactions continue to run, so these tables are accurate only as of the time of the begin_checkpoint record. </li></ul><ul><li>No attempt is made to force dirty pages to disk; the effectiveness of a checkpoint is limited by the oldest unwritten change to a dirty page. (So it’s a good idea to periodically flush dirty pages to disk!) </li></ul><ul><li>Store the LSN of the checkpoint record in a safe place (the master record). </li></ul>
  18. 18. Recovering from system crash <ul><li>When a system is restarted after a crash, the recovery manager proceeds in 3 phases: </li></ul><ul><ul><li>Analysis: examines the most recent begin_checkpoint record, whose LSN is denoted by C, and identifies the pages that might have been dirty and the transactions that were active at the time of the crash </li></ul></ul><ul><ul><li>Redo: follows Analysis and redoes all the changes to any page that might have been dirty at the time of the crash; this set of pages and the starting point for Redo are determined during Analysis </li></ul></ul><ul><ul><li>Undo: follows Redo and undoes the changes of all transactions active at the time of the crash. This set of transactions is identified during the Analysis phase </li></ul></ul><ul><ul><li>Redo reapplies changes in the order in which they were originally carried out; Undo reverses the changes in the opposite order, reversing the most recent changes first </li></ul></ul>
  19. 19. Analysis phase <ul><li>It performs 3 tasks: </li></ul><ul><ul><li>It determines the point in the log at which to start the Redo pass </li></ul></ul><ul><ul><li>It determines the pages in the buffer pool that were dirty at the time of the crash </li></ul></ul><ul><ul><li>It identifies the transactions that were active at the time of the crash and must be undone </li></ul></ul><ul><ul><li>This phase begins by examining the most recent begin_checkpoint log record and initializing the dirty page table and transaction table to the copies of those structures in the next end_checkpoint record </li></ul></ul><ul><ul><li>Thus these tables are initialized to the set of dirty pages and active transactions at the time of the checkpoint; Analysis then scans the log forward to its end, updating both tables as it goes </li></ul></ul>
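The Analysis phase can be sketched as a single forward scan. The slide only describes how the tables are initialized; the per-record scan rules below follow the usual textbook description of ARIES and should be read as an assumption of this sketch, as should the record layout (dictionaries with `lsn`, `type`, `trans_id`, `page_id`) and the function name `analysis`.

```python
def analysis(log, begin_ckpt_lsn):
    """Rebuild the transaction table and dirty page table after a crash."""
    tail = [r for r in log if r["lsn"] >= begin_ckpt_lsn]

    # Initialize both tables from the end_checkpoint record that follows
    # the most recent begin_checkpoint record.
    end_ckpt = next(r for r in tail if r["type"] == "end_checkpoint")
    tt = {t: dict(e) for t, e in end_ckpt["transaction_table"].items()}
    dpt = dict(end_ckpt["dirty_page_table"])

    for rec in tail:
        kind = rec["type"]
        if kind in ("begin_checkpoint", "end_checkpoint"):
            continue
        if kind == "end":
            tt.pop(rec["trans_id"], None)      # transaction is finished
            continue
        entry = tt.setdefault(rec["trans_id"],
                              {"status": "in progress", "lastLSN": None})
        entry["lastLSN"] = rec["lsn"]
        if kind == "commit":
            entry["status"] = "committed"
        if kind in ("update", "CLR"):
            dpt.setdefault(rec["page_id"], rec["lsn"])  # recLSN on first dirtying

    redo_start = min(dpt.values(), default=None)  # smallest recLSN
    losers = sorted(t for t, e in tt.items() if e["status"] != "committed")
    return tt, dpt, redo_start, losers


log = [
    {"lsn": 1, "type": "begin_checkpoint"},
    {"lsn": 2, "type": "end_checkpoint",
     "transaction_table": {}, "dirty_page_table": {}},
    {"lsn": 3, "type": "update", "trans_id": "T1", "page_id": 5},
    {"lsn": 4, "type": "update", "trans_id": "T2", "page_id": 3},
    {"lsn": 5, "type": "commit", "trans_id": "T2"},
    {"lsn": 6, "type": "end", "trans_id": "T2"},
    {"lsn": 7, "type": "update", "trans_id": "T3", "page_id": 5},
]
tt, dpt, redo_start, losers = analysis(log, begin_ckpt_lsn=1)
```

In this example T2 committed and ended before the crash, so only T1 and T3 remain as transactions to undo, and Redo would start at LSN 3.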
  20. 20. Redo phase <ul><li>ARIES reapplies the updates of all transactions, committed or otherwise </li></ul><ul><li>If a transaction was aborted before the crash and its updates were undone, as indicated by CLRs, the actions described in the CLRs are also reapplied </li></ul><ul><li>This reapplication mechanism distinguishes ARIES from other proposed WAL-based recovery algorithms and brings the database to the same state it was in at the time of the crash </li></ul><ul><li>The Redo phase starts with the log record that has the smallest recLSN of all pages in the dirty page table constructed by the Analysis phase, because this log record identifies the oldest update that may not have been written to disk prior to the crash </li></ul><ul><li>Starting from this log record, Redo scans forward until the end of the log </li></ul><ul><li>For each redoable log record encountered, Redo checks whether the logged action must be redone </li></ul><ul><li>The action must be redone unless one of the following conditions holds: </li></ul><ul><ul><li>The affected page is not in the dirty page table </li></ul></ul><ul><ul><li>The affected page is in the dirty page table, but the recLSN for the entry is greater than the LSN of the log record being checked </li></ul></ul><ul><ul><li>The pageLSN (stored on the page, which must be retrieved to check this condition) is greater than or equal to the LSN of the log record being checked </li></ul></ul>
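The three skip conditions translate directly into a small predicate. In this sketch the names `must_redo` and `redo_pass`, the dictionary-based records, and the `page_lsn` map standing in for the pageLSN stored on each disk page are all assumptions; actually reapplying the bytes is omitted, and setting pageLSN to the redone record's LSN follows the usual ARIES description rather than the slide.

```python
def must_redo(rec, dirty_page_table, page_lsn):
    """Return True unless one of the three skip conditions holds."""
    page_id = rec["page_id"]
    if page_id not in dirty_page_table:
        return False                       # page was not dirty at the crash
    if dirty_page_table[page_id] > rec["lsn"]:
        return False                       # page was dirtied by a later update
    if page_lsn[page_id] >= rec["lsn"]:
        return False                       # update already reached the disk page
    return True


def redo_pass(log, dirty_page_table, page_lsn):
    """Scan forward from the smallest recLSN and redo what is needed."""
    if not dirty_page_table:
        return []
    start = min(dirty_page_table.values())
    redone = []
    for rec in log:
        if rec["lsn"] < start or rec["type"] not in ("update", "CLR"):
            continue
        if must_redo(rec, dirty_page_table, page_lsn):
            # The logged change would be reapplied to the page here.
            page_lsn[rec["page_id"]] = rec["lsn"]
            redone.append(rec["lsn"])
    return redone


log = [
    {"lsn": 3, "type": "update", "page_id": 5},
    {"lsn": 4, "type": "update", "page_id": 3},
    {"lsn": 7, "type": "update", "page_id": 5},
    {"lsn": 8, "type": "update", "page_id": 9},
]
dirty_page_table = {5: 3, 3: 4}
page_lsn = {5: 3, 3: 0, 9: 8}   # pageLSN values as found on disk
redone = redo_pass(log, dirty_page_table, page_lsn)
```

The first two checks use only in-memory tables, so the page itself needs to be read only when both of them fail to rule the record out.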
  21. 21. Undo phase <ul><li>The Undo phase scans backward from the end of the log </li></ul><ul><li>The goal is to undo the actions of all transactions active at the time of the crash, i.e., to effectively abort them </li></ul><ul><li>Undo begins with the transaction table constructed by the Analysis phase, which identifies all transactions active at the time of the crash and includes the LSN of the most recent log record (the lastLSN field) for each such transaction </li></ul><ul><li>Such transactions are called loser transactions </li></ul><ul><li>All actions of losers must be undone, and they must be undone in the reverse of the order in which they appear in the log </li></ul><ul><li>Consider the set of lastLSN values for all loser transactions; call this set ToUndo </li></ul><ul><li>Undo repeatedly chooses the largest (i.e., most recent) LSN value in this set and processes it, until ToUndo is empty. To process a log record: </li></ul><ul><ul><li>If it is a CLR and its undoNextLSN value is not null, the undoNextLSN value is added to the set ToUndo. If the undoNextLSN is null, an end record is written for the transaction because it is completely undone, and the CLR is discarded </li></ul></ul><ul><ul><li>If it is an update record, a CLR is written, the corresponding action is undone, and the prevLSN value in the update log record is added to the set ToUndo </li></ul></ul><ul><ul><li>When the set ToUndo is empty, the Undo phase is complete. Restart is now complete, and the system can proceed with normal operations </li></ul></ul>
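The ToUndo loop can be sketched as follows. This is a simplified model, not a definitive implementation: `undo_phase` and the dictionary record layout are hypothetical, the page changes themselves are not applied, and the sketch writes the end record as soon as a transaction's chain runs out (including right after undoing its first update), which is an assumption about how the null case is handled.

```python
def undo_phase(log, losers_last_lsn):
    """Undo all loser transactions; return the log records that get appended."""
    by_lsn = {r["lsn"]: r for r in log}
    next_lsn = max(by_lsn) + 1
    to_undo = set(losers_last_lsn.values())
    last_lsn = dict(losers_last_lsn)   # per-transaction chain for prevLSN
    appended = []

    while to_undo:
        lsn = max(to_undo)             # most recent record first
        to_undo.remove(lsn)
        rec = by_lsn[lsn]
        tid = rec["trans_id"]

        if rec["type"] == "CLR":
            nxt = rec["undo_next_lsn"]     # skip what was already undone
        elif rec["type"] == "update":
            # Write a CLR; restoring the before-image is omitted here.
            appended.append({"lsn": next_lsn, "type": "CLR", "trans_id": tid,
                             "prev_lsn": last_lsn[tid],
                             "undo_next_lsn": rec["prev_lsn"],
                             "page_id": rec["page_id"]})
            last_lsn[tid] = next_lsn
            next_lsn += 1
            nxt = rec["prev_lsn"]
        else:
            nxt = rec["prev_lsn"]          # e.g. an abort record: follow the chain

        if nxt is not None:
            to_undo.add(nxt)
        else:
            # Nothing left to undo for this transaction.
            appended.append({"lsn": next_lsn, "type": "end", "trans_id": tid,
                             "prev_lsn": last_lsn[tid]})
            next_lsn += 1
    return appended


log = [
    {"lsn": 1, "type": "update", "trans_id": "T1", "prev_lsn": None, "page_id": 5},
    {"lsn": 2, "type": "update", "trans_id": "T2", "prev_lsn": None, "page_id": 3},
    {"lsn": 3, "type": "update", "trans_id": "T1", "prev_lsn": 1, "page_id": 7},
]
appended = undo_phase(log, {"T1": 3, "T2": 2})
summary = [(r["lsn"], r["type"], r["trans_id"]) for r in appended]
```

Always picking the largest LSN in ToUndo is what makes the updates of different losers get undone in strict reverse log order, even though each transaction is followed along its own prevLSN chain.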