Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Write-Behind Logging
Course Instructor : Dr.Jahangard
Presented By : Pouyan Rezazadeh
2
HDDs
A durable storage device
Fast sequential access to data (block-oriented)
High capacity
Low price per GB
Write-Behin...
3
SSDs
A durable storage device
Fast sequential access to data (block-oriented)
Up to 1000ร— faster than HDDs
Almost high c...
4
DRAMs
Fast access to fine-grained data (byte-level)
Up to 1000ร— faster than SSDs
Low capacity
Write-Behind Logging
oNon-...
5
NVMs
A durable storage device
Fast access to fine-grained data (byte-level)
High capacity
Write-Behind Logging
oMore exp...
6
NVM vs SSD & HDD
Synchronous writes to a large file (64 GB)
Synchronous write: Data is written in the order it is update...
7
How to use NVM in DBMS
How to use NVM in a DBMS running on a
hybrid storage hierarchy with both DRAM
and NVM?
Optimizing...
8
How to use NVM in DBMS
Using NVM only for storing the log and
managing the database still on disk
A more cost-effective ...
9
Write-Ahead Logging(WAL)
ARIES protocol, the most well-known recovery method based
on WAL
Our discussion is focused on d...
10
Write-Ahead Logging(WAL)
Write-Behind Logging
Log
Checkpoint
Table Heap
Volatile Storage
Durable Storage
11
Write-Ahead Logging(WAL)
The DBMS records the versioning metadata alongside the
tuple data
Determine whether a tuple ve...
12
Tuple Version Meta-data
Write-Behind Logging
Tuple ID Txn ID Begin CTS End CTS Prev V Data
101
102
103
-
-
-
1001
1001
...
13
Structure of WAL Record
Write-Behind Logging
Checksum LSN
Log Record
Type
Transaction
Commit
Timestamp
Table Id
Insert
...
14
Dirty Page Table (DPT)
A disk-oriented DBMS maintains two meta-data tables at
runtime that it uses for recovery
Dirty p...
15
Active transaction table (ATT)
The second table tracks the status of the running transactions
This table records the LS...
16
WAL Recovery Protocol
Write-Behind Logging
Oldest Active
Transaction
Earliest Redo
Log Record
Checkpoint End of Log
Log...
17
Group Commit
Most DBMSs use group commit to minimize the I/O
overhead
It batches the log records for a group of transac...
18
Disadvantage of WAL
Although WAL supports efficient transaction processing
when durable storage cannot support fast ran...
19
Write-Behind Logging(WBL)
The changes made by transactions are guaranteed to be
already present on durable storage befo...
20
Write-Behind Logging(WBL)
It records the timestamp of the latest committed transaction
whose changes and updates of pri...
21
Write-Behind Logging(WBL)
Ignores the changes of the transactions whose commit
timestamp is later than ๐‘ ๐‘ and earlier ...
22
Structure of WBL Record
Write-Behind Logging
Checksum LSN Log Record Type
Persisted Commit
Timestamp( )
Dirty Commit
Ti...
23
Write-Behind Logging(WBL)
Write-Behind Logging
Log
Table Heap
Volatile Storage
Durable Storage
Table Heap
24
Write-Behind Logging(WBL)
For long running transactions that span a group commit, the
DBMS also records their commit ti...
25
Write-Behind Logging(WBL)
To reduce overhead, the DBMSโ€™s garbage collector thread
periodically scans the database
Garba...
26
WBL Commit Timestamp Gaps
Write-Behind Logging
Time
Group Commit
Gaps{ } {(101,199)} {(101,199)}
{(101,199),
(301,399)}...
27
WBL Recovery Protocol
Write-Behind Logging
Checkpoint End of Log
Log (Time)
Analysis
28
WBL Recovery Protocol
Each WBL log record contains all the information needed for
recovery:
The list of commit timestam...
29
Features of the System
Intel Labโ€™s hardware emulator that contains two Intel Xeon
E5-4620 CPUs (2.6 GHz)
Each with 8 co...
30
Evaluation
Write-Behind Logging
31
Evaluation
Write-Behind Logging
32
Evaluation
Write-Behind Logging
33
Evaluation
Write-Behind Logging
34
Write-Behind Logging
Upcoming SlideShare
Loading in โ€ฆ5
×

Write behind logging

834 views

Published on

A look at write behind logging to find out what it does

Published in: Data & Analytics
  • Be the first to comment

Write behind logging

  1. 1. Write-Behind Logging Course Instructor : Dr.Jahangard Presented By : Pouyan Rezazadeh
  2. 2. 2 HDDs A durable storage device Fast sequential access to data (block-oriented) High capacity Low price per GB Write-Behind Logging Slow for random access to data
  3. 3. 3 SSDs A durable storage device Fast sequential access to data (block-oriented) Up to 1000ร— faster than HDDs Almost high capacity Write-Behind Logging oSlow for random access to data oFixed number of times to be written to o3-10ร— more expensive per GB vs HDDs
  4. 4. 4 DRAMs Fast access to fine-grained data (byte-level) Up to 1000ร— faster than SSDs Low capacity Write-Behind Logging oNon-Volatile oMore expensive per GB vs HDDs
  5. 5. 5 NVMs A durable storage device Fast access to fine-grained data (byte-level) High capacity Write-Behind Logging oMore expensive per GB vs HDDs
  6. 6. 6 NVM vs SSD & HDD Synchronous writes to a large file (64 GB) Synchronous write: Data is written in the order it is updated Write-Behind Logging
  7. 7. 7 How to use NVM in DBMS How to use NVM in a DBMS running on a hybrid storage hierarchy with both DRAM and NVM? Optimizing the storage methods for NVM Improves both the DBMS performance and the lifetime of the storage device However, cannot be employed in a hybrid storage hierarchy, as they target an NVM-only system Write-Behind Logging
  8. 8. 8 How to use NVM in DBMS Using NVM only for storing the log and managing the database still on disk A more cost-effective solution Only leverages the low-latency sequential writes of NVM Does not exploit its ability to support random writes and fine-grained data access It is better to employ logging and recovery algorithms that are designed for NVM Write-Behind Logging Write-Behind Logging
  9. 9. 9 Write-Ahead Logging(WAL) ARIES protocol, the most well-known recovery method based on WAL Our discussion is focused on disk-oriented DBMSs that use the multi-version concurrency control (MVCC) During normal operations, the DBMS records transactionsโ€™ modifications in a durable log before transferring data to database in disk To recover a database after a restart, the DBMS periodically takes checkpoints at runtime Write-Behind Logging
  10. 10. 10 Write-Ahead Logging(WAL) Write-Behind Logging Log Checkpoint Table Heap Volatile Storage Durable Storage
  11. 11. 11 Write-Ahead Logging(WAL) The DBMS records the versioning metadata alongside the tuple data Determine whether a tuple version is visible to a transaction When a transaction starts, the DBMS assigns it a unique transaction identifier from a monotonically increasing global counter When a transaction commits, the DBMS assigns it a unique commit timestamp by incrementing the timestamp of the last committed transaction Write-Behind Logging
  12. 12. 12 Tuple Version Meta-data Write-Behind Logging Tuple ID Txn ID Begin CTS End CTS Prev V Data 101 102 103 - - - 1001 1001 1002 โˆž โˆž โˆž - - 101 Txn ID: The transaction identifier that currently holds a latch on the tuple Begin CTS: The commit timestamp from which the tuple becomes visible End CTS: The commit timestamp after which the tuple is not visible Prev V: Reference to the previous version (if any) of the tuple A tuple is visible to a transaction if and only if its last visible commit timestamp falls within the BeginCTS and EndCTS fields of the tuple. 1002 305 302-
  13. 13. 13 Structure of WAL Record Write-Behind Logging Checksum LSN Log Record Type Transaction Commit Timestamp Table Id Insert Location Delete Location Before Image After Image
  14. 14. 14 Dirty Page Table (DPT) A disk-oriented DBMS maintains two meta-data tables at runtime that it uses for recovery Dirty page table (DPT) & Active transaction table (ATT) DPT contains the modified pages that are in DRAM but have not been propagated to durable storage Each modified page has an entry in the DPT that marks the log recordโ€™s LSN of the oldest transaction that modified it To restore the page (Redo) during recovery Write-Behind Logging
  15. 15. 15 Active transaction table (ATT) The second table tracks the status of the running transactions This table records the LSN of the latest log record of all active transactions To undo the changes during recovery Write-Behind Logging
  16. 16. 16 WAL Recovery Protocol Write-Behind Logging Oldest Active Transaction Earliest Redo Log Record Checkpoint End of Log Log (Time) Analysis Redo Undo During the redo phase, the DBMS skips replaying the log records associated with uncommitted transactions.
  17. 17. 17 Group Commit Most DBMSs use group commit to minimize the I/O overhead It batches the log records for a group of transactions in a buffer Then flushes them together with a single write to durable storage Improves the transactional throughput Write-Behind Logging
  18. 18. 18 Disadvantage of WAL Although WAL supports efficient transaction processing when durable storage cannot support fast random writes, it is inefficient for NVM storage The DBMS first records the tupleโ€™s contents in the log, and it later propagates the change to the database The logging algorithm can avoid this unnecessary data duplication Write-Behind Logging
  19. 19. 19 Write-Behind Logging(WBL) The changes made by transactions are guaranteed to be already present on durable storage before they commit The recovery algorithm can scan the entire database to identify the dirty modifications This is prohibitively expensive and increases the recovery time Tracking two commit timestamps in the log Write-Behind Logging
  20. 20. 20 Write-Behind Logging(WBL) It records the timestamp of the latest committed transaction whose changes and updates of prior transactions are safely persisted on durable storage (๐‘ ๐‘). It records the commit timestamp that the DBMS promises to not assign to any transaction before the next group commit finishes (๐‘ ๐‘‘, where ๐‘ ๐‘ < ๐‘ ๐‘‘). When the DBMS restarts after a failure, it considers all the transactions with commit timestamps earlier than ๐‘ ๐‘ as committed Write-Behind Logging
  21. 21. 21 Write-Behind Logging(WBL) Ignores the changes of the transactions whose commit timestamp is later than ๐‘ ๐‘ and earlier than ๐‘ ๐‘‘ In other words, if a tupleโ€™s begin timestamp falls within the (๐‘ ๐‘, ๐‘ ๐‘‘) pair Then the DBMSโ€™s transaction manager ensures that it is not visible to any transaction that is executed after recovery Write-Behind Logging
  22. 22. 22 Structure of WBL Record Write-Behind Logging Checksum LSN Log Record Type Persisted Commit Timestamp( ) Dirty Commit Timestamp( )
  23. 23. 23 Write-Behind Logging(WBL) Write-Behind Logging Log Table Heap Volatile Storage Durable Storage Table Heap
  24. 24. 24 Write-Behind Logging(WBL) For long running transactions that span a group commit, the DBMS also records their commit timestamps in the log (๐‘ ๐‘, ๐‘ ๐‘‘) pair commit timestamp gap The set of commit timestamp gaps that the DBMS needs to track increases on every system failure Write-Behind Logging
  25. 25. 25 Write-Behind Logging(WBL) To reduce overhead, the DBMSโ€™s garbage collector thread periodically scans the database Garbage collector undoes the dirty modifications associated with the currently present gaps Once all the modifications in a gap have been removed by the garbage collector The DBMS stops checking for the gap in tuple visibility checks and no longer records it in the log Write-Behind Logging
  26. 26. 26 WBL Commit Timestamp Gaps Write-Behind Logging Time Group Commit Gaps{ } {(101,199)} {(101,199)} {(101,199), (301,399)} { } Garbage Collection 101 199
  27. 27. 27 WBL Recovery Protocol Write-Behind Logging Checkpoint End of Log Log (Time) Analysis
  28. 28. 28 WBL Recovery Protocol Each WBL log record contains all the information needed for recovery: The list of commit timestamp gaps The commit timestamps of long running transactions that span across a group commit operation The DBMS only needs to retrieve this information during the analysis phase of the recovery process Write-Behind Logging
  29. 29. 29 Features of the System Intel Labโ€™s hardware emulator that contains two Intel Xeon E5-4620 CPUs (2.6 GHz) Each with 8 cores and a 20 MB L3 cache 256 GB of DRAM Dedicates 128 GB of DRAM for the emulated NVM HDD: Seagate Barracuda (3 TB) SSD: Intel DC S3700 (400 GB) Write-Behind Logging
  30. 30. 30 Evaluation Write-Behind Logging
  31. 31. 31 Evaluation Write-Behind Logging
  32. 32. 32 Evaluation Write-Behind Logging
  33. 33. 33 Evaluation Write-Behind Logging
  34. 34. 34 Write-Behind Logging

ร—