WalB: Block-level WALfor Efficient Incremental Backup            Dec 2, 2010         Takashi HOSHINO         Cybozu Labs, ...
Contents• Motivation• WalB  – Architecture  – Main Algorithm  – Data Format  – Pros and Cons• Current Progress• Summary
Motivation• There is no good backup solution   –   Online   –   Small performance overhead   –   Supports various applicat...
Requirements inside Cybozu• Guarantee backup interval   – within 5-10min• Keep multiple backup archives   – also in remote...
WalB• A wrapper block device driver   – Data device to store data   – Log device to store WAL (write-ahead log)• Related u...
System Architecture with WalB                 Applications App           Database System                 File System  OS  ...
Backup vs Mirroring                             Backup             Mirroring                        Making snapshot,      ...
WalB Architecture WalB Dev                 Any Application             WalB Log Controller          (File System, DBMS, et...
WalB Functionalities• Online incremental backup• Online consistent full backup• Snapshot creation/deletion   – not accessi...
Incremental Backup with WalBPrimary Server   Backup Server 1   Backup Server 2     LV1           LV1 bkp                  ...
Log Transfer  Primary Server                Backup Server 1                Backup Server 2      Log 1                     ...
Consistent Full Backup with WalB              Primary Server             Backup Server                               (A)  ...
Read/Write Algorithm (1)Request Queue                                   Write   •Generate logpack header                  ...
Read/Write Algorithm (2)Request Queue                                  Write     •Generate logpack header                 ...
Parallel Task Processing             Request              Queue                 • Data write must start after log write co...
WalB Data Format• Data device  – The same image as wrapping block device• Log device  – Overview  – Snapshot metadata  – R...
Log Device Format                                                                          Address             Snapshot   ...
Snapshot MetadataSnapshotMetadata                                               …                 Snapshot Header         ...
Ring BufferRing buffer                                        start_offset                                                ...
Logpack HeaderLogpack Header      u32      u16          u16                 checksum   # of IO   Total IO size            ...
Pros and Cons                WalB              Snapshot              Snapshot             (redo log)          with redo lo...
Current Progress•   Survey and study linux kernel programming•   Design of rough architecture•   Prototype implementation•...
Summary• Motivaion  – No good backup solution    covering various applications• WalB  – Is a block device driver with WAL ...
Upcoming SlideShare
Loading in …5
×

WalB: Block-level WAL. Concept.

871 views

Published on

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
871
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
6
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

WalB: Block-level WAL. Concept.

  1. 1. WalB: Block-level WALfor Efficient Incremental Backup Dec 2, 2010 Takashi HOSHINO Cybozu Labs, Inc.
  2. 2. Contents• Motivation• WalB – Architecture – Main Algorithm – Data Format – Pros and Cons• Current Progress• Summary
  3. 3. Motivation• There is no good backup solution – Online – Small performance overhead – Supports various applications – Cost-effective• We need – Consistent full backup – Incremental backup – Block-level backup – To use commodity hardware and free software only
  4. 4. Requirements inside Cybozu• Guarantee backup interval – within 5-10min• Keep multiple backup archives – also in remote site• Keep multiple snapshots – per 1day for 1week
  5. 5. WalB• A wrapper block device driver – Data device to store data – Log device to store WAL (write-ahead log)• Related user-land tools – Device controller – Log extractor• Target OS and architecture – Latest Linux kernel (2.6.36) – x86_64 host
  6. 6. System Architecture with WalB Applications App Database System File System OS WalB For Backup Software RAID1 For Mirroring Storage Volume0 Volume1
  7. 7. Backup vs Mirroring Backup Mirroring Making snapshot, RAID1, Typical methods Logging writes Replication Operation miss,Recoverable failures Failure of facilities Application bugCan keep latest data? No Yes We need both functionalities to save data from lost.
  8. 8. WalB Architecture WalB Dev Any Application WalB Log Controller (File System, DBMS, etc) ExtractorControl Read Write LogWrapperBlock Device(WalB Dev) Any Block Device Any Block Device for Data (Data device) for Log (Log device) Not special format An original format
  9. 9. WalB Functionalities• Online incremental backup• Online consistent full backup• Snapshot creation/deletion – not accessible due to no index• Volume resize – not resize of log capacity
  10. 10. Incremental Backup with WalBPrimary Server Backup Server 1 Backup Server 2 LV1 LV1 bkp LV1 bkp LV1 bkp LV1 bkp LV1 bkp1 LV1 bkp2 LV2 LV1 bkp LV1 bkp LV1 bkp LV1 bkp LV2 bkp1 LV2 bkp2 … … …
  11. 11. Log Transfer Primary Server Backup Server 1 Backup Server 2 Log 1 Log 1 (lzoped) Log 1 (lzoped) Log 2 Log 2 (lzoped) Log 3 Log 4• Each log is compressed with lzop and transferred via ssh/rsh.• Proxy may be useful for pipelined transfer to multiple sites.• Delete original log in primary server after all backup servers have a replica.
  12. 12. Consistent Full Backup with WalB Primary Server Backup Server (A) WalB Device Full Archive (online) (inconsistent) Apply (C) (B) Log LogStart Get log from Get consistentfull backup t0 to t1 Image at t1 t0 t1 t2 Time (A) (B) (C)
  13. 13. Read/Write Algorithm (1)Request Queue Write •Generate logpack header •Generate request for log device Read •Issue the request Wait completion•Generate and issue to the data device •Generate request for data device Wait completion •Issue the request•Send completion for upper layer Wait completion •Send completion for upper layer •Update written_lsid Simple Algorithm •Free logpack header buffer
  14. 14. Read/Write Algorithm (2)Request Queue Write •Generate logpack header •Allocate data buffer and copy data for Read data device write •Generate request for log device •Issue the request•Search PendingTree Wait completion If all data are in the tree Then copy data and send completion •Insert to PendingTreefor upper layer •Send completion for upper layer Else generate request for lack data and •Generate request for data deviceissue the request to the data device •Issue the request Wait completion Wait completion•If all requests have finished •Delete from PendingTreeThen send completion for upper layer •Free data buffer for data device write •Update written_lsidFaster but more complex than (1) •Free logpack header buffer
  15. 15. Parallel Task Processing Request Queue • Data write must start after log write completion • Send completion for write must be serialized Select Generate read/write logpack Logpack Logpack Datapack Generate Queue Completion Completion read request Queue Queue Write Write Wait completion, Read Submit logpack logpack Generate datapack Queue logpack Submit read request, Datapack Wait completion, Queue Wait completion, Send completion Update written_lsid, Send completion Write WriteThis is of algorithm (1) Write logpack logpack datapack
  16. 16. WalB Data Format• Data device – The same image as wrapping block device• Log device – Overview – Snapshot metadata – Ring buffer – Logpack
  17. 17. Log Device Format Address Snapshot Ring buffer metadata • Snapshot metadata Superblock – The size is determined at device creation. (SECTOR_SIZE) • Ring buffer – Stores write-ahead log.Reserved (PAGE_SIZE) – The size is determined at device creation. – The size can not be changed.PAGE_SIZE = 4096 bytesSECTOR_SIZE = 512 or 4096 bytes
  18. 18. Snapshot MetadataSnapshotMetadata … Snapshot Header • Snapshot header contains checksum andSnapshot u32 u32 allocation bitmap Sector checksum bitmap • Lsid: Log sequence id. • Snapshot record size is 80 bytes. (6 records in 512 byte Snapshot sector) Record u64 u64 u8[64](fixed size) lsid timestamp name
  19. 19. Ring BufferRing buffer start_offset Log pack with oldest_lsid Logpack 1st 2nd Log pack Header written data written data … SECTOR_SIZE
  20. 20. Logpack HeaderLogpack Header u32 u16 u16 checksum # of IO Total IO size 1st 2nd 3rd … Log Record u64 lsid; /* Log sequence id */ u64 offset; /* IO offset by the sector. */ u16 lsid_local; /* local sequence id as the data offset in the log record. */Log Record u16 size; /* IO size by the sector. */ u16 is_exist; /* 0 if this record does not exist. */ u16 reserved1;
  21. 21. Pros and Cons WalB Snapshot Snapshot (redo log) with redo log with undo logRead No overhead Index search Bitmap search Write twice (1) BitmapWrite Index modification search/modification No overhead (2) and old data copyTypical --- ZFS, BtrFS LVM Soft. WalB + Index ~= Block device with snapshot management
  22. 22. Current Progress• Survey and study linux kernel programming• Design of rough architecture• Prototype implementation• Basic evaluation and redesign if required• Implementation of full functionalities and test• Operation inside Cybozu• Publication as GPLv2• Merging to device-mapper if required• Merging to main repository (hopefully)
  23. 23. Summary• Motivaion – No good backup solution covering various applications• WalB – Is a block device driver with WAL – Provides efficient incremental backup

×