Vmbkp: An Online Backup Tool for VMware vSphere. Oct 15, 2010. HOSHINO Takashi, Cybozu Labs, Inc.

Vmbkp: VMware vSphere Incremental Backup Tool

Source code repository.
https://github.com/starpos/vmbkp

Transcript of "Vmbkp: VMware vSphere Incremental Backup Tool"

  1. Vmbkp: An Online Backup Tool for VMware vSphere
     Oct 15, 2010
     HOSHINO Takashi
     Cybozu Labs, Inc.
  2. What is Vmbkp?
     • Backup software for virtual machines in a VMware vSphere environment
       – Online full/differential/incremental backup
       – Multi-generation backup management
       – Efficient archive access with sequential I/O and reverse diffs
       – Command-line interface for scheduling with cron
  3. Supported platform
     • VMware vSphere 4
       – vCenter Server managing several ESX(i) hosts
       – Single ESX(i) host (not tested)
       – Free ESXi is not supported (snapshot fails)
     • Backup server
       – Linux on an x86_64 host
       – Confirmed on CentOS 5.5 64-bit
  4. Hardware Architecture
     [Diagram: the Vmbkp server, the VMware vSphere vCenter Server, and the VMware ESX(i) hosts running the VMs are connected by a LAN; VM storage and backup storage are attached to a SAN.]
     • Control/GetInfo: vSphere SOAP protocol
     • Data transfer: via SAN with the VDDK protocol
     • You can use NBD transfer without a SAN.
  5. Commands
     • Update:
       – Get and save information about all available VMs
     • Backup:
       – Back up the specified VM or group, or all VMs
     • Restore:
       – Restore the specified archived generation as a new VM
     • Check:
       – Check that backup archives are valid
     • Status:
       – Show the status of backup archives
  6. Commands (cont.)
     • Destroy:
       – Remove a virtual machine from the vSphere environment
     • Clean:
       – Delete archives of virtual machines
     • List:
       – Get a list of virtual machines satisfying the specified conditions
     • Help:
       – Show usage
  7. Workflow
     [Diagram: two flows, with each step marked as either a user task or a Vmbkp task. Steps in parentheses are optional.]
     • Backup
       – Prepare config
       – (Register to cron)
       – Read config/profiles
       – Get vSphere information
       – Backup target VMs
       – Export ovf (without disks)
       – Create snapshot
       – (Get changed block info)
       – Backup vmdk files
       – Delete snapshot
       – (Delete previous dump)
       – Update profiles
     • Restore
       – Prepare config
       – Read config/profiles
       – Restore target VMs
       – Import ovf
       – Add disks to new VM
       – Restore vmdk files
  8. Configuration files
     • Global (required)
       – Global configuration
         • Backup directory
         • Number of generations to keep
         • Path of the vmdkbkp tool used to back up/restore vmdk files
         • vSphere authentication information
     • Group (optional)
       – Group configuration for convenient use
  9. Layout of Archive Files
     • <backup dir>
       – AllVM profile
     • <backup dir>/<vm>/
       – VM profile
     • <backup dir>/<vm>/<generation>/
       – Generation profile
       – Ovf file for the VM configuration
       – Dump/digest/rdiff/bmp files for each vmdk
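The directory layout above can be sketched as simple path arithmetic. This is a minimal illustration, not Vmbkp's code: the function names and the per-disk file naming (`<disk index>.<kind>`) are assumptions, since the slide only says which kinds of files live in each directory.

```python
from pathlib import PurePosixPath

# File kinds the slide lists for each vmdk in a generation directory.
ARCHIVE_KINDS = ("dump", "digest", "rdiff", "bmp")


def generation_dir(backup_dir, vm, generation):
    """Return <backup dir>/<vm>/<generation> as a path object."""
    return PurePosixPath(backup_dir) / vm / str(generation)


def archive_paths(backup_dir, vm, generation, disk_index):
    """Map each archive kind to a file path for one vmdk.

    The "<disk_index>.<kind>" file naming is hypothetical.
    """
    base = generation_dir(backup_dir, vm, generation)
    return {kind: base / f"{disk_index}.{kind}" for kind in ARCHIVE_KINDS}
```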
  10. Profiles
      • Allvm
        – Information/status of all VMs in the target vSphere environment
        – Updated by the update command
      • Vm
        – Information/status of the archives of a VM
        – Created/updated by the backup command and read by the restore command
      • Generation
        – Information/status of each backup generation of a VM
        – Created by the backup command and read by the restore command
  11. Software Architecture
      [Diagram: layered components, from top to bottom.]
      • Cron / User
      • Command-line Interface
      • Backup/Restore Controller
      • Utility Library, Soap Wrapper, Vmdkbkp Wrapper
        – Labels shown inside these components: Snapshot, Ovf, Bitmap, Changed blocks, XML (Ovf), Config/Profile
      • Vmdkbkp: Vmdk Backup/Restore Tool/Library (C++)
      • VI Java Library, VDDK C Library
      • VMware vSphere vCenter Server, VMware ESX(i) Host, SAN Storage
  12. Required Tools and Libraries
      • Java SE 1.6
        – java, javac, and jar commands
      • VI-Java 2.1GA
        – SOAP wrapper
      • G++ 4.4
      • Boost 1.43
        – shared_ptr, scoped_array, thread, and iostreams
      • VDDK 1.2.0
        – Virtual Disk Development Kit by VMware
  13. Source Code Overview (Java)
      • control/*
        – Command-line I/F
        – Backup/restore controller
        – Vmdkbkp wrapper
      • soap/*
        – Soap (VI-Java) wrapper
      • utility/*
        – Utilities for Ovf, Bitmap, command line, etc.
      • config/*
        – Config/profile parser and accessor
      • profile/*
        – Semantic-level config/profile managers
  14. VmdkBkp (C++ code)
  15. What is VmdkBkp?
      • Online backup software for remote/local vmdk files in VMware vSphere environments
        – Currently supports vSphere version 4
      • Written in C++
      • Uses the VDDK library by VMware
      • Used by the Vmbkp (Java) tool
  16. Archive Files
      • Dump/Rdiff
        – Archive of VMDK metadata and blocks, without zero-blocks
        – Dump is a full archive; Rdiff is a reverse differential one
        – Dump + Rdiff = Previous dump
      • Digest
        – MD5 digest data for all blocks of the VMDK
        – Used to check equality of blocks and to validate the corresponding dump/rdiff files
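A minimal sketch of the dump and digest ideas, assuming an in-memory model: a dump is a mapping from block index to block data that omits zero-blocks, and a digest is one MD5 value per block. The function names, the dictionary representation, and the tiny block size in the usage note are illustrative assumptions, not the vmdkbkp file format.

```python
import hashlib


def split_blocks(data, block_size):
    """Split a disk image into fixed-size blocks."""
    if len(data) % block_size:
        raise ValueError("disk size must be a multiple of the block size")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def make_dump(blocks):
    """Keep only non-zero blocks, keyed by block index."""
    return {i: b for i, b in enumerate(blocks) if any(b)}


def make_digest(blocks):
    """Compute one MD5 digest per block, zero-blocks included."""
    return [hashlib.md5(b).digest() for b in blocks]


def check_dump(dump, digest, block_size):
    """Validate a dump against a digest; absent blocks count as zero-blocks."""
    zero = bytes(block_size)
    return all(
        hashlib.md5(dump.get(i, zero)).digest() == expected
        for i, expected in enumerate(digest)
    )
```

For example, with a block size of 4, the image `b"abcd" + bytes(4) + b"efgh"` produces a dump holding only blocks 0 and 2, while the digest still has three entries, so the zero-block can be verified without being stored.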
  17. Supported Commands
      • Dump
        – Execute a full/differential/incremental dump
      • Restore
        – Execute a restore from dump/rdiff files
      • Check
        – Validate dump/rdiff files against digest data
      • Print
        – Print dump/rdiff/digest contents in human-readable form
      • Digest
        – Make a digest from a dump
      • Merge
        – Make a past dump from the current dump and past rdiff(s)
  18. How to Backup Remote Vmdk
      • Command line:
        – vmdkbkp dump [connect options] --mode [full/diff/incr]
            --vm [vm moref] --snapshot [snapshot moref]
            --remote [disk path]
            --dumpin [previous dump] --dumpout [current dump]
            --digestin [previous digest] --digestout [current digest]
            --bmpin [changed block bitmap]
            --rdiffout [current-previous rdiff]
      • Inputs/Outputs:
        – Full: only --dumpout and --digestout are required
        – Diff: all options except --bmpin are required
        – Incr: all options are required
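The per-mode requirements above can be sketched as a small argument builder. It only assembles an argument list and does not run the tool. The helper name and its parameters are hypothetical, and reading "all options" as the six input/output file options is an assumption; only the option names themselves come from the slide.

```python
# Input/output file options listed on the slide, in slide order.
FILE_OPTIONS = ("dumpin", "dumpout", "digestin", "digestout", "bmpin", "rdiffout")

# Which file options each mode requires, following the slide.
REQUIRED = {
    "full": {"dumpout", "digestout"},
    "diff": set(FILE_OPTIONS) - {"bmpin"},
    "incr": set(FILE_OPTIONS),
}


def build_dump_command(mode, vm, snapshot, remote, files, connect_options=()):
    """Build an argument list for `vmdkbkp dump` (hypothetical helper)."""
    if mode not in REQUIRED:
        raise ValueError(f"unknown mode: {mode}")
    missing = REQUIRED[mode] - set(files)
    if missing:
        raise ValueError(f"{mode} mode is missing: {sorted(missing)}")
    argv = ["vmdkbkp", "dump", *connect_options,
            "--mode", mode, "--vm", vm,
            "--snapshot", snapshot, "--remote", remote]
    # Emit only the file options that the chosen mode requires.
    for name in FILE_OPTIONS:
        if name in REQUIRED[mode]:
            argv += ["--" + name, files[name]]
    return argv
```

The resulting list could be handed to `subprocess.run`; the connect options are left as an opaque sequence because the slide does not spell them out.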
  19. Full Backup
      [Diagram: the Vmbkp tool reads the VM configuration and all blocks of the virtual disk (vmdk), and writes the non-zero blocks to the backup files: Dump, Ovf, Digest.]
      • Ovf
        – VM configuration data (without disk information)
      • Dump
        – Full data of the vmdk (without zero-blocks)
      • Digest
        – Digest data of all blocks
  20. Differential Backup
      [Diagram: the Vmbkp tool reads the VM configuration and all blocks of the virtual disk (vmdk), and writes the non-zero blocks. Backup files of the previous generation: Dump, Ovf, Digest. Backup files of the current generation: Dump’, Rdiff’, Ovf’, Digest’.]
      • Rdiff
        – Reverse difference data of the vmdk
        – Dump’ + Rdiff’ = Dump
      • You can delete the dump of the previous generation after the current backup
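The relation Dump’ + Rdiff’ = Dump can be illustrated with a short sketch. It assumes both generations are equal-length lists of blocks and represents the rdiff as a mapping from block index to the previous block's content; the names and representation are illustrative, not the vmdkbkp format.

```python
def make_rdiff(prev_blocks, curr_blocks):
    """Record the previous content of every block that changed."""
    if len(prev_blocks) != len(curr_blocks):
        raise ValueError("both generations must have the same block count")
    return {
        i: prev
        for i, (prev, curr) in enumerate(zip(prev_blocks, curr_blocks))
        if prev != curr
    }


def merge(curr_blocks, rdiff):
    """Rebuild the previous generation: current dump + rdiff = previous dump."""
    return [rdiff.get(i, block) for i, block in enumerate(curr_blocks)]
```

Because the rdiff alone is enough to step back from the current dump, only the newest generation needs a full dump, which is why the previous dump can be deleted after a successful backup.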
  21. Incremental Backup
      [Diagram: the Vmbkp tool reads the VM configuration, the changed block information, and only the changed blocks of the virtual disk (vmdk), and writes the non-zero blocks. Backup files of the previous generation: Dump, Ovf, Digest. Backup files of the current generation: Dump’, Rdiff’, Ovf’, Digest’.]
      • Changed Block Information
        – The set of addresses of the blocks changed since the previous backup
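A minimal sketch of why changed block information helps: only the blocks flagged as changed are read from the disk, and everything else is carried over from the previous dump. The function name, the `read_block` callback, and the list-of-booleans bitmap are assumptions made for illustration.

```python
def incremental_backup(prev_blocks, read_block, changed):
    """Produce the current dump and its rdiff, reading only changed blocks.

    prev_blocks: blocks of the previous dump
    read_block:  callable returning the current content of block i
    changed:     one boolean per block (the changed block bitmap)
    """
    curr_blocks = list(prev_blocks)
    rdiff = {}
    for i, is_changed in enumerate(changed):
        if not is_changed:
            continue  # unchanged blocks are never read from the disk
        new_block = read_block(i)
        if new_block != prev_blocks[i]:
            rdiff[i] = prev_blocks[i]
            curr_blocks[i] = new_block
    return curr_blocks, rdiff
```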
  22. Vmdk Archives Relationships
      [Diagram: writing some data on the 1st VM turns 0.vmdk into 1.vmdk. A full dump of 0.vmdk produces 0.dump and 0.digest. Full, diff, and incr dumps of 1.vmdk produce 1.dump, 1.digest, and 1-0.rdiff. rdiff2bmp produces 1.bitmap from 1-0.rdiff.]
      • Check that the dump/digest files produced by all possible paths are the same, using check_dump_and_dump and check_digest_and_digest.
  23. Vmdk Archives Relationships (cont.)
      [Diagram: writing some data on the 1st VM turns 0.vmdk into 1.vmdk. Restore, Merge, and Digest operations connect 0.dump, 0.digest, 1.dump, 1.digest, and 1-0.rdiff.]
      • Restore → full dump 0.vmdk to 0r.dump → check that 0.dump and 0r.dump are the same.
      • Merge 1.dump and 1-0.rdiff to 0m.dump → digest 0m.dump to 0m.digest → check that 0.{dump,digest} and 0m.{dump,digest} are the same.
  24. Software Architecture of vmdkbkp
      [Diagram: Command (the command executor) sits on top of the specific components (Util, Header, Manager), which sit on top of the general components (Exception, Serialize, Bitmap).]
      • Command
        – Parse the command line and execute it
      • Util
        – Configuration, time, etc.
      • Header
        – Manage header/blocks of dump/rdiff/digest files
      • Exception
        – Exceptions and related macros
      • Manager
        – Manage (1) the VDDK connection, (2) vmdk file access, and (3) dump/rdiff/digest file access
      • Serialize
        – StringMap/Integers data serializer
      • Bitmap
        – Bitmap data serializer
  25. VDDK Control with Fork
      • Solves the problem that re-initializing VDDK for SAN transfer inevitably fails with a SCSI reservation conflict error and falls back to NBD transfer.
  26. VDDK Control with Fork (cont.)
      • Main process
        – VddkController: provides the same interface as the Vddk/Vmdk managers
        – VddkWorker (parent): manages processes and communicates with the child
      • Forked process
        – VddkWorker (child): wraps the Vddk/Vmdk managers and communicates with the parent
        – VddkManager, VmdkManager
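The controller/worker split can be sketched with `os.fork` and two pipes (POSIX only). The idea is that the library which cannot be re-initialized lives only in the child, so a fresh initialization is obtained by starting a new child. `FakeDiskManager` is a hypothetical stand-in for the Vddk/Vmdk managers, and the message format and class names are assumptions; this is not vmdkbkp's implementation.

```python
import os
import pickle
import struct


def _send(fd, obj):
    """Write one length-prefixed pickled message to a pipe."""
    payload = pickle.dumps(obj)
    os.write(fd, struct.pack("!I", len(payload)) + payload)


def _read_exact(fd, size):
    """Read exactly `size` bytes or raise EOFError if the peer is gone."""
    buf = b""
    while len(buf) < size:
        chunk = os.read(fd, size - len(buf))
        if not chunk:
            raise EOFError("peer closed the pipe")
        buf += chunk
    return buf


def _recv(fd):
    (size,) = struct.unpack("!I", _read_exact(fd, 4))
    return pickle.loads(_read_exact(fd, size))


class FakeDiskManager:
    """Hypothetical stand-in for the Vddk/Vmdk manager pair."""

    def __init__(self, blocks):
        self.blocks = blocks

    def read_block(self, index):
        return self.blocks[index]


class ForkedController:
    """Offers read_block() like the manager, but runs it in a child process."""

    def __init__(self, blocks):
        self._blocks = blocks
        self.start()

    def start(self):
        req_r, req_w = os.pipe()
        res_r, res_w = os.pipe()
        pid = os.fork()
        if pid == 0:  # child: owns the manager and serves requests
            os.close(req_w)
            os.close(res_r)
            try:
                manager = FakeDiskManager(self._blocks)
                while True:
                    index = _recv(req_r)
                    if index is None:  # shutdown request
                        break
                    _send(res_w, manager.read_block(index))
            finally:
                os._exit(0)  # never return into the parent's code
        os.close(req_r)
        os.close(res_w)
        self._pid, self._req, self._res = pid, req_w, res_r

    def read_block(self, index):
        _send(self._req, index)
        return _recv(self._res)

    def stop(self):
        _send(self._req, None)
        os.close(self._req)
        os.close(self._res)
        os.waitpid(self._pid, 0)

    def restart(self):
        """Replace the child so initialization happens in a fresh process."""
        self.stop()
        self.start()
```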
  27. Multi-threaded Archive Manager
      • Improves performance of dump/restore/check/merge operations on gzipped multi-stream archives
      • Archive Managers
        – Interface for archive access, specialized for each command
      • Archive IO Managers
        – Multi-threaded/single-threaded stream access for each archive file
      • DataReader, DataWriter
        – Worker thread and its controller for gzip compression/decompression
      • Queue
        – Thread-safe FIFO
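A minimal sketch of the worker-thread-plus-FIFO pattern named above, using Python's standard `threading`, `queue`, and `gzip` modules. One worker compresses chunks and hands them to the consumer through a bounded thread-safe queue, so compression can overlap with whatever the consumer does (for example, writing to disk). The function name, the chunking, and the queue size are assumptions.

```python
import gzip
import queue
import threading

_END = object()  # sentinel marking the end of the stream


def compress_chunks(chunks, queue_size=4):
    """Yield gzip-compressed chunks produced by a worker thread, in order."""
    fifo = queue.Queue(maxsize=queue_size)  # thread-safe bounded FIFO

    def worker():
        try:
            for chunk in chunks:
                fifo.put(gzip.compress(chunk))
        finally:
            fifo.put(_END)  # always unblock the consumer

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    while True:
        item = fifo.get()
        if item is _END:
            break
        yield item
    thread.join()
```

Running one such reader or writer per archive file would give the multi-stream behaviour described on the slide; the bounded queue keeps memory use predictable when the consumer is slower than the compressor.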
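  28. Restore/Check with MultiArchiveManager
      [Diagram: with the Archive Manager, one of the full dump and rdiff streams is accessed while the others are waiting; with the Multi Archive Manager, the full dump and the rdiff streams are accessed together.]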
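One way to read the diagram is that a past generation can be restored in a single sequential pass over the full dump and all rdiffs at once. The sketch below illustrates that idea under stated assumptions: the full dump is a list of blocks, each rdiff maps a block index to the older content of that block, and the rdiffs are ordered newest first. The slide does not show the actual algorithm, so this is an interpretation rather than the MultiArchiveManager's implementation.

```python
def restore_generation(full_dump, rdiffs):
    """Yield the blocks of the generation reached by applying all rdiffs.

    full_dump: blocks of the newest generation
    rdiffs:    reverse diffs ordered newest first; rdiffs[-1] leads to
               the oldest (target) generation
    """
    for index, block in enumerate(full_dump):
        # The oldest rdiff containing this block holds the target content.
        for rdiff in reversed(rdiffs):
            if index in rdiff:
                block = rdiff[index]
                break
        yield block
```

With generations 0, 1, and 2, a dump of generation 2, and rdiffs 2→1 and 1→0, calling `restore_generation(dump2, [rdiff_2_1, rdiff_1_0])` yields generation 0 without materializing generation 1 first.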
  29. Restore with SAN
      • Problems in restore with SAN
        – Auto-allocation fails for thin vmdk
        – Auto-allocation is too slow for thick vmdk
        – There is no efficient allocation API
      • If restoring zero-blocks over NBD were faster, it could be used as the allocation method, but it is not fast
  30. Future Work
      • Improve parallelism
        – Solve the SCSI reservation conflict problem
        – Multi-threaded compression
      • Restore with SAN
        – Depends on an efficient block allocation API in VDDK