HBaseCon 2013: Apache HBase Table Snapshots

Presented by: Jonathan Hsieh (Cloudera), Matteo Bertozzi (Cloudera), and Jesse Yates (Salesforce.com)

  • Talk about everything! Don’t glaze over internals
  • Why does it cause extra latency? What “crushes” the cluster?
  • Causes more expensive backups. There are a bunch of ‘write-optimized files’ (HLogs) that have to be converted into ‘read-optimized files’ (HFiles). This isn’t a cheap process.
  • Don’t over sell! Just say what it is.
  • Add a quick summary of what just talked about BEFORE handoff!!!

Transcript

  • 1. Apache HBase Table Snapshots Matteo Bertozzi | Cloudera | Software Engineer / HBase Committer Jonathan Hsieh | Cloudera | Software Engineer / HBase Committer Jesse Yates | Salesforce.com | Software Engineer / HBase Committer June 13, 2013 HBaseCon 2013
  • 2. Outline • Intro and Use Cases • Usage Instructions • Internals • Snapshot Layout • Snapshot Restoration • Online Snapshots • Conclusion
  • 3. HBase Table Snapshots • A snapshot is a collection of metadata required to reconstitute the data near a particular point in time. HBaseCon 2013, 6/13/2013
  • 4. HBase Table Snapshots • An inexpensive way to freeze the state of a table • A mechanism that helps back up data within the cluster or to a remote cluster • Recover from user error • Bootstrap replication
  • 5. Old: HBase-Supported Batch Backups • Export / DistCp / Import • 3 batch MR jobs • Several extra copies of data • High latency (hours) • Impacts existing low-latency workloads • CopyTable • 1 MR job • Single copy of data • Incremental table copies • High latency (hours) • Impacts existing workloads [Diagram: Export, DistCp, and Import MR jobs; CopyTable MR job]
  • 6. Upcoming: HDFS Snapshots (or DistCp backup) • Take an HDFS snapshot of all the files in the underlying HBase data directory • HFiles, HLogs, and other metadata • Snapshots all tables in HBase • Cannot clone tables • “Restore As” • Targeted for Hadoop 2.1 / Hadoop 3.0 [Diagram: DistCp; HLog; append, flush, compact, restart, recover]
  • 7. New: HBase Snapshot-based Backups • Snapshot, then Export • 1 MR job • Single copy of data • Little impact on low-latency workloads • Export is like DistCp directly from HDFS • No incremental snapshot copy [Diagram: Export Snapshot]
  • 8. Export • Like DistCp for a snapshot manifest • Copies data files without going through HBase’s “front door” [Diagram: Export]
  • 9. Recover from User Error • How do we recover from user error? [Timeline diagram: user error (drop ‘table’); service is down (“Panic!”); recovery time (“Black magic!”); service is restored with major data loss]
  • 10. Recovering from User Mistakes: Table Snapshots • Snapshot the state of a table at a certain moment in time • Restore it or clone it later, creating a new read-write table • Export it to another cluster with minimal impact on HBase [Timeline diagram: periodic snapshots; user error (drop ‘table’); service is down (“Keep calm!”); restore; service restored with minor data loss (“Carry on.”)]
  • 11. Usage What an Admin needs to know
  • 12. Configuration • Simple hbase-site.xml configuration: <property> <name>hbase.snapshot.enabled</name> <value>true</value> </property> • Enabled by default in 0.95+ • Must be enabled by the user in 0.94.6.1+
  • 13. Usage: Shell Commands • snapshot 'table', 'snapshot' • Table can be offline or online • list_snapshots [<regex>] • clone_snapshot 'snapshot', 'dsttable' • restore_snapshot 'snapshot' • delete_snapshot 'snapshot'
  • 14. Usage: Web UI [screenshot]
  • 15. Usage: Web UI [screenshot]
  • 16. Export: Usage • Copy “MySnapshot” to a remote HDFS • $ hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to hdfs://srv2:8082/hbase -mappers 16 • With a permission change on the copy • $ hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to hdfs://srv2:8082/hbase -chuser MyUser -chgroup MyGroup -chmod 700 -mappers 16
  • 17. Debugging and Info • Dump a snapshot manifest • Writes to standard out • Usage • $ hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot test-snapshot • $ hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot test-snapshot -files
  • 18. Metrics • Histograms of operation completion • snapshotTime • cloneTime • restoreTime • Includes ‘extended’ metrics • Std deviation • Min/max
  • 19. Table Snapshot Internals
  • 20. Internals • HBase table HDFS layout • Snapshot HDFS layout • Offline snapshots • Restore and clone snapshot • Online snapshots
  • 21. Primer: HBase Table Layout in HDFS • HRegions map directly to a directory structure with table name, encoded region name, column family, and HFiles. • In HDFS: /hbase/Table/<enc R1>/cf/<hfile f11> /hbase/Table/<enc R1>/cf/<hfile f12> /hbase/Table/<enc R2>/cf/<hfile f21> /hbase/Table/<enc R2>/cf/<hfile f22> /hbase/Table/<enc R3>/cf/<hfile f31> /hbase/Table/<enc R3>/cf/<hfile f32> [Diagram: Table (R1, R2, R3; F11, F21, F31)]
  • 22. Table Snapshots in the File System • A snapshot manifest contains references to files in the original table. [Diagram: Table (R1, R2, R3; F11, F21, F31); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3)]
  • 23. Table Snapshots in the File System • A snapshot manifest contains references to files in the original table. • Each snapshot is stored in the hbase/.hbase-snapshots dir. [Diagram: Table (R1, R2, R3; F11, F21, F31); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3)]
  • 24. Offline Snapshots • Disable the table, then create the snapshot manifest • Created in a temporary dir to guarantee snapshot creation atomicity • Includes • Snapshot metadata • Table metadata/schema (.tableinfo) • References to original HFiles • Master-only file system operation
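
The temporary-directory step on this slide can be illustrated with a small local-filesystem sketch. This is a toy model, not HBase code: the function name `write_snapshot_manifest`, the `.tmp` working directory, and the JSON manifest format are all assumptions made for illustration. Only the idea (build the manifest somewhere temporary, then publish it in one step) comes from the slide.

```python
import json
import os
import tempfile


def write_snapshot_manifest(snapshot_root, name, table_schema, hfile_refs):
    """Build a manifest in a working dir, then publish it with one rename.

    Hypothetical layout: <snapshot_root>/.tmp/<name> is the working dir and
    <snapshot_root>/<name> is the completed snapshot.
    """
    final_dir = os.path.join(snapshot_root, name)
    if os.path.exists(final_dir):
        raise FileExistsError(f"snapshot {name!r} already exists")

    tmp_root = os.path.join(snapshot_root, ".tmp")
    os.makedirs(tmp_root, exist_ok=True)
    work_dir = os.path.join(tmp_root, name)
    os.makedirs(work_dir)

    # Snapshot metadata, table schema, and references to the original HFiles.
    with open(os.path.join(work_dir, "manifest.json"), "w") as f:
        json.dump({"name": name, "schema": table_schema, "refs": hfile_refs}, f)

    # A rename within one filesystem is a single step: readers see either
    # no snapshot or a complete one, never a half-written directory.
    os.rename(work_dir, final_dir)
    return final_dir


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    path = write_snapshot_manifest(
        root,
        "TableSnapshot",
        {"families": ["cf"]},
        {"R1": ["f11"], "R2": ["f21"], "R3": ["f31"]},
    )
    print(sorted(os.listdir(path)))
```

If the process dies before the rename, only the working directory is left behind, so a failed attempt never looks like a finished snapshot.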
  • 25. HFile Life Cycle • Splits and compactions remove HFiles • What happens to references to these files? [Diagram: Table (R1, R2, R3; F11, F21, F31); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3)]
  • 26. HFile Life Cycle • Splits and compactions remove HFiles • What happens to references to these files? [Diagram: Table (R1, R2, R3; F11, F21, F31); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3)]
  • 27. HFile Life Cycle • Splits and compactions remove HFiles • What happens to references to these files? [Diagram: Table (R1, R2, R3; F11, F21, F31+32); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); “No more HFile??”]
  • 28. HFile Archiver • We archive old HFiles from compactions (HBASE-5547) [Diagram: Table (R1, R2, R3; F11, F21); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31)]
  • 29. HFile Archiver • We archive old HFiles from compactions (HBASE-5547) • Files stored in hbase/.archive [Diagram: Table (R1, R2, R3; F11, F21, F31+32); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31)]
  • 30. HFile Archiver • We archive old HFiles from compactions (HBASE-5547) • Files stored in hbase/.archive • HFileCleaner ensures HFiles’ data remains available [Diagram: Table (R1, R2, R3; F11, F21, F31+32); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31)]
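
The interplay between compaction, the archive, and the cleaner can be sketched with plain Python sets. This is a toy model under stated assumptions, not the HFileArchiver or HFileCleaner implementation: the helper names `compact` and `clean_archive`, and the representation of a manifest as a dict of region to file names, are hypothetical.

```python
def compact(table_files, archive, region, new_file):
    """Replace a region's HFiles with one compacted file.

    The old files are moved to the archive instead of being deleted.
    """
    archive.update(table_files[region])
    table_files[region] = [new_file]


def clean_archive(archive, snapshots):
    """Delete archived files that no snapshot manifest still references."""
    referenced = {
        hfile
        for manifest in snapshots.values()
        for hfiles in manifest.values()
        for hfile in hfiles
    }
    removed = archive - referenced
    archive.difference_update(removed)
    return removed


if __name__ == "__main__":
    table = {"R1": ["F11"], "R2": ["F21"], "R3": ["F31"]}
    archive = set()
    # The manifest only records which files the table had at snapshot time.
    snapshots = {"TableSnapshot": {r: list(fs) for r, fs in table.items()}}

    table["R3"].append("F32")                # a later flush adds F32
    compact(table, archive, "R3", "F31+32")  # F31 and F32 are archived

    print(sorted(clean_archive(archive, snapshots)))  # F32 is unreferenced
    print(sorted(archive))                            # F31 is kept

    del snapshots["TableSnapshot"]
    print(sorted(clean_archive(archive, snapshots)))  # now F31 can go
```

The point of the sketch is that deleting a file becomes a two-stage decision: compaction only moves files, and the cleaner is the single place that checks snapshot references before anything is removed.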
  • 31. Restore Snapshot Internals
  • 32. Restore Operations • Restore table • Roll back a table to a specific state • Clone from snapshot (Restore As) • Create a new read-write table from a snapshot • There can be multiple replicas of a snapshot • Export snapshot • Send a snapshot and all its data to another cluster
  • 33. Clone: Creating a Table from a Snapshot • Convert snapshot manifest info into a table. [Diagram: Table (R1, R2, R3; F11, F21); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31); Clone (R1, R2, R3)]
  • 34. Clone: Creating a Table from a Snapshot • Convert snapshot manifest info into a table. [Diagram: Table (R1, R2, R3; F11, F21); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31); Clone (R1, R2, R3)]
  • 35. Clone: Creating a Table from a Snapshot • Convert snapshot manifest info into a table. • HFileLinks (HBASE-6610) mimic Unix open-file-descriptor semantics [Diagram: Table (R1, R2, R3; F11, F21, F31+32); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31); Clone (R1, R2, R3)]
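
One way to picture the "open file descriptor" behaviour is a link that names a file by identity and is resolved against several possible locations. The sketch below is hypothetical: `HFileLinkSketch`, `resolve`, the two-location search order, and the assumption that the archive mirrors the table's directory layout are illustrative choices, not the actual HFileLink (HBASE-6610) implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HFileLinkSketch:
    """Names an HFile by table/region/family/file rather than by one path."""

    table: str
    region: str
    family: str
    hfile: str

    def candidate_paths(self):
        rel = f"{self.table}/{self.region}/{self.family}/{self.hfile}"
        # Assumed search order: the live table directory, then the archive.
        return [f"/hbase/{rel}", f"/hbase/.archive/{rel}"]


def resolve(link, existing_paths):
    """Return the first candidate location at which the file exists."""
    for path in link.candidate_paths():
        if path in existing_paths:
            return path
    raise FileNotFoundError(f"no location holds {link.hfile}")


if __name__ == "__main__":
    link = HFileLinkSketch("Table", "R3", "cf", "F31")

    fs = {"/hbase/Table/R3/cf/F31"}
    print(resolve(link, fs))

    # A compaction archives F31; the same link still resolves.
    fs = {"/hbase/Table/R3/cf/F31+32", "/hbase/.archive/Table/R3/cf/F31"}
    print(resolve(link, fs))
```

As with a Unix file descriptor that stays valid after the file is unlinked, the clone keeps reading the same data even though the source table no longer lists that file.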
  • 36. Restore: Roll Back to an Old State • Roll back the existing table to the snapshot state • Restores the original schema if altered • Snapshots the current table, just in case • Minimal overhead • A smarter “delete table, then clone snapshot” • Handles creating/deleting regions • Restores META
  • 37. Restore Illustrated • Roll back “Table” to the “TableSnapshot” state [Diagram: Table (R1, R2, R3, R4; F11, F21, F31+32, F41); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31)]
  • 38. Restore Illustrated • Region “R4” is not present in the snapshot • “R4” will be removed from “Table”, and its files moved to “.archive” [Diagram: Table (R1, R2, R3, R4; F11, F21, F31+32, F41); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31)]
  • 39. Restore Illustrated • New files not present in the snapshot are moved to the archive [Diagram: Table (R1, R2, R3; F11, F21, F31+32, F41); ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31)]
  • 40. Restore Illustrated • New files not present in the snapshot are moved to the archive • HFileLinks are created to point to old files. [Diagram: Table (R1, R2, R3; F11, F21); files F41 and F31+32; ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31)]
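
The restore walk-through on slides 37 to 40 amounts to a set difference between the current table and the snapshot manifest. The sketch below is a toy planner under stated assumptions: `plan_restore` and its result keys are hypothetical names, and tables are modelled as dicts of region to file names rather than real HDFS directories.

```python
def plan_restore(current, snapshot):
    """Compute what a restore has to do, without touching any files.

    Both arguments map a region name to the HFile names it holds.
    """
    plan = {
        # Regions that exist now but not in the snapshot are removed.
        "remove_regions": sorted(set(current) - set(snapshot)),
        # Regions in the snapshot that no longer exist are re-created.
        "create_regions": sorted(set(snapshot) - set(current)),
        "archive_files": {},
        "link_files": {},
    }
    for region, files in current.items():
        # Files the snapshot does not know about move to the archive.
        extra = set(files) - set(snapshot.get(region, ()))
        if extra:
            plan["archive_files"][region] = sorted(extra)
    for region, wanted in snapshot.items():
        # Files the snapshot expects but the table lacks get a link.
        missing = set(wanted) - set(current.get(region, ()))
        if missing:
            plan["link_files"][region] = sorted(missing)
    return plan


if __name__ == "__main__":
    current = {"R1": ["F11"], "R2": ["F21"], "R3": ["F31+32"], "R4": ["F41"]}
    snapshot = {"R1": ["F11"], "R2": ["F21"], "R3": ["F31"]}
    for step, detail in plan_restore(current, snapshot).items():
        print(step, detail)
```

With the state from the slides, the plan removes R4, archives F41 and F31+32, and links F31 back into R3, which is the sequence the diagrams show.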
  • 41. Restore Failures • The table to restore is disabled • META and HDFS operations may fail (network issue, server down, …) • hbck can’t repair an incomplete restore... • Restore again!
  • 42. Export Snapshot • Copy a full snapshot to another cluster • All required HFiles and metadata • Lots of options • Fancy DistCp • Must resolve HFileLinks • Faster than CopyTable or table export+import! • Minimal impact on the running cluster
  • 43. Online Snapshots
  • 44. Online Snapshots • Take a snapshot without making the table unavailable • No need to disable the table • Continue accepting reads and writes from clients • Challenges • Coordinating region servers • Data is in memory • Consistency
  • 45. Offline vs Online Snapshots [Diagram, two panels. Offline: master; write manifest per region; verify. Online: master and RS1, RS2, RS3, RS4; snapshot region subprocedure; write manifest per region; verify.]
  • 46. Online Snapshots • Each region can have data in the memstore and HLog that is not yet in an HFile • The snapshot is missing in-memory data! [Diagram: Table (R1, R2, R3; F11, F21), each region with a memstore; ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31)]
  • 47. Online Snapshots • Flush so that all in-memory data is written to an HFile • Then add it to the snapshot manifest [Diagram: Table (R1, R2, R3; F11, F21) with flushed files F13, F23, F33; ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31)]
  • 48. Online Snapshots • Flush so that all in-memory data is written to an HFile • Then add it to the snapshot manifest [Diagram: Table (R1, R2, R3; F11, F21) with flushed files F13, F23, F33; ./.hbase-snapshots: TableSnapshot manifest (R1, R2, R3); ./.archive: table files (F31)]
  • 49. Consistency • Offline snapshots • Fully consistent snapshot • Online flush snapshot • “CopyTable”-level consistency with a much smaller window • Time bounded by the slowest region server and region flush
  • 50. Online Snapshots and Causal Consistency • Causal consistency would only allow A, AB, or neither A nor B. • B and not A is currently possible [Diagram: Client, Master, RS1, RS2; Table (R1, R2, R3; F11, F21, F31) and TableSnapshot manifest (R1, R2, R3); memstores; “Flush SS” produces F13]
  • 51. Online Snapshots and Causal Consistency • Causal consistency would only allow A, AB, or neither A nor B. • B and not A is currently possible [Diagram: Client, Master, RS1, RS2; F13 already flushed; the client issues Put A … then Put B into memstores]
  • 52. Online Snapshots and Causal Consistency • Causal consistency would only allow A, AB, or neither A nor B. • B and not A is possible with flush snapshots [Diagram: Client, Master, RS1, RS2; “Flush SS” produces F23 and F33 alongside F13; Put B is in but Put A is not!]
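
The "B but not A" outcome follows from each region server flushing at a different moment. The sketch below replays one such interleaving with a toy region server. Everything here is hypothetical: `RegionServerSketch`, `run_scenario`, and the assumption that Put A lands on RS1 and Put B on RS2 are illustrative, chosen only to reproduce the ordering the slides describe.

```python
class RegionServerSketch:
    """Toy region server: writes land in a memstore until flushed."""

    def __init__(self, name):
        self.name = name
        self.memstore = []
        self.hfiles = []

    def put(self, key):
        self.memstore.append(key)

    def flush(self):
        if self.memstore:
            self.hfiles.append(tuple(self.memstore))
            self.memstore = []

    def snapshot_region(self):
        """Flush, then report every key that is now stored in an HFile."""
        self.flush()
        return [key for hfile in self.hfiles for key in hfile]


def run_scenario():
    rs1, rs2 = RegionServerSketch("RS1"), RegionServerSketch("RS2")
    manifest = {}

    # RS1 handles the snapshot request first, before any write arrives.
    manifest["RS1"] = rs1.snapshot_region()

    # The client writes A (to RS1), and only then B (to RS2).
    rs1.put("A")
    rs2.put("B")

    # RS2 handles the snapshot request late, so its flush includes B.
    manifest["RS2"] = rs2.snapshot_region()

    return {key for keys in manifest.values() for key in keys}


if __name__ == "__main__":
    captured = run_scenario()
    print("A" in captured, "B" in captured)
```

In this interleaving the snapshot contains B without A, even though the client wrote A first, because nothing forces the two flushes to share a cut-off point.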
  • 53. Online Snapshot Attempts Can Fail • If any involved RS fails, the snapshot attempt will fail. • Needs a way to prevent other table metadata operations • Table metadata locks (0.95+) • Avoid many snapshot failures caused by conflicts (e.g., online schema changes, splits) • A failed attempt will report errors; the user must retry. • o.a.h.hbase.snapshot.HBaseSnapshotException • o.a.h.hbase.snapshot.CorruptedSnapshotException
  • 54. Development Notes How we collaborated, built, and tested
  • 55. Table Snapshots Development • Developed in a branch off of trunk; merged, and now in 0.95 and trunk. • Feature is too big to include as a single patch • Does not destabilize trunk • Does not slow time-based release trains • Later backported to 0.94.6.1 [Diagram: src branch; sync; reintegrate into trunk]
  • 56. System Testing with Jenkins • Concurrently load data while taking snapshots • Inject compactions, kill RSs, META RS, Master • Create snapshot clones of the snapshots • Inject compactions, kill RSs, META RS, Master
  • 57. Future Work • Alternative semantics and implementations • Log Roll Snapshot (HBASE-7291) • Store logs and replay on restore • Faster for snapshot, slower and more complicated for restore • Timestamp Snapshot (HBASE-6866) • All updates before ts are in the snapshot, all after are not • Longer pause before the snapshot is taken • Globally-Consistent Snapshot (HBASE-6867) • Global write lock on all regions until the snapshot completes • Expensive • Repair tools • Manual repairs necessary (hbck does not support this yet)
  • 58. Conclusions
  • 59. Feature Summary by Version Apache 0.92.x Apache <0.94.6.1 Apache 0.94.6.1+ Apache 0.95.0 Apache 0.96.0 Copy Table Copy Table Copy Table Import / Export Import / Export Import / Export Offline snapshots Offline snapshots Flush Online Snapshot Flush Online Snapshot Table Locks
  • 60. Key Contributors • Jesse Yates (Salesforce.com) • HFileArchiver, offline snapshot, first draft of online • Matteo Bertozzi (Cloudera) • HFileLink, restore, clone, testing, 0.94 backport • Jonathan Hsieh (Cloudera) • Online snapshots revamp, testing, branch shepherd • Ted Yu (Hortonworks) • Reviews • Enis Soztutar (Hortonworks) • Table locks on snapshots
  • 61. Thanks! Questions? Matteo Bertozzi @th30z matteo.bertozzi@cloudera.com Jonathan Hsieh @jmhsieh jon@cloudera.com Jesse Yates @jesse_yates jesse.k.yates@gmail.com