HBase Incremental Backup
Transcript

  • 1. HBase Incremental Backup / Restore (2012/07/23)
  • 2. How to perform Incremental Backup/Restore?
    • HBase ships with a handful of useful tools:
      – CopyTable
      – Export / Import
  • 3. CopyTable
    • Purpose:
      – Copy part of or all of a table, either to the same cluster or to another cluster.
    • Usage:
      – bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] tablename
    • Options:
      – starttime: Beginning of the time range.
      – endtime: End of the time range. Omitting endtime means from starttime to forever.
      – new.name: Name of the new table.
      – peer.adr: Address of the peer cluster, given in the format hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
      – families: Comma-separated list of column families to copy.
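As a sketch, the options above could be combined into an incremental CopyTable run covering a one-day window. The table names (`usertable`, `usertable_backup`) and the timestamps are hypothetical examples; the script only composes and prints the command rather than running it against a cluster.

```shell
#!/bin/sh
# Sketch: compose an incremental CopyTable command for a one-day window.
# Table names and the time window are hypothetical examples.

TABLE="usertable"                # source table (example name)
BACKUP_TABLE="usertable_backup"  # destination table (assumed to already exist)

# HBase timestamps are milliseconds since the epoch.
START=1342972800000   # 2012-07-23 00:00:00 UTC
END=1343059200000     # 2012-07-24 00:00:00 UTC

CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
--starttime=${START} --endtime=${END} \
--new.name=${BACKUP_TABLE} ${TABLE}"

# Print the command instead of executing it, since this is only a sketch.
echo "$CMD"
```

Scoping the copy with `--starttime`/`--endtime` is what makes the run incremental: only cells whose timestamps fall inside the window are scanned and put into the destination table.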
  • 4. CopyTable (cont.)
    • Limitation:
      – Can only back up to another table (Scan + Put).
      – While CopyTable is running, rows may be inserted or updated concurrently, and these concurrent edits may cause inconsistency.
  • 5. Export
    • Purpose:
      – Dump the contents of a table to HDFS as a sequence file.
    • Usage:
      – $ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<starttime> [<endtime>]]
    • Options:
      – *tablename: The name of the table to export.
      – *outputdir: The location in HDFS to store the exported data.
      – starttime: Beginning of the time range.
      – endtime: The matching end time for the time range of the scan.
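A time-ranged Export for one day of changes might look like the sketch below. The table name and HDFS output directory are hypothetical (the directory follows the year/month/day layout suggested in the conclusion slide), and the script prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: a time-ranged Export capturing one day of changes.
# Table name and output directory are hypothetical examples.

TABLE="usertable"
OUTDIR="/backup/usertable/2012/07/23"   # HDFS path; year/month/day layout

START=1342972800000   # 2012-07-23 00:00:00 UTC, in epoch milliseconds
END=1343059200000     # 2012-07-24 00:00:00 UTC

CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.Export ${TABLE} ${OUTDIR} ${START} ${END}"
echo "$CMD"
```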
  • 6. Export (cont.)
    • Limitation:
      – Can only back up to HDFS as a sequence file (Scan + write to HDFS).
      – While an Export is running, rows may be inserted or updated concurrently, and these concurrent edits may cause inconsistency.
  • 7. Import
    • Purpose:
      – Load data that has been exported back into HBase.
    • Usage:
      – $ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
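Restoring one day of exported data is the mirror of the Export sketch above. The names are again hypothetical, and the destination table is assumed to already exist with the same column families as the original; the script only prints the command.

```shell
#!/bin/sh
# Sketch: restore one day of exported data back into HBase.
# Names are hypothetical; the table is assumed to exist with the
# same column families as the exported one.

TABLE="usertable"
INDIR="/backup/usertable/2012/07/23"

CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.Import ${TABLE} ${INDIR}"
echo "$CMD"
```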
  • 8. Conclusion
    • Regular (e.g. daily) incremental backup:
      – Use Export and organize the output directory as a meaningful hierarchy:
        /table_name
          /2012 (year)
            /07 (month)
              /01 (date)
              /02
              …
              /31
                /01 (hour)
                …
                /24
      – Perform Import to restore data on demand.
    • To reduce the overhead, don't perform backups during peak time.
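A daily job following this scheme could be sketched as below: export the last 24 hours into a year/month/day directory under a backup root. The table name and backup root are hypothetical, and the command is printed rather than executed.

```shell
#!/bin/sh
# Sketch: a daily job that Exports the last 24 hours of changes into a
# year/month/day directory, following the hierarchy above.
# Table name and backup root are hypothetical examples.

TABLE="usertable"
ROOT="/backup/${TABLE}"

# End of the window: now, converted to epoch milliseconds (HBase timestamps).
NOW_S=$(date +%s)
END=$((NOW_S * 1000))
START=$(( (NOW_S - 86400) * 1000 ))   # 24 hours earlier

# Directory named after today's date, e.g. /backup/usertable/2012/07/23
OUTDIR="${ROOT}/$(date +%Y)/$(date +%m)/$(date +%d)"

CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.Export ${TABLE} ${OUTDIR} ${START} ${END}"
echo "$CMD"
```

Scheduling this from cron outside peak hours addresses the deck's final point: Export still scans the table, so the MapReduce job competes with live traffic for cluster resources.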
  • 9. Questions?