• Like
  • Save
HBase Incremental Backup
Upcoming SlideShare
Loading in...5
×
 

HBase Incremental Backup

on

  • 1,632 views

 

Statistics

Views

Total Views
1,632
Views on SlideShare
1,632
Embed Views
0

Actions

Likes
1
Downloads
7
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    HBase Incremental Backup HBase Incremental Backup Presentation Transcript

    • HBase Incremental Backup / Restore2012/07/23
    • How to perform Incremental Backup/Restore?• HBase ships with a handful of useful tools – CopyTable – Export / Import
    • CopyTable• Purpose: – Copy part of or all of a table, either to the same cluster or another cluster• Usage: – bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable [--starttime=X] [-- endtime=Y] [--new.name=NEW] [--peer.adr=ADR] tablename• Options: – starttime: Beginning of the time range. – endtime: End of the time range. Without endtime means starttime to forever. – new.name: New tables name. – peer.adr: Address of the peer cluster given in the format hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeepe r.znode.parent – families: Comma-separated list of ColumnFamilies to copy.
    • CopyTable (cont.)• Limitation – Can only backup to another table (Scan + Put) – While a CopyTable is running, newly inserted or updated rows may occur and these concurrent edits may cause inconsistency.
    • Export• Purpose: – Dump the contents of table to HDFS in a sequence file• Usage: – $ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [[<starttime> [<endtime>]]]• Options: – *tablename: The name of the table to export – *outputdir: The location in HDFS to store the exported data – starttime: Beginning of the time range – endtime: The matching end time for the time range of the scan used
    • Export (cont.)• Limitation – Can only backup to HDFS in a sequence file (Scan + Write to HDFS). – While a CopyTable is running, newly inserted or updated rows may occur and these concurrent edits may cause inconsistency.
    • Import• Purpose: – Load data that has been exported back into HBase• Usage – $ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
    • Conclusion• Regular (ex. Daily) Incremental backup – Use Export and organize output dir as a meaningful hierarchy • /table_name /2012 (year) /07 (month) /01 (date) /02 … /31 /01 (hour) … /24 – Perform Import to restore data on-demand• To reduce the overhead, don’t perform it during the peak time
    • Question?