HBase Backups
Backups in the Enterprise
Jesse Yates Demai Ni
Jing Chen He
Richard Ding
1 HBase Backups - HBaseCon 2014
Overview
• Commonalities
• IBM BigInsights
• Backups at Salesforce.com
• Summary
Commonalities
• Per-Table Backups
• Stored On HDFS
• Full Backup + Incrementals
• Fast Restore
• Multiple Clusters
• Timestamp file layout
• Manifest Files for additional info
• Merging Backups
IBM BigInsights
Backup Solution - IBM
• Customer Requirements
• Feature Overview
• Technical Design
• User Interface: CLI and Web UI
• Data Structures
Customer Requirements
• Backup and Restore
– Critical requirements from enterprise customers
– General solution
– Easy-to-use user interfaces: CLI and Web UI
– Multiple file systems: HDFS and GPFS*
– Multiple MR frameworks: Hadoop and PSMR*
*GPFS: IBM General Parallel File System
*PSMR: Platform Symphony MapReduce
Feature Overview
• Full Backup based on HBase Snapshot
• Incremental Backup based on HBase transaction logs
• Table-level Incremental Backup
• Point-In-Time Restore
• On-the-fly and Off-line Convert from HLogs to HFiles
• Off-line Merge Backup Images
• Self-contained Backup Image with Manifest File
• Usability features:
– progress, status, and history reports
– purge old Backup Images
Technical Design - Overview
• Object: Backup Image
• Operations:
– Full Backup
– Incremental Backup
– Convert
– Merge
– Restore
Technical Design - Backup Images
Full Backup Table1 (Monday)
Full Backup Table2 (Tuesday)
Incremental Backup [Table1, Table2] (Wednesday)
Incremental Backup [Table1, Table2] (Thursday)
[Diagram: three arrows labeled "depends" connect the backup images]
Technical Design - Full Backup
$ hbase backup create full \
    hdfs://targetCluster.ibm.com:9000/hbasebackups \
    biginsights:hbasecon_table1
Global Distributed WAL Roll
Take Snapshot
Track WAL Timestamp Through ZooKeeper
Export Snapshot
Generate Manifest
Technical Design - Incremental Backup
$ hbase backup create incremental \
    hdfs://targetCluster.ibm.com:9000/hbasebackups
Global Distributed WAL Roll
Track WAL Timestamp Through ZooKeeper
DistCp WAL Logs into Backup Image
Generate Manifest
Technical Design - Restore
$ hbase restore \
    hdfs://targetCluster.ibm.com:9000/hbasebackups \
    biginsights:hbasecon_table1 \
    biginsights:hbasecon_table1_restore
Create Table Pre-Split Using Manifest Info
Bulk Load HFiles Full and Incremental
Play WAL of Unconverted HLogs
Verify Lineage and Restore
Technical Design - Convert
$ hbase backup convert /hbasebackups backup_20140502_2100
Full backup: backup_20140501_2100
Incremental backup: backup_20140502_2100
/hbasebackups/biginsights/hbasecon_table1/
backup_20140501_2100/Metadata+HFiles
backup_20140502_2100/Metadata
/hbasebackups/biginsights/hbasecon_table2/
backup_20140501_2100/Metadata+HFiles
backup_20140502_2100/Metadata
/hbasebackups/WALs/
backup_20140502_2100/HLogs of ALL Tables
Before
Technical Design - Convert
$ hbase backup convert /hbasebackups backup_20140502_2100
Full backup: backup_20140501_2100
Incremental backup: backup_20140502_2100
/hbasebackups/biginsights/hbasecon_table1/
backup_20140501_2100/Metadata+HFiles
backup_20140502_2100/Metadata+HFiles
/hbasebackups/biginsights/hbasecon_table2/
backup_20140501_2100/Metadata+HFiles
backup_20140502_2100/Metadata+HFiles
/hbasebackups/WALs/
backup_20140502_2100/
After
Technical Design - Merge
$ hbase backup merge /hbasebackups biginsights:hbasecon_table1
backup_20140501_2100 backup_20140502_2100
Full backup: backup_20140501_2100
Incremental backup: backup_20140502_2100
/hbasebackups/biginsights/hbasecon_table1/
backup_20140501_2100/
backup_20140502_2100/
/hbasebackups/biginsights/hbasecon_table1/
backup_20140502_2100/
TimeStamp 2
TimeStamp 1
TimeStamp 2
User Interface - CLI
$ hbase backup help
Usage: hbase backup COMMAND
where COMMAND is one of:
create create a new backup
cancel cancel an ongoing backup
delete delete an existing backup
describe show the detailed information of a backup
history show history of all successful backups
status show the status of the latest backup request
convert convert incremental backup WAL files into HFiles
merge merge backup images
stop remove table(s) from backup table set
show show table(s) in backup table set
Enter 'help COMMAND' to see help message for each command
User Interface – Web UI Backup
User Interface – Web UI Restore
Data Structure - Backup Image
• Table Info and Region Info
• Backup Manifest
– Table Name
– Type: Full or Incremental
– Size
– Timestamp Info
– State Info: Converted, Merged, Compacted, etc.
– Dependency Lineage
• Data
– HFiles
– WALs (For Incremental Backup before convert)
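The manifest fields above, and the dependency lineage in particular, can be sketched as a small data structure. This is only an illustration in Python, not the BigInsights implementation: the class `BackupManifest`, the function `restore_order`, the `depends_on` field, and the small integer timestamps are all assumptions made for the example. Only the backup ids and table name come from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BackupManifest:
    """Hypothetical manifest record; fields loosely follow the slide's bullet list."""
    backup_id: str
    table: str
    backup_type: str                 # "full" or "incremental"
    timestamp: int                   # illustrative ordering value, not a real epoch
    depends_on: List[str] = field(default_factory=list)  # dependency lineage


def restore_order(target_id: str, manifests: Dict[str, BackupManifest]) -> List[str]:
    """Return the backup ids needed to restore target_id, dependencies first."""
    ordered: List[str] = []
    seen = set()

    def visit(backup_id: str) -> None:
        if backup_id in seen:
            return
        if backup_id not in manifests:
            raise KeyError(f"lineage broken: missing image {backup_id}")
        seen.add(backup_id)
        for parent in manifests[backup_id].depends_on:
            visit(parent)
        ordered.append(backup_id)

    visit(target_id)
    return ordered


images = {
    "backup_20140501_2100": BackupManifest(
        "backup_20140501_2100", "biginsights:hbasecon_table1", "full", 1),
    "backup_20140502_2100": BackupManifest(
        "backup_20140502_2100", "biginsights:hbasecon_table1", "incremental", 2,
        depends_on=["backup_20140501_2100"]),
}
print(restore_order("backup_20140502_2100", images))
```

Walking the lineage before touching any data is one way to "verify lineage" up front: a missing image surfaces as an error before a restore starts.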
Data Structure - ZooKeeper
/backup/hbase
    startcode {backup marker}
    complete/
        backupId_1 {contains backup metadata}
        ……
        backupId_n
    ongoing {contains the progress status of the current operation}
    failed {contains error code and message of the current operation}
    cancel {triggers a cancel operation}
    incr/
        tablelogtimestamp/
            table_1 {list of region servers and associated log timestamp for this table}
            ……
            table_n
        last-roll-log-ts/
            rs_1 {contains the log timestamp from last roll log}
            ……
            rs_n
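The per-region-server log timestamps kept in ZooKeeper suggest how an incremental backup might decide which WALs to copy. The Python sketch below is a guess at that selection logic, not the actual BigInsights code: the `(region server, timestamp)` tuples stand in for real WAL file listings, and the function name `wals_for_incremental`, its parameters, and the exact boundary conditions are assumptions.

```python
from typing import Dict, List, Tuple

# (region server, WAL timestamp): a stand-in for a real WAL file listing.
Wal = Tuple[str, int]


def wals_for_incremental(
    wals: List[Wal],
    last_backed_up: Dict[str, int],   # assumed: timestamp covered by the previous backup
    rolled_at: Dict[str, int],        # assumed: timestamp recorded at the latest WAL roll
) -> List[Wal]:
    """Select WALs newer than the previous backup and no newer than the latest roll."""
    selected = []
    for server, ts in wals:
        upper = rolled_at.get(server)
        if upper is None:
            continue                  # server did not take part in the roll; skip it
        lower = last_backed_up.get(server, 0)  # never backed up: take everything
        if lower < ts <= upper:
            selected.append((server, ts))
    return sorted(selected)


wals = [("rs_1", 100), ("rs_1", 200), ("rs_1", 300), ("rs_2", 150)]
picked = wals_for_incremental(wals, {"rs_1": 100}, {"rs_1": 250, "rs_2": 200})
print(picked)
```

Rolling the WALs on every region server first gives each server a clean upper bound, so the selection never has to read a log that is still being written.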
Sincere gratitude is hereby extended to the following
developers who contributed to this effort:
Richard Ding, Jing Chen He, Enoch Hsu, Yu Li, Jihong Ma,
Demai Ni, Kan Zhang, Liping Zhang, Xiang Zhou
* ordered by last name
Salesforce.com Backups
Jesse Yates
Safe harbor statement under the Private Securities Litigation Reform Act of 1995: This presentation may contain forward-looking statements
that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the
results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All
statements other than statements of historical fact could be deemed forward-looking, including any projections of subscriber growth,
earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations,
statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or
use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new
functionality for our service, our new business model, our past operating losses, possible fluctuations in our operating results and rate of
growth, interruptions or delays in our Web hosting, breach of our security measures, risks associated with possible mergers and acquisitions,
the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees
and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com
products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial
results of salesforce.com, inc. is included in our annual report on Form 10-K for the most recent fiscal year ended January 31, 2011. This
document and others are available on the SEC Filings section of the Investor Information section of our Web site.
Any unreleased services or features referenced in this or other press releases or public statements are not currently available and may not be
delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently
available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.
Safe Harbor
Salesforce Environment
• Many tenants per cluster
• At least 90 days of recovery
• DR failover to remote DC
• All writes through Phoenix
– Timestamp control
Design Goals
• Validate backups regularly
• Minimize time to restore a tenant
• Validate replication is up to date
• Minimize data storage
Backups
• M/R a table at a given point in time
– Point-in-time view of the table
• Chunked by file size + tenant (per server)
• Chunk manifest
– Chunk info (min/max/hash/tenant ids)
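A rough Python sketch of chunking with a per-chunk manifest follows. It is not the Salesforce job: the row shape, the function name `build_chunk_manifests`, the SHA-256 hash, and the rule that a chunk closes on a tenant change or a size limit are all assumptions. For simplicity each chunk here holds a single tenant, recorded as a one-element `tenant_ids` list.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, str, bytes]  # (tenant id, row key, value): an illustrative shape


def _close(chunk: Dict) -> Dict:
    """Replace the running digest with a hex string so the manifest is plain data."""
    manifest = dict(chunk)
    manifest["hash"] = manifest.pop("digest").hexdigest()
    return manifest


def build_chunk_manifests(rows: Iterable[Row], max_bytes: int) -> List[Dict]:
    """Walk key-sorted rows, starting a new chunk on a tenant change or size limit."""
    manifests: List[Dict] = []
    current = None
    for tenant, key, value in rows:
        size = len(key.encode()) + len(value)
        new_tenant = current is not None and current["tenant_ids"] != [tenant]
        too_big = current is not None and current["bytes"] + size > max_bytes
        if current is None or new_tenant or too_big:
            if current is not None:
                manifests.append(_close(current))
            current = {"tenant_ids": [tenant], "min_key": key, "max_key": key,
                       "bytes": 0, "digest": hashlib.sha256()}
        current["max_key"] = key
        current["bytes"] += size
        current["digest"].update(key.encode() + b"\x00" + value)
    if current is not None:
        manifests.append(_close(current))
    return manifests


rows = [("t1", "a", b"1"), ("t1", "b", b"2"), ("t2", "c", b"3")]
for manifest in build_chunk_manifests(rows, max_bytes=1000):
    print(manifest["tenant_ids"], manifest["min_key"], manifest["max_key"])
```

Keeping the min key, max key, and tenant ids in the manifest means a per-tenant restore can skip chunks without opening them.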
Backups
Key      CF   CQ    TS  Value
user1_a  fam  qual  14  value10
user1_a  fam  qual  12  value5
user1_a  fam  qual  10  value2
user1_a  fam  qual  8   value4
user1_a  fam  qual  3   value13
user1_a  fam  qual  2   value56
1. http://phoenix.incubator.apache.org/
Backups
[Diagram: Some HBase Table, a row of mappers (M), Hadoop Distributed File System]
Backups
• Each backup is an incremental
– Lineage by convention
• Never write too far back in time
• Data retained by custom coprocessor
– Retained up to last successful backup
“Backup isn’t a backup until you’ve restored it
and tested it”
-- Some Ops Guy
Restore + Validation
• Restore each backup to a new table
• Validate that backup has same data as existing table
– Within backup timerange
• Move ‘retained timestamps’ forward
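Validation within the backup time range can be pictured as comparing two scans restricted to the same timestamps. The Python sketch below is an illustration only: tables are modeled as dictionaries, the names `in_range` and `validate_restore` are hypothetical, and the half-open range `[start_ts, end_ts)` is an assumption. The sample cells reuse the `user1_a` rows from the earlier table.

```python
from typing import Dict, Tuple

Cell = Tuple[str, str, str, int]   # (row key, column family, qualifier, timestamp)
Table = Dict[Cell, str]            # cell -> value; a stand-in for a table scan


def in_range(table: Table, start_ts: int, end_ts: int) -> Table:
    """Keep only cells whose timestamp falls in [start_ts, end_ts)."""
    return {cell: value for cell, value in table.items()
            if start_ts <= cell[3] < end_ts}


def validate_restore(live: Table, restored: Table, start_ts: int, end_ts: int) -> bool:
    """True when both tables hold the same cells inside the backup's time range."""
    return in_range(live, start_ts, end_ts) == in_range(restored, start_ts, end_ts)


live = {
    ("user1_a", "fam", "qual", 14): "value10",
    ("user1_a", "fam", "qual", 12): "value5",
    ("user1_a", "fam", "qual", 10): "value2",
    ("user1_a", "fam", "qual", 8): "value4",
    ("user1_a", "fam", "qual", 3): "value13",
    ("user1_a", "fam", "qual", 2): "value56",
}
# A backup assumed to cover timestamps 8 through 12.
restored = {cell: value for cell, value in live.items() if 8 <= cell[3] < 13}
print(validate_restore(live, restored, 8, 13))
```

The comparison only works if the live table still holds every version in the range, which is why retained data cannot be released until the backup covering it has been validated.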
Restore
HDFS
/hbase
…
/salesforce
/backup
/somehbasetable
/03/14/14
backup.properties
chunk1
chunk1.manifest
….
chunk1000
chunk1000.manifest
M
M
M
SomeHBaseTable_Restore
Restore
• Configurable validation percent
– Start high, move lower
• Backup only valid if restore is successful
90 Days of Backup is
LOTS of Data
Even without any duplicates!
Granularity Reduction
• Combine backups every ‘period’
– Week, month, 3 months
– Specified in table metadata
• Keep latest version of the row
• Helpful with lots of updates
– Not useful for unique data (e.g. time series)
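Combining a period's backups while keeping only the latest version can be sketched as a merge keyed by column. This Python example is an assumption-laden illustration rather than the production job: backups are modeled as dictionaries, `combine` is a hypothetical name, and deletes are not handled.

```python
from typing import Dict, Iterable, Tuple

Column = Tuple[str, str, str]             # (row key, column family, qualifier)
Backup = Dict[Column, Tuple[int, str]]    # column -> (timestamp, value)


def combine(backups: Iterable[Backup]) -> Backup:
    """Merge backups, keeping the highest-timestamp version of each column."""
    merged: Backup = {}
    for backup in backups:
        for column, (ts, value) in backup.items():
            if column not in merged or ts > merged[column][0]:
                merged[column] = (ts, value)
    return merged


monday = {("user1_a", "fam", "qual"): (8, "value4")}
tuesday = {("user1_a", "fam", "qual"): (10, "value2"),
           ("user2_b", "fam", "qual"): (11, "first")}
wednesday = {("user1_a", "fam", "qual"): (12, "value5")}

weekly = combine([monday, tuesday, wednesday])
print(len(weekly), weekly[("user1_a", "fam", "qual")])
```

Three versions of `user1_a` collapse into one, which is where the saving comes from on update-heavy tables. A row written once, as in time series data, is carried over unchanged, so the merged backup is no smaller.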
Granularity Reduction
HDFS
/salesforce
/backup
/somehbasetable
/03-14-14
/03-13-14
…
/03-07-14
/03-01_07-14
/02-23_28-14
/02-16_24-14
/02-09_15-14
/01-14
/12-13
/11-13
/base
M
M
M
HDFS
/salesforce
/03-07_14-14
/03-01_07-14
/02-14
/01-14
/12-13
/base
HDFS
Granularity Reduction
HDFS
/salesforce
/backup
/somehbasetable
/03-14-14
/03-13-14
…
/03-07-14
/03-01_07-14
/02-23_28-14
/02-16_24-14
/02-09_15-14
/01-14
M
M
M
Weekly Merge
Monthly Merge
/salesforce
/03-07_14-14
/03-01_07-14
/02-14
/01-14
/12-13
/base
Rebuilt Base
Meanwhile…
Remember that DR site?
Disaster Recovery
Primary Data Center Buddy (DR) Data Center
Validation By Backup
• Validate replication is working
• Validate backup process consistent
• Validate granularity reduction consistent
Validation By Backup
• Build up hash of hashes
– Two level Merkle Tree
• Check that both DCs have the same hash
– Can easily identify differences per-manifest
• Requires time-delay for backups
– <= replication delay
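The hash-of-hashes check can be sketched in a few lines of Python. This is a minimal illustration assuming SHA-256 and a dictionary from chunk manifest name to chunk hash for each data center. The names `chunk_hash`, `combined_hash`, and `differing_chunks` are hypothetical, and the slides do not say which hash function is used.

```python
import hashlib
from typing import Dict, List


def chunk_hash(data: bytes) -> str:
    """Leaf level: hash of one chunk's contents."""
    return hashlib.sha256(data).hexdigest()


def combined_hash(chunk_hashes: Dict[str, str]) -> str:
    """Top level: hash over the chunk hashes, taken in chunk-name order."""
    top = hashlib.sha256()
    for name in sorted(chunk_hashes):
        top.update(bytes.fromhex(chunk_hashes[name]))
    return top.hexdigest()


def differing_chunks(primary: Dict[str, str], buddy: Dict[str, str]) -> List[str]:
    """Chunk manifests whose hashes differ, or that exist in only one data center."""
    names = sorted(set(primary) | set(buddy))
    return [name for name in names if primary.get(name) != buddy.get(name)]


primary = {"chunk1": chunk_hash(b"rows-a"), "chunk2": chunk_hash(b"rows-b")}
buddy = {"chunk1": chunk_hash(b"rows-a"), "chunk2": chunk_hash(b"rows-b-stale")}

if combined_hash(primary) != combined_hash(buddy):
    print("Mismatch in:", differing_chunks(primary, buddy))
```

Comparing the single combined hash is the cheap common case. The per-chunk comparison only runs when the top-level hashes disagree, and it narrows the difference down to individual manifests.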
Hash Validation
Primary Data Center
Backup Manifest
• chunk size
• start time
• end time
• combined hash
• version
Chunk Manifest
• key prefix
• stats
• hash
Chunk Manifest
• key prefix
• stats
• hash
…
Buddy Data Center
Backup Manifest
• chunk size
• start time
• end time
• combined hash
• version
Chunk Manifest
• key prefix
• stats
• hash
Chunk Manifest
• key prefix
• stats
• hash
…
Mismatch!
Tracking Status
• Daily emails
• Progress stored in Phoenix Table
• Easy access for auditing
• Easy display for UI (coming soon)
Future Work
• Extensive tooling around per-tenant restore
• M/R from snapshot
Lessons Learned
• Track Properties
– Version, table, lineage, etc
• Fast Restore is Important
– Consider your business case
• Validation!
Special Thanks
All the members of the Salesforce HBase team,
particularly:
Vasu Mariyala, Sukumar Maddineni, Alex Araujo, Lars
Hofhansl, Ian Varley, Santosh Rau
Summary
• Per-Table Backups
• IBM
– WAL based
– Extra tooling for fast restores
– Extensive lineage tracking
• Salesforce
– M/R over HTable
– Multi-tenant
– Multiple Validation vectors
Thanks!
Questions?
Jesse Yates Demai Ni
Jing Chen He
Richard Ding

More Related Content

What's hot

HBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial IndustryHBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial Industry
HBaseCon
 
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon
 
Data Evolution in HBase
Data Evolution in HBaseData Evolution in HBase
Data Evolution in HBase
HBaseCon
 
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon
 
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataHBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
Cloudera, Inc.
 
A Survey of HBase Application Archetypes
A Survey of HBase Application ArchetypesA Survey of HBase Application Archetypes
A Survey of HBase Application Archetypes
HBaseCon
 
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBaseHBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon
 
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring BudgetHBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
Cloudera, Inc.
 
Taming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementTaming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop Management
DataWorks Summit/Hadoop Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Apache HBase: State of the Union
Apache HBase: State of the UnionApache HBase: State of the Union
Apache HBase: State of the Union
DataWorks Summit/Hadoop Summit
 
Apache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at CernerApache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at Cerner
HBaseCon
 
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
Cloudera, Inc.
 
HBaseCon 2015- HBase @ Flipboard
HBaseCon 2015- HBase @ FlipboardHBaseCon 2015- HBase @ Flipboard
HBaseCon 2015- HBase @ Flipboard
Matthew Blair
 
HBaseCon 2013: Integration of Apache Hive and HBase
HBaseCon 2013: Integration of Apache Hive and HBaseHBaseCon 2013: Integration of Apache Hive and HBase
HBaseCon 2013: Integration of Apache Hive and HBase
Cloudera, Inc.
 
Hadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native EraHadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native Era
DataWorks Summit
 
Real-Time Video Analytics Using Hadoop and HBase (HBaseCon 2013)
Real-Time Video Analytics Using Hadoop and HBase (HBaseCon 2013)Real-Time Video Analytics Using Hadoop and HBase (HBaseCon 2013)
Real-Time Video Analytics Using Hadoop and HBase (HBaseCon 2013)
Suman Srinivasan
 
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
Cloudera, Inc.
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
HBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBaseHBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBase
HBaseCon
 

What's hot (20)

HBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial IndustryHBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial Industry
 
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
 
Data Evolution in HBase
Data Evolution in HBaseData Evolution in HBase
Data Evolution in HBase
 
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
 
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataHBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
 
A Survey of HBase Application Archetypes
A Survey of HBase Application ArchetypesA Survey of HBase Application Archetypes
A Survey of HBase Application Archetypes
 
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBaseHBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
 
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring BudgetHBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
 
Taming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementTaming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop Management
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Apache HBase: State of the Union
Apache HBase: State of the UnionApache HBase: State of the Union
Apache HBase: State of the Union
 
Apache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at CernerApache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at Cerner
 
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
 
HBaseCon 2015- HBase @ Flipboard
HBaseCon 2015- HBase @ FlipboardHBaseCon 2015- HBase @ Flipboard
HBaseCon 2015- HBase @ Flipboard
 
HBaseCon 2013: Integration of Apache Hive and HBase
HBaseCon 2013: Integration of Apache Hive and HBaseHBaseCon 2013: Integration of Apache Hive and HBase
HBaseCon 2013: Integration of Apache Hive and HBase
 
Hadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native EraHadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native Era
 
Real-Time Video Analytics Using Hadoop and HBase (HBaseCon 2013)
Real-Time Video Analytics Using Hadoop and HBase (HBaseCon 2013)Real-Time Video Analytics Using Hadoop and HBase (HBaseCon 2013)
Real-Time Video Analytics Using Hadoop and HBase (HBaseCon 2013)
 
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
HBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBaseHBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBase
 

Viewers also liked

Developing Offline Mobile Apps with Salesforce Mobile SDK SmartStore
Developing Offline Mobile Apps with Salesforce Mobile SDK SmartStoreDeveloping Offline Mobile Apps with Salesforce Mobile SDK SmartStore
Developing Offline Mobile Apps with Salesforce Mobile SDK SmartStore
Tom Gersic
 
6 Reasons to Protect Your Salesforce Data
6 Reasons to Protect Your Salesforce Data6 Reasons to Protect Your Salesforce Data
6 Reasons to Protect Your Salesforce Data
Odaseva
 
Zero-Copy Event-Driven Servers with Netty
Zero-Copy Event-Driven Servers with NettyZero-Copy Event-Driven Servers with Netty
Zero-Copy Event-Driven Servers with Netty
Daniel Bimschas
 
Amebaにおけるログ解析基盤Patriotの活用事例
Amebaにおけるログ解析基盤Patriotの活用事例Amebaにおけるログ解析基盤Patriotの活用事例
Amebaにおけるログ解析基盤Patriotの活用事例
cyberagent
 
HBaseを用いたグラフDB「Hornet」の設計と運用
HBaseを用いたグラフDB「Hornet」の設計と運用HBaseを用いたグラフDB「Hornet」の設計と運用
HBaseを用いたグラフDB「Hornet」の設計と運用
Toshihiro Suzuki
 
HBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBaseHBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBase
Cloudera, Inc.
 
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
Cloudera, Inc.
 
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
Cloudera, Inc.
 
HBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three ActsHBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three Acts
Cloudera, Inc.
 
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUponHBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
Cloudera, Inc.
 
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
Cloudera, Inc.
 
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
Cloudera, Inc.
 
HBaseCon 2013: Rebuilding for Scale on Apache HBase
HBaseCon 2013: Rebuilding for Scale on Apache HBaseHBaseCon 2013: Rebuilding for Scale on Apache HBase
HBaseCon 2013: Rebuilding for Scale on Apache HBase
Cloudera, Inc.
 
HBaseCon 2013: Being Smarter Than the Smart Meter
HBaseCon 2013: Being Smarter Than the Smart MeterHBaseCon 2013: Being Smarter Than the Smart Meter
HBaseCon 2013: Being Smarter Than the Smart Meter
Cloudera, Inc.
 
HBaseCon 2015: DeathStar - Easy, Dynamic, Multi-tenant HBase via YARN
HBaseCon 2015: DeathStar - Easy, Dynamic,  Multi-tenant HBase via YARNHBaseCon 2015: DeathStar - Easy, Dynamic,  Multi-tenant HBase via YARN
HBaseCon 2015: DeathStar - Easy, Dynamic, Multi-tenant HBase via YARN
HBaseCon
 
HBaseCon 2013: 1500 JIRAs in 20 Minutes
HBaseCon 2013: 1500 JIRAs in 20 MinutesHBaseCon 2013: 1500 JIRAs in 20 Minutes
HBaseCon 2013: 1500 JIRAs in 20 Minutes
Cloudera, Inc.
 
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLCHBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
Cloudera, Inc.
 
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
Cloudera, Inc.
 
HBaseCon 2013: Apache HBase on Flash
HBaseCon 2013: Apache HBase on FlashHBaseCon 2013: Apache HBase on Flash
HBaseCon 2013: Apache HBase on Flash
Cloudera, Inc.
 
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
Cloudera, Inc.
 

Viewers also liked (20)

Developing Offline Mobile Apps with Salesforce Mobile SDK SmartStore
Developing Offline Mobile Apps with Salesforce Mobile SDK SmartStoreDeveloping Offline Mobile Apps with Salesforce Mobile SDK SmartStore
Developing Offline Mobile Apps with Salesforce Mobile SDK SmartStore
 
6 Reasons to Protect Your Salesforce Data
6 Reasons to Protect Your Salesforce Data6 Reasons to Protect Your Salesforce Data
6 Reasons to Protect Your Salesforce Data
 
Zero-Copy Event-Driven Servers with Netty
Zero-Copy Event-Driven Servers with NettyZero-Copy Event-Driven Servers with Netty
Zero-Copy Event-Driven Servers with Netty
 
Amebaにおけるログ解析基盤Patriotの活用事例
Amebaにおけるログ解析基盤Patriotの活用事例Amebaにおけるログ解析基盤Patriotの活用事例
Amebaにおけるログ解析基盤Patriotの活用事例
 
HBaseを用いたグラフDB「Hornet」の設計と運用
HBaseを用いたグラフDB「Hornet」の設計と運用HBaseを用いたグラフDB「Hornet」の設計と運用
HBaseを用いたグラフDB「Hornet」の設計と運用
 
HBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBaseHBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBase
 
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
 
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
 
HBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three ActsHBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three Acts
 
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUponHBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
HBase Backups

  • 1. HBase Backups: Backups in the Enterprise
      Jesse Yates, Demai Ni, Jing Chen He, Richard Ding
      HBase Backups - HBaseCon 2014
  • 2. Overview
      • Commonalities
      • IBM BigInsights
      • Backups at Salesforce.com
      • Summary
  • 3. Commonalities
      • Per-Table Backups
      • Stored On HDFS
      • Full Backup + Incrementals
      • Fast Restore
      • Multiple Clusters
      • Timestamp file layout
      • Manifest Files for additional info
      • Merging Backups
  • 4. IBM BigInsights
  • 5. Backup Solution - IBM
      • Customer Requirements
      • Feature Overview
      • Technical Design
      • User Interface: CLI and Web UI
      • Data Structures
  • 6. Customer Requirements
      • Backup and Restore
          – Critical requirements from enterprise customers
          – General solution
          – Easy-to-use user interfaces: CLI and Web UI
          – Multiple file systems: HDFS and GPFS*
          – Multiple MR frameworks: Hadoop and PSMR*
      *GPFS: IBM General Parallel File System
      *PSMR: Platform Symphony MapReduce
  • 7. Feature Overview
      • Full Backup based on HBase Snapshot
      • Incremental Backup based on HBase transaction logs
      • Table-level Incremental Backup
      • Point-In-Time Restore
      • On-the-fly and Off-line Convert from HLogs to HFiles
      • Off-line Merge Backup Images
      • Self-contained Backup Image with Manifest File
      • Usability features:
          – progress, status, and history reports
          – purge old Backup Images
  • 8. Technical Design - Overview
      • Object: Backup Image
      • Operations:
          – Full Backup
          – Incremental Backup
          – Convert
          – Merge
          – Restore
  • 9. Technical Design - Backup Images
      • Full Backup Table1 (Monday)
      • Full Backup Table2 (Tuesday)
      • Incremental Backup [Table1, Table2] (Wednesday)
      • Incremental Backup [Table1, Table2] (Thursday)
      [Diagram: three arrows labelled "depends" connect the backup images.]
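The dependency chain on slide 9 suggests that restoring from an incremental image means first collecting every image it depends on. The sketch below is only an illustration of that idea, not the BigInsights implementation. The image names are hypothetical, and the lineage map is one plausible reading of the diagram (the Wednesday incremental depending on both full backups, the Thursday incremental on Wednesday's), since the transcript does not say which image each arrow points to.

```python
from typing import Dict, List


def restore_order(image: str, depends_on: Dict[str, List[str]]) -> List[str]:
    """Return every image needed to restore `image`, dependencies first."""
    ordered: List[str] = []
    seen = set()

    def visit(name: str) -> None:
        # Skip images already collected so shared parents appear once.
        if name in seen:
            return
        seen.add(name)
        for parent in depends_on.get(name, []):
            visit(parent)
        ordered.append(name)

    visit(image)
    return ordered


# Hypothetical lineage; the arrow targets are an assumption.
lineage = {
    "full_table1_mon": [],
    "full_table2_tue": [],
    "incr_wed": ["full_table1_mon", "full_table2_tue"],
    "incr_thu": ["incr_wed"],
}

print(restore_order("incr_thu", lineage))
```

Under this reading, a restore to Thursday's state would touch all four images, which is the kind of chain the Merge operation later shortens.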
  • 10. Technical Design - Full Backup
      $ hbase backup create full hdfs://targetCluster.ibm.com:9000/hbasebackups biginsights:hbasecon_table1
      Steps shown in the diagram:
      • Global Distributed WAL Roll
      • Take Snapshot
      • Track WAL Timestamp Through ZooKeeper
      • Export Snapshot
      • Generate Manifest
  • 11. Technical Design - Incremental Backup
      $ hbase backup create incremental hdfs://targetCluster.ibm.com:9000/hbasebackups
      Steps shown in the diagram:
      • Global Distributed WAL Roll
      • Track WAL Timestamp Through ZooKeeper
      • DistCp WAL Logs into Backup Image
      • Generate Manifest
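The slides do not spell out how the tracked WAL timestamps decide which logs go into an incremental image. One plausible reading is that each region server's timestamp from the previous roll marks a boundary, and only newer WALs are copied. The sketch below illustrates that selection under this assumption; the `Wal` type, the function name, and the sample paths are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Wal(NamedTuple):
    region_server: str
    timestamp: int  # log timestamp, e.g. milliseconds since the epoch
    path: str


def wals_to_copy(wals: List[Wal], last_roll_ts: Dict[str, int]) -> List[Wal]:
    """Keep only WALs newer than the last recorded roll for their region server.

    Region servers with no recorded timestamp are treated as never backed up.
    """
    return [w for w in wals if w.timestamp > last_roll_ts.get(w.region_server, 0)]


# Hypothetical sample data.
wals = [
    Wal("rs_1", 100, "/wals/rs_1.100"),
    Wal("rs_1", 200, "/wals/rs_1.200"),
    Wal("rs_2", 150, "/wals/rs_2.150"),
]
last_roll = {"rs_1": 100, "rs_2": 100}

selected = wals_to_copy(wals, last_roll)
print([w.path for w in selected])
```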
  • 12. Technical Design - Restore
      $ hbase restore hdfs://targetCluster.ibm.com:9000/hbasebackups biginsights:hbasecon_table1 biginsights:hbasecon_table1_restore
      Steps shown in the diagram:
      • Create Table Pre-Split Using Manifest Info
      • Bulk Load HFiles Full and Incremental
      • Play WAL of Unconverted HLogs
      • Verify Lineage and Restore
  • 13. Technical Design - Convert (Before)
      $ hbase backup convert /hbasebackups backup_20140502_2100
      Full backup: backup_20140501_2100
      Incremental backup: backup_20140502_2100
      /hbasebackups/biginsights/hbasecon_table1/
          backup_20140501_2100/    Metadata+HFiles
          backup_20140502_2100/    Metadata
      /hbasebackups/biginsights/hbasecon_table2/
          backup_20140501_2100/    Metadata+HFiles
          backup_20140502_2100/    Metadata
      /hbasebackups/WALs/
          backup_20140502_2100/    HLogs of ALL Tables
  • 14. Technical Design - Convert (After)
      $ hbase backup convert /hbasebackups backup_20140502_2100
      Full backup: backup_20140501_2100
      Incremental backup: backup_20140502_2100
      /hbasebackups/biginsights/hbasecon_table1/
          backup_20140501_2100/    Metadata+HFiles
          backup_20140502_2100/    Metadata+HFiles
      /hbasebackups/biginsights/hbasecon_table2/
          backup_20140501_2100/    Metadata+HFiles
          backup_20140502_2100/    Metadata+HFiles
      /hbasebackups/WALs/
          backup_20140502_2100/
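The directory listings on slides 13 and 14 follow a regular pattern: backup root, then namespace, then table, then backup ID. A minimal sketch of that layout is given below. The helper names are hypothetical, and reading the ID as `backup_YYYYMMDD_HHMM` is an assumption based only on the two example IDs in the slides.

```python
import posixpath
from datetime import datetime


def image_dir(root: str, table: str, backup_id: str) -> str:
    """Build the per-table image directory from a 'namespace:table' name."""
    namespace, name = table.split(":", 1)
    return posixpath.join(root, namespace, name, backup_id)


def backup_time(backup_id: str) -> datetime:
    """Parse the timestamp that the backup ID appears to encode (assumed format)."""
    return datetime.strptime(backup_id, "backup_%Y%m%d_%H%M")


print(image_dir("/hbasebackups", "biginsights:hbasecon_table1", "backup_20140502_2100"))
print(backup_time("backup_20140501_2100") < backup_time("backup_20140502_2100"))
```

Because the IDs sort by time under this assumption, a tool could order a table's images without opening their manifests.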
  • 15. Technical Design - Merge
      $ hbase backup merge /hbasebackups biginsights:hbasecon_table1 backup_20140501_2100 backup_20140502_2100
      Full backup: backup_20140501_2100
      Incremental backup: backup_20140502_2100
      Layout with both images:
      /hbasebackups/biginsights/hbasecon_table1/
          backup_20140501_2100/
          backup_20140502_2100/
      Layout with the merged image:
      /hbasebackups/biginsights/hbasecon_table1/
          backup_20140502_2100/
      Diagram labels: TimeStamp 1, TimeStamp 2
  • 16. User Interface - CLI
      $ hbase backup help
      Usage: hbase backup COMMAND
      where COMMAND is one of:
        create      create a new backup
        cancel      cancel an ongoing backup
        delete      delete an existing backup
        describe    show the detailed information of a backup
        history     show history of all successful backups
        status      show the status of the latest backup request
        convert     convert incremental backup WAL files into HFiles
        merge       merge backup images
        stop        remove table(s) from backup table set
        show        show table(s) in backup table set
      Enter 'help COMMAND' to see help message for each command
  • 17. User Interface – Web UI Backup
  • 18. User Interface – Web UI Restore
  • 19. Data Structure - Backup Image • Table Info and Region Info • Backup Manifest – Table Name – Type: Full or Incremental – Size – Timestamp Info – State Info: Converted, Merged, Compacted, etc. – Dependency Lineage • Data – HFiles – WALs (For Incremental Backup before convert)
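To make the manifest fields concrete, here is a minimal Python sketch of a manifest record and of how a dependency lineage could be walked to decide which images to apply on restore. The class name, field names, and the `restore_order` helper are assumptions made for illustration; the slides list the manifest contents but not the on-disk format or the restore algorithm.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BackupManifest:
    """Hypothetical record mirroring the manifest fields listed on the slide."""
    backup_id: str
    table_name: str
    backup_type: str                 # "FULL" or "INCREMENTAL"
    size_bytes: int
    timestamp: int
    state: str = "NONE"              # e.g. "CONVERTED", "MERGED", "COMPACTED"
    depends_on: List[str] = field(default_factory=list)  # dependency lineage


def restore_order(manifests: Dict[str, BackupManifest], backup_id: str) -> List[str]:
    """Return image ids to apply, dependencies first, ending with backup_id."""
    order: List[str] = []

    def visit(bid: str) -> None:
        if bid in order:
            return
        for dep in manifests[bid].depends_on:
            visit(dep)
        order.append(bid)

    visit(backup_id)
    return order


manifests = {
    "backup_20140501_2100": BackupManifest(
        "backup_20140501_2100", "hbasecon_table1", "FULL", 1024, 1),
    "backup_20140502_2100": BackupManifest(
        "backup_20140502_2100", "hbasecon_table1", "INCREMENTAL", 128, 2,
        depends_on=["backup_20140501_2100"]),
}
chain = restore_order(manifests, "backup_20140502_2100")
```

Restoring the incremental image therefore first applies the full image it depends on.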
  • 20. Data Structure - ZooKeeper
    /backup/hbase
      startcode {backup marker}
      complete/
        backupId_1 {contains backup metadata}
        ……
        backupId_n
      ongoing {contains the progress status of the current operation}
      failed {contains error code and message of the current operation}
      cancel {triggers a cancel operation}
      incr/
        tablelogtimestamp/
          table_1 {list of region servers and associated log timestamp for this table}
          ……
          table_n
        last-roll-log-ts/
          rs_1 {contains the log timestamp from last roll log}
          ……
          rs_n
  • 21. Sincere gratitude is hereby extended to the following developers who contributed to this effort: Richard Ding, Jing Chen He, Enoch Hsu, Yu Li, Jihong Ma, Demai Ni, Kan Zhang, Liping Zhang, Xiang Zhou (ordered by last name)
  • 22. Salesforce.com Backups (Jesse Yates)
  • 23. Safe harbor statement under the Private Securities Litigation Reform Act of 1995: This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services. The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, risks associated with possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-K for the most recent fiscal year ended January 31, 2011. This document and others are available on the SEC Filings section of the Investor Information section of our Web site. 
Any unreleased services or features referenced in this or other press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.
  • 24. Salesforce Environment • Many tenants per cluster • At least 90 days of recovery • DR failover to remote DC • All writes through Phoenix – Timestamp control
  • 25. Design Goals • Validate backups regularly • Minimize time to restore a tenant • Validate replication is up to date • Minimize data storage
  • 26. Backups • M/R a table at a given point in time – Point-in-time view of the table • Chunked by file size + tenant (per server) • Chunk manifest – Chunk info (min/max/hash/tenant ids)
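The chunk manifest described above (min/max/hash/tenant ids) can be sketched in a few lines of Python. This is an assumption-laden illustration: the row layout `(tenant_id, row_key, value)`, the function name `build_chunk_manifest`, and the choice of SHA-256 are all hypothetical, since the slides do not say how the hash is computed or how a manifest is serialized.

```python
import hashlib
from typing import Dict, List, Tuple

# Hypothetical row layout: (tenant_id, row_key, value)
Row = Tuple[str, str, bytes]


def build_chunk_manifest(rows: List[Row]) -> Dict[str, object]:
    """Summarize one chunk: key range, tenants present, and a content hash."""
    digest = hashlib.sha256()
    # Sort first so the hash does not depend on the order rows were written.
    for tenant, key, value in sorted(rows):
        for part in (tenant.encode(), key.encode(), value):
            digest.update(part)
            digest.update(b"\x00")  # separator so field boundaries stay distinct
    keys = [key for _, key, _ in rows]
    return {
        "min_key": min(keys),
        "max_key": max(keys),
        "tenant_ids": sorted({tenant for tenant, _, _ in rows}),
        "hash": digest.hexdigest(),
    }


rows = [
    ("tenant2", "user9_b", b"v3"),
    ("tenant1", "user1_a", b"v1"),
    ("tenant1", "user4_c", b"v2"),
]
manifest = build_chunk_manifest(rows)
```

Sorting before hashing is a design choice of this sketch: it lets two sites compare chunk hashes even if their mappers emitted rows in a different order.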
  • 27. Backups
    Key      CF   CQ    TS  Value
    user1_a  fam  qual  14  value10
    user1_a  fam  qual  12  Value5
    user1_a  fam  qual  10  Valu2
    user1_a  fam  qual  8   value4
    user1_a  fam  qual  3   value13
    user1_a  fam  qual  2   value56
    1. http://phoenix.incubator.apache.org/
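The point-in-time view over these versions can be illustrated with a short Python sketch. It assumes the view returns, per column, the newest version at or before the chosen timestamp; the function name `point_in_time_view` is hypothetical and the real backup runs as a MapReduce job over the table, not over an in-memory list.

```python
from typing import Dict, List, Tuple

# (key, column family, column qualifier, timestamp, value), copied from the slide
CELLS: List[Tuple[str, str, str, int, str]] = [
    ("user1_a", "fam", "qual", 14, "value10"),
    ("user1_a", "fam", "qual", 12, "Value5"),
    ("user1_a", "fam", "qual", 10, "Valu2"),
    ("user1_a", "fam", "qual", 8, "value4"),
    ("user1_a", "fam", "qual", 3, "value13"),
    ("user1_a", "fam", "qual", 2, "value56"),
]


def point_in_time_view(cells, as_of_ts: int) -> Dict[Tuple[str, str, str], Tuple[int, str]]:
    """For each column, keep the newest version with timestamp <= as_of_ts."""
    latest: Dict[Tuple[str, str, str], Tuple[int, str]] = {}
    for key, cf, cq, ts, value in cells:
        if ts > as_of_ts:
            continue  # written after the backup's point in time; invisible to it
        column = (key, cf, cq)
        if column not in latest or ts > latest[column][0]:
            latest[column] = (ts, value)
    return latest


view = point_in_time_view(CELLS, as_of_ts=11)
```

With `as_of_ts=11`, the versions at timestamps 12 and 14 are ignored and the version at timestamp 10 is the one the backup sees.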
  • 28. Backups [Diagram: mappers (M) between Some HBase Table and the Hadoop Distributed File System]
  • 29. Backups • Each backup is an incremental – Lineage by convention • Never write too far back in time • Data retained by custom coprocessor – Retained up to last successful backup
  • 30. “Backup isn’t a backup until you’ve restored it and tested it” -- Some Ops Guy
  • 31. Restore + Validation • Restore each backup to a new table • Validate that the backup has the same data as the existing table – Within backup timerange • Move ‘retained timestamps’ forward
  • 32. Restore
    HDFS
    /hbase
      …
    /salesforce
      /backup
        /somehbasetable
          /03/14/14
            backup.properties
            chunk1
            chunk1.manifest
            ….
            chunk1000
            chunk1000.manifest
    [Diagram: mappers (M) between the backup directory and SomeHBaseTable_Restore]
  • 33. Restore • Configurable validation percent – Start high, move lower • Backup only valid if restore is successful
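A configurable validation percent could work along the following lines. This Python sketch is purely illustrative: it assumes rows are sampled deterministically by hashing the row key, and both `selected_for_validation` and `validate` are hypothetical helpers; the slides do not describe how the sample is actually chosen.

```python
import hashlib
from typing import Dict, List


def selected_for_validation(row_key: str, percent: float) -> bool:
    """Deterministically pick roughly `percent` percent of row keys."""
    bucket = int(hashlib.md5(row_key.encode()).hexdigest(), 16) % 10000
    return bucket < percent * 100


def validate(restored: Dict[str, str], live: Dict[str, str], percent: float) -> List[str]:
    """Return sampled row keys whose restored value differs from the live table."""
    return sorted(
        key for key, value in restored.items()
        if selected_for_validation(key, percent) and live.get(key) != value
    )


live = {"row1": "a", "row2": "b", "row3": "c"}
restored = {"row1": "a", "row2": "CORRUPT", "row3": "c"}
mismatches = validate(restored, live, percent=100.0)
```

Starting at a high percentage checks (nearly) every row; lowering it later trades coverage for a cheaper validation run.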
  • 34. 90 Days of Backup is LOTS of Data. Even without any duplicates!
  • 35. Granularity Reduction • Combine backups every ‘period’ – Week, month, 3 months – Specified in table metadata • Keep latest version of the row • Helpful with lots of updates – Not useful for unique data (e.g. time series)
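The "keep latest version of the row" rule can be sketched as follows. The cell layout `(row, column, ts, value)` and the function name `reduce_granularity` are assumptions for illustration; the real process runs over backup chunks on HDFS rather than Python lists.

```python
from typing import Dict, Iterable, List, Tuple

CellT = Tuple[str, str, int, str]  # (row, column, ts, value)


def reduce_granularity(backups: Iterable[List[CellT]]) -> List[CellT]:
    """Combine several backups, keeping only the newest version of each column."""
    latest: Dict[Tuple[str, str], Tuple[int, str]] = {}
    for backup in backups:
        for row, column, ts, value in backup:
            key = (row, column)
            if key not in latest or ts > latest[key][0]:
                latest[key] = (ts, value)
    return sorted((row, col, ts, val) for (row, col), (ts, val) in latest.items())


monday = [("user1", "fam:qual", 3, "v1"), ("event_001", "fam:qual", 3, "e1")]
tuesday = [("user1", "fam:qual", 8, "v2"), ("event_002", "fam:qual", 8, "e2")]
weekly = reduce_granularity([monday, tuesday])
```

The frequently updated `user1` row collapses to one version, while the two uniquely keyed event rows both survive, which is why the technique saves little for time-series data.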
  • 36. Granularity Reduction
    HDFS
    /salesforce
      /backup
        /somehbasetable
          /03-14-14
          /03-13-14
          …
          /03-07-14
          /03-01_07-14
          /02-23_28-14
          /02-16_24-14
          /02-09_15-14
          /01-14
          /12-13
          /11-13
          /base
    [Diagram: mappers (M) between the two layouts]
    HDFS
    /salesforce
      /03-07_14-14
      /03-01_07-14
      /02-14
      /01-14
      /12-13
      /base
  • 37. Granularity Reduction
    HDFS
    /salesforce
      /backup
        /somehbasetable
          /03-14-14
          /03-13-14
          …
          /03-07-14
          /03-01_07-14
          /02-23_28-14
          /02-16_24-14
          /02-09_15-14
          /01-14
    [Diagram: mappers (M) between the two layouts; labels: Weekly Merge, Monthly Merge, Rebuilt Base]
    HDFS
    /salesforce
      /03-07_14-14
      /03-01_07-14
      /02-14
      /01-14
      /12-13
      /base
  • 38. Meanwhile… Remember that DR site?
  • 39. Disaster Recovery [Diagram: Primary Data Center and Buddy (DR) Data Center]
  • 40. Validation By Backup • Validate replication is working • Validate backup process consistent • Validate granularity reduction consistent
  • 41. Validation By Backup • Build up hash of hashes – Two level Merkle Tree • Check that both DCs have the same hash – Can easily identify differences per-manifest • Requires time-delay for backups – <= replication delay
  • 42. Hash Validation
    Primary Data Center
      Backup Manifest: chunk size, start time, end time, combined hash, version
      Chunk Manifest: key prefix, stats, hash
      Chunk Manifest: key prefix, stats, hash
      …
    Buddy Data Center
      Backup Manifest: chunk size, start time, end time, combined hash, version
      Chunk Manifest: key prefix, stats, hash
      Chunk Manifest: key prefix, stats, hash
      …
    Mismatch!
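The two-level hash comparison between data centers can be sketched in Python as below. The sketch assumes each chunk manifest is reduced to a `key prefix -> hash` mapping and that the combined hash is a SHA-256 over the sorted entries; `combined_hash` and `find_mismatches` are hypothetical names, and the actual hash function and manifest format are not given in the slides.

```python
import hashlib
from typing import Dict, List


def combined_hash(chunks: Dict[str, str]) -> str:
    """Top level of the two-level tree: hash over all (key prefix, chunk hash) pairs."""
    digest = hashlib.sha256()
    for prefix in sorted(chunks):
        digest.update(prefix.encode())
        digest.update(b"\x00")
        digest.update(chunks[prefix].encode())
        digest.update(b"\x00")
    return digest.hexdigest()


def find_mismatches(primary: Dict[str, str], buddy: Dict[str, str]) -> List[str]:
    """Compare combined hashes first; only on mismatch look at individual chunks."""
    if combined_hash(primary) == combined_hash(buddy):
        return []
    return sorted(
        prefix for prefix in primary.keys() | buddy.keys()
        if primary.get(prefix) != buddy.get(prefix)
    )


def chunk_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


primary = {"user1": chunk_hash(b"rows-a"), "user2": chunk_hash(b"rows-b")}
buddy = {"user1": chunk_hash(b"rows-a"), "user2": chunk_hash(b"rows-B-stale")}
diff = find_mismatches(primary, buddy)
```

In the common case only the single combined hash is compared; the per-chunk hashes are consulted only to locate which manifest differs.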
  • 43. Tracking Status • Daily emails • Progress stored in Phoenix Table • Easy access for auditing • Easy display for UI (coming soon)
  • 44. Future Work • Extensive tooling around per-tenant restore • M/R from snapshot
  • 45. Lessons Learned • Track Properties – Version, table, lineage, etc • Fast Restore is Important – Consider your business case • Validation!
  • 46. Special Thanks All the members of the Salesforce HBase team, particularly: Vasu Mariyala, Sukumar Maddineni, Alex Araujo, Lars Hofhansl, Ian Varley, Santosh Rau
  • 47. Summary • Per-Table Backups • IBM – WAL based – Extra tooling for fast restores – Extensive lineage tracking • Salesforce – M/R over HTable – Multi-tenant – Multiple Validation vectors
  • 48. Thanks! Questions? Jesse Yates, Demai Ni, Jing Chen He, Richard Ding

Editor's Notes

  1. Provides a snapshot of the table from time 11 backwards. Even if the client is still writing to the table, the backup won’t see any of those updates. Caveat: special coprocessors (CPs) ensure we don’t lose data that we haven’t backed up yet, at the cost of some extra versions every day.