HBase Cross-site BigTable
Cross-Site Big Table using HBase
Anoop Sam John, Du Jingcheng, Ramkrishna S. Vasudevan
Big Data US Research And Development, Intel
Partitioning Rule
• A rule that parses row keys and maps records to
different clusters. The rule is provided by a ClusterLocator,
which is recorded in the central ZooKeeper (ZK)
– PrefixClusterLocator
– SuffixClusterLocator
– …
• An example of PrefixClusterLocator
– If a row key is “clusterA,rowKey1”, then this record belongs
to clusterA
Motivation
• Growing demand for storing data across geographically
distributed data centers.
– Data and data patterns are similar across data centers.
– But the data is private to each data center.
• Improve data availability and disaster recovery.
• An easy way to access this distributed data.
• Manage the hierarchical relationship between data centers (grouping of
data centers).
Use Case
Intelligent Transportation System
• Monitors traffic movements, traffic patterns etc. in every city.
• Data in every data center is private and holds the traffic patterns of that city.
• Hierarchy of departments - National Transportation Department/State
Transportation Department/City Transportation Department
• National/State Transportation Department – Virtual node
• Helps to aggregate results/statistics over all the data centers.
• Easy access and single point of view of all the data centers.
Agenda
• Goals of CSBT[1]
• Architecture of CSBT
– Highly Available Global ZooKeeper Quorum
– Cross-Site Metadata in Global ZooKeeper
– Cluster Locator
– Hierarchy
• Admin Operations on CSBT
• Read/Write operations on CrossSiteHTable
• Data Replication and FailOver
• Future Improvements
[1] – CSBT refers to Cross-Site Big Table
Goals
• A global view for tables across different data centers.
• Define and manage the hierarchy relationship between data centers
and data.
• High availability
• Locality – In terms of geography
– Each data center holds its own data.
Architecture
• A dedicated ZooKeeper quorum distributed across data centers – the
Global ZooKeeper
• Table partitioning
– Each data center holds a specific partition of the table
– Every partition of the cross-site HTable is itself an HTable, named
“<tableName>_<clustername>”
– The partitioning rule is set by the user at table creation time using
Cluster Locators.
• Supports all admin and table operations that are supported on HTable.
Apache ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services (Source: http://zookeeper.apache.org/)
Architecture
• Data center relationship
– Allows each data center to configure its peer data centers (for
replication and read failover)
• Peers for a cluster can also be nodes in another cluster.
• Master-master, master-slave replication, etc.
– Uses the asynchronous replication of Apache HBase
• WAL entries are written asynchronously to the configured peers
– A hierarchy can be defined for the data centers
Highly Available Global ZooKeeper Quorum
• Dedicated ZooKeeper cluster.
• Split the ZooKeeper quorum across data centers.
• It is recommended not to reuse the ZooKeeper clusters of the individual HBase setups.
• Leverage ZooKeeper Observers
– Observers do not impact ZooKeeper write performance
– Configure them so that reads are served locally
– Configure the ZooKeeper quorum as
<local observers>,<leader/followers>,<observers in other DCs>
Observers are non-voting members of an ensemble which only hear the results of votes.
Observers may be used to talk to an Apache ZooKeeper server from another data center.
Clients of the Observer will see fast reads, as all reads are served locally, and writes result in minimal network traffic as the number of messages required in the absence of the vote protocol is smaller (Source: http://zookeeper.apache.org/)
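The quorum ordering above can be sketched as a small helper. This is only an illustration: the function name `build_quorum` and all host names are hypothetical and are not part of CSBT or ZooKeeper.

```python
def build_quorum(local_observers, voters, remote_observers):
    """Return a comma-separated quorum string ordered as:
    local observers, then leader/followers, then observers in other DCs."""
    ordered = list(local_observers) + list(voters) + list(remote_observers)
    # Remove duplicate hosts while keeping the first occurrence of each.
    unique = list(dict.fromkeys(ordered))
    return ",".join(unique)


# Hypothetical hosts as seen from a client in data center "dc-a".
quorum = build_quorum(
    ["obs1.dc-a.example:2181"],
    ["zk1.dc-a.example:2181", "zk2.dc-b.example:2181", "zk3.dc-c.example:2181"],
    ["obs1.dc-b.example:2181"],
)
print(quorum)
```

Each data center would build its own string, since "local" differs per site.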
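Cross-Site Metadata in Global ZooKeeper
[Diagram: znode tree under the CrossSite root]
• CrossSite
– clusters: cluster1, cluster2, cluster3
• per-cluster znodes: address, hierarchy, peers
– tables: table1, table2, table3
• per-table znodes: state, splitkeys, desc, proposed_desc, locator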
Cluster Locator
• Sets the data partitioning logic/rule.
• Helps to locate a specific cluster based on the row key.
• Users can create their own cluster locators. Provided locators include:
– PrefixClusterLocator – <clustername>,<row>
where “,” is the delimiter
– SuffixClusterLocator – <row>,<xxx>,<yyy>,<clustername>
where “,” is the delimiter and the cluster name is always the string that
appears after the last occurrence of the delimiter.
• Note that it is up to the user to put the cluster name in the row key when
doing ‘puts’, according to the cluster locator configured at table creation.
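The two locator rules can be illustrated with a minimal Python sketch. The class names mirror the slide, but the code is an assumption about the behaviour described here, not the actual CSBT (Java) implementation; the `cluster_name` method name is hypothetical.

```python
class PrefixClusterLocator:
    """Cluster name is the text before the first delimiter."""

    def __init__(self, delimiter=","):
        self.delimiter = delimiter

    def cluster_name(self, row_key):
        name, sep, _ = row_key.partition(self.delimiter)
        if not sep or not name:
            raise ValueError(f"no cluster prefix in row key: {row_key!r}")
        return name


class SuffixClusterLocator:
    """Cluster name is the text after the last delimiter."""

    def __init__(self, delimiter=","):
        self.delimiter = delimiter

    def cluster_name(self, row_key):
        _, sep, name = row_key.rpartition(self.delimiter)
        if not sep or not name:
            raise ValueError(f"no cluster suffix in row key: {row_key!r}")
        return name


prefix_result = PrefixClusterLocator().cluster_name("clusterA,rowKey1")
suffix_result = SuffixClusterLocator().cluster_name("row,xxx,yyy,clusterB")
```

A row key without the delimiter is rejected, because the record could not be mapped to any cluster.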
Hierarchy
• A parent-child relationship can be defined for clusters
• A node in the hierarchy can be either a physical cluster or a virtual
one. A virtual node may represent a logical grouping of a set of
physical clusters
• The hierarchy is used when scanning
– If a parent node is specified, all its descendants are also included
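How a scan might expand a hierarchy node into physical clusters can be sketched as follows. The data structures (`children`, `virtual`) and the function name are hypothetical; the sketch assumes the hierarchy is a tree with no cycles.

```python
def clusters_to_scan(children, virtual, node):
    """Return the physical clusters covered by `node`: the node itself
    (unless it is virtual) plus all of its physical descendants."""
    result = []
    stack = [node]
    while stack:
        current = stack.pop()
        if current not in virtual:
            result.append(current)
        # Push children in reverse so they are visited in listed order.
        stack.extend(reversed(children.get(current, [])))
    return result


# Hypothetical hierarchy: one virtual node grouping three physical clusters.
children = {"California": ["SFO", "LA", "San Diego"]}
virtual = {"California"}
california_clusters = clusters_to_scan(children, virtual, "California")
single_cluster = clusters_to_scan(children, virtual, "LA")
```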
Admin operations on Cross-site BigTable
Admin Operations
• Operations are performed using CrossSiteHBaseAdmin
• CrossSiteHBaseAdmin extends HBaseAdmin.
Create Peers
• Specifying peers creates the peers under the ‘peers’ node
• The address of each peer is written as data in the peer znodes
[Diagram labels: cluster1, peers, cluster2, cluster3, cluster4]
Create Table
[Diagram: CSBTAdmin works against the Global ZK, Cluster a01 (table T1_a01), Cluster a02 (table T1_a02), Peer1 (backup of T1_a01), and Peer2 (backup of T1_a02). Peer mapping in the Global ZK: Cluster a01->Peer1, Cluster a02->Peer2]
1. Create the table znode in ZK.
2. Create the tables in the clusters.
3. Create the tables in the peers, if any.
4. Write the table-related data to the table’s znode and update the state in ZK.
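The four create-table steps can be walked through with in-memory stand-ins. Everything here is a hypothetical sketch: plain dicts and sets replace the Global ZK, the clusters, and the peers, and the state names `CREATING` and `ENABLED` are assumptions, since the slide does not name the states used during creation.

```python
def create_table(zk, clusters, peers, table, descriptor):
    """zk: dict standing in for the Global ZK.
    clusters / peers: dict of cluster name -> set of table names."""
    # 1. Create the table znode in ZK (state name is an assumption).
    zk[table] = {"state": "CREATING"}
    # 2. Create one partition table per cluster: <tableName>_<clustername>.
    for cluster, tables in clusters.items():
        tables.add(f"{table}_{cluster}")
    # 3. Create the backup table in the peer of each cluster, if any.
    for cluster, peer_tables in peers.items():
        peer_tables.add(f"{table}_{cluster}")
    # 4. Write the table data to the znode and update the state.
    zk[table].update(desc=descriptor, state="ENABLED")


zk = {}
clusters = {"a01": set(), "a02": set()}
peers = {"a01": set(), "a02": set()}  # Peer1 backs a01, Peer2 backs a02
create_table(zk, clusters, peers, "T1", {"families": ["cf"]})
```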
Disable Table
[Diagram: same clusters, peers, and Global ZK as under Create Table]
1. Update the state to DISABLING.
2. Disable the tables in the clusters.
3. Update the state to DISABLED.
• Do NOT disable the tables in the peers. Because replication is asynchronous, disabling a peer table may stop the entire replication while some WAL entries are still waiting to be replicated.
Enable Table
[Diagram: same clusters, peers, and Global ZK as under Create Table]
1. Update the state to ENABLING.
2. Enable the tables in the clusters.
3. Handle TableNotDisabledException, as the peers are already ENABLED.
4. Update the state to ENABLED.
Alter Schema
[Diagram: same clusters, peers, and Global ZK as under Create Table]
Steps against the Global ZK:
1. Write the new HTD (HTableDescriptor) to the PROPOSED_DESC znode.
2. Update the state to MODIFYING/ADDING xxx.
3. Update the table’s HTD znode.
4. Update the table state to DISABLED.
5. Delete the PROPOSED_DESC znode.
Steps against the clusters and peers (numbered as in the diagram):
3. Alter the schema in the clusters.
4. Add/modify the column in the peers by disabling the table, and enable it after completion. If the table is not present, create it with the new HTD.
Delete Table
[Diagram: same clusters, peers, and Global ZK as under Create Table]
1. Update the state to DELETING.
2. Delete the tables in the clusters.
3. Disable and delete the tables in the peers.
4. Remove the table from ZK.
Failure handling
• Failures of create/enable/disable/delete table are handled by
using the ZK states. On any failure, the entire operation has to be retried.
• A tool helps to detect and auto-correct inconsistencies in the
CSBT cluster in terms of table state.
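The retry-the-whole-operation idea can be sketched with a disable flow. This is a hypothetical illustration: a dict stands in for the ZK state, `flaky_disable` simulates a cluster that fails once, and the `retry` helper is not part of CSBT. The per-cluster step is written to be idempotent so that repeating it is safe.

```python
def disable_table(zk, table, clusters, disable_in_cluster):
    zk[table] = "DISABLING"          # intermediate state marks work in progress
    for cluster in clusters:
        disable_in_cluster(cluster, table)
    zk[table] = "DISABLED"           # reached only if every cluster succeeded


def retry(operation, attempts=3):
    """Re-run the entire operation until it succeeds or attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            operation()
            return attempt
        except RuntimeError:
            if attempt == attempts:
                raise


failures_left = {"a02": 1}           # cluster a02 fails on its first call
disabled = set()


def flaky_disable(cluster, table):
    if failures_left.get(cluster, 0) > 0:
        failures_left[cluster] -= 1
        raise RuntimeError(f"cluster {cluster} unreachable")
    disabled.add((cluster, table))   # idempotent: adding twice is harmless


zk = {"T1": "ENABLED"}
attempts_used = retry(lambda: disable_table(zk, "T1", ["a01", "a02"], flaky_disable))
```

After the failed first attempt the state stays at DISABLING, which is what tells a later retry (or a repair tool) that the operation did not finish.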
Read/Write operations on CrossSiteHTable
Operations using CrossSiteHTable
• Operations like put/get/scan/delete are performed using CrossSiteHTable
• CrossSiteHTable extends HTable
Get/Put/Delete
• Get/Put/Delete “a01,row1” from table T1
[Diagram: CSBTHTable, Global ZK, Cluster a01 (table T1_a01), Cluster a02 (table T1_a02)]
1. Retrieve the cluster locator for table “T1” (cached).
2. Map “a01,row1” to cluster “a01”.
3. Find the address for cluster “a01” (cached).
4. Do get/put/delete(“a01,row1”) on table “T1_a01” in cluster “a01”.
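The routing steps above can be sketched as one function. It assumes a prefix-style locator with “,” as the delimiter; the `addresses` dict stands in for the cached cluster addresses from the Global ZK, and its host names are hypothetical.

```python
def route(table, row_key, addresses, delimiter=","):
    """Map a row key to (cluster, cluster address, partition table name)."""
    # Steps 1-2: apply the (prefix) locator to find the cluster name.
    cluster, sep, _ = row_key.partition(delimiter)
    if not sep or not cluster:
        raise ValueError(f"row key has no cluster prefix: {row_key!r}")
    # Step 3: look up the cluster address (a dict replaces the ZK cache).
    address = addresses[cluster]
    # Step 4: the operation targets the partition table <tableName>_<clustername>.
    return cluster, address, f"{table}_{cluster}"


addresses = {"a01": "zk-a01.example:2181", "a02": "zk-a02.example:2181"}
target = route("T1", "a01,row1", addresses)
```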
Scan with Start/Stop row
• New scan APIs are added that accept cluster names when creating scans
• Scan from table T1 [ start – “row1”, end – “row6” ], clusters – [cluster a01, cluster a02]
[Diagram: CSBTHTable, Global ZK, Cluster a01 (table T1_a01), Cluster a02 (table T1_a02), Cluster a03 (table T1_a03)]
1. Retrieve the cluster info for table “T1” (cached).
2. Find the addresses for clusters “a01” and “a02” (cached).
3. Scan from “a01,row1” to “a01,row6” on table “T1_a01” in cluster “a01”.
4. Scan from “a02,row1” to “a02,row6” on table “T1_a02” in cluster “a02”.
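How one logical scan fans out into per-cluster scans can be sketched as below. The function name and the tuple layout are hypothetical; the sketch assumes a prefix-style locator with “,” as the delimiter, as in the example above.

```python
def per_cluster_scans(table, start_row, stop_row, clusters, delimiter=","):
    """Return one (cluster, partition table, start key, stop key) per cluster."""
    return [
        (
            cluster,
            f"{table}_{cluster}",
            f"{cluster}{delimiter}{start_row}",
            f"{cluster}{delimiter}{stop_row}",
        )
        for cluster in clusters
    ]


scans = per_cluster_scans("T1", "row1", "row6", ["a01", "a02"])
for scan in scans:
    print(scan)
```

Cluster a03 is not listed in the scan, so no scan is issued against T1_a03.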
Scan with Hierarchy
• Scan from table T1 [ start – “row1”, end – “row6” ], clusters – [California]
– California – virtual node
– SFO, LA, San Diego – physical nodes
Scan
• Uses a merge sort iterator to merge the results from different clusters
[Diagram: the client reads from a merge(sort) iterator that draws from Cluster A, Cluster B, … Cluster Z (all clusters)]
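A merge iterator over already-sorted per-cluster results can be sketched with Python's standard `heapq.merge`. The slides do not state which key CSBT sorts by; ordering by the row part after the cluster prefix is an assumption made for this example, as are the sample row keys.

```python
import heapq


def row_part(row_key, delimiter=","):
    """Sort key (assumed): the row key without its cluster prefix."""
    return row_key.split(delimiter, 1)[1]


def merged_scan(*cluster_results):
    """Lazily merge sorted per-cluster result streams into one sorted stream."""
    return heapq.merge(*cluster_results, key=row_part)


cluster_a = ["a01,row1", "a01,row4"]   # each input is already sorted
cluster_b = ["a02,row2", "a02,row3"]
merged = list(merged_scan(cluster_a, cluster_b))
```

The merge is lazy, so the client only pulls the next row from each cluster as needed instead of buffering every result.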
Operations on CSBT
• The admin operations have shell and Thrift support.
• MapReduce is also supported for operations on CrossSiteBigTable.
Data Replication and FailOver
Data Center Relationship
• Allows data centers to add peers
• Apache HBase replication
– Asynchronous data replication
– Customized replication sink for CSBT
• Read-only failover
– Automatically redirects reads to the peer data center
• Existing data is not replicated when a peer is added dynamically.
Data Replication
[Diagram: CSBTHTable sends puts to each cluster, and each cluster replicates its table to a backup copy hosted in another cluster]
• Cluster “a01”: table T1_a01, plus T1_a02’ (backup)
• Cluster “a02”: table T1_a02, plus T1_a03’ (backup)
• Cluster “a03”: table T1_a03, plus T1_a01’ (backup)
Read-only Failover
[Diagram: same clusters and backup tables as under Data Replication; a get/scan from CSBTHTable fails over to the backup DC]
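Read-only failover can be sketched as a read that falls back to the backup copy when the primary cluster is unreachable. The function names and the use of `ConnectionError` as the failure signal are assumptions for illustration, not the CSBT API.

```python
def get_with_failover(row_key, primary_get, backup_get):
    """Try the owning cluster first; on a connection failure read the backup.
    Returns (value, source) so the caller can tell where the value came from."""
    try:
        return primary_get(row_key), "primary"
    except ConnectionError:
        # Replication is asynchronous, so the backup may lag behind.
        return backup_get(row_key), "backup"


backup_t1_a01 = {"a01,row1": "v1"}     # stand-in for T1_a01' on the peer


def unreachable_primary(row_key):
    raise ConnectionError("cluster a01 is down")


value, source = get_with_failover("a01,row1", unreachable_primary, backup_t1_a01.get)
```

Only reads fail over; writes still need the owning cluster.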
Future improvements
• Security – CSBT security and how user/group authentications interact
• MR improvement
– Currently the MR tasks run in one cluster, and all the result computation happens in that cluster.
– This could be improved by dispatching the tasks to each cluster and then collecting the results from them.
• Full-fledged CSBT HBCK.
Q & A

1.8 Data Protection.pdf
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
 
Cassandra no sql ecosystem
Cassandra no sql ecosystemCassandra no sql ecosystem
Cassandra no sql ecosystem
 
MemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks WebcastMemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks Webcast
 
Performence tuning
Performence tuningPerformence tuning
Performence tuning
 
Cassandra - Research Paper Overview
Cassandra - Research Paper OverviewCassandra - Research Paper Overview
Cassandra - Research Paper Overview
 

More from HBaseCon

hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on KubernetesHBaseCon
 
hbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beamhbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on BeamHBaseCon
 
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huaweihbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at HuaweiHBaseCon
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程HBaseCon
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Neteasehbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at NeteaseHBaseCon
 
hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践HBaseCon
 
hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台HBaseCon
 
hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comHBaseCon
 
hbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecturehbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architectureHBaseCon
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at HuaweiHBaseCon
 
hbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMihbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMiHBaseCon
 
hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0HBaseCon
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon
 
HBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon
 

More from HBaseCon (20)

hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
 
hbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beamhbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beam
 
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huaweihbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Neteasehbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Netease
 
hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践
 
hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台
 
hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.com
 
hbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecturehbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecture
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
 
hbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMihbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMi
 
hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBase
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
 
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
 
HBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBase
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBase
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at Didi
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase Client
 

Recently uploaded

Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 

Recently uploaded (20)

Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 

Cross-Site BigTable using HBase

  • 1. Cross-Site Big Table using HBase. Anoop Sam John, Du Jingcheng, Ramkrishna S. Vasudevan. Big Data US Research And Development, Intel
  • 2. Motivation
    Partitioning Rule
    • A rule that parses row keys and helps to map records to different clusters. ClusterLocator provides this facility, and the rule is recorded in the central ZK
      – PrefixClusterLocator
      – SuffixClusterLocator
      – …
    • An example of PrefixClusterLocator
      – If a row key is “clusterA,rowKey1”, then this record belongs to clusterA
  • 3. Motivation
    • Growing demand for storing data across geographically distributed data centers.
      – Data and data patterns are similar across data centers.
      – But the data is private to each data center.
    • Improve data availability and disaster recovery.
    • An easy way to access this distributed data.
    • Manage the hierarchy relationship between data centers (grouping of data centers).
  • 4. Use Case
  • 5. Intelligent Transportation System
    • Monitors traffic movements, traffic patterns, etc. in every city.
    • Data in every data center is private and holds the traffic pattern of that city.
    • Hierarchy of departments: National Transportation Department / State Transportation Department / City Transportation Department.
    • National/State Transportation Department – virtual node.
    • Helps to aggregate results/statistics over all the data centers.
    • Easy access to, and a single point of view of, all the data centers.
  • 6. Agenda
  • 7. Agenda
    • Goals of CSBT [1]
    • Architecture of CSBT
      – Highly Available Global Zookeeper Quorum
      – Cross-Site Metadata in Global Zookeeper
      – Cluster Locator
      – Hierarchy
    • Admin Operations on CSBT
    • Read/Write operations on CrossSiteHTable
    • Data Replication and FailOver
    • Future Improvements
    [1] CSBT refers to Cross-Site Big Table
  • 8. Goals
  • 9. Goals
    • A global view for tables across different data centers.
    • Define and manage the hierarchy relationship between data centers and data.
    • High availability
    • Locality
      – In terms of geography
      – Each data center holds its own data.
  • 10. Architecture
  • 12. Architecture
    • A dedicated, distributed Zookeeper quorum spanning the data centers – Global Zookeeper
    • Table partitioning
      – Each data center holds a specific partition of the table
      – Every partition of the Cross-site HTable is an HTable itself, bearing a table name “<tableName>_<clustername>”
      – The partitioning rule is set by the user at table creation time using Cluster Locators.
    • Supports all admin and table operations that are supported on the HTable.
    Apache ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services (Source: http://zookeeper.apache.org/)
  • 13. Architecture
    • Data center relationship
      – Allows each data center to configure its peer data center (for replication and failover in read)
        • Peers for a cluster could also be nodes in another cluster.
        • Master-Master, Master-slave replication etc.
      – Uses the asynchronous replication of Apache HBase
        • Asynchronously writes the WAL entries to the configured peers
      – Could define a hierarchy for the data centers
  • 14. Highly Available Global Zookeeper Quorum
    • Dedicated Zookeeper cluster.
    • Split the Zookeeper quorum across data centers.
    • It is recommended not to use the Zookeeper cluster used by the individual HBase setups.
    • Leverage the Zookeeper Observer
      – Does not impact the Zookeeper write performance
      – Configure in such a way that ‘reads’ are served locally
      – Configure the Zookeeper quorum as <local observers>,<leader/followers>,<observers in other DCs>
    Observers are non-voting members of an ensemble which only hear the results of votes. Observers may be used to talk to an Apache ZooKeeper server from another data center. Clients of the Observer will see fast reads, as all reads are served locally, and writes result in minimal network traffic as the number of messages required in the absence of the vote protocol is smaller (Source: http://zookeeper.apache.org/)
  • 15. Cross-Site Metadata in Global Zookeeper
    [Figure: znode tree rooted at CrossSite, with a clusters subtree (cluster1, cluster2, cluster3) and a tables subtree (table1, table2, table3); znode labels shown: address, hierarchy, peers, state, splitkeys, desc, proposed_desc, locator]
  • 16. Cluster Locator
    • Sets the data partition logic/rule.
    • Helps to locate a specific cluster based on the row key.
    • Users are allowed to create their own cluster locators
      – PrefixClusterLocator – <clustername>,<row> where “,” is the delimiter
      – SuffixClusterLocator – <row>,<xxx>,<yyy>,<clustername> where “,” is the delimiter and the cluster name is always the string that appears after the last occurrence of the delimiter.
    • Note that it is up to the user to specify the cluster name in the row key while doing ‘puts’, based on the cluster locator configured at table creation.
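The two locator rules above can be sketched in a few lines. This is only an illustration of the parsing rule in Python, not the CSBT Java API; the function names `prefix_cluster` and `suffix_cluster` are hypothetical.

```python
def prefix_cluster(row_key: str, delimiter: str = ",") -> str:
    """Hypothetical helper: cluster name from a '<clustername>,<row>' key."""
    cluster, sep, _row = row_key.partition(delimiter)
    if not sep or not cluster:
        raise ValueError(f"no cluster prefix in row key: {row_key!r}")
    return cluster


def suffix_cluster(row_key: str, delimiter: str = ",") -> str:
    """Hypothetical helper: cluster name is whatever follows the last delimiter."""
    _row, sep, cluster = row_key.rpartition(delimiter)
    if not sep or not cluster:
        raise ValueError(f"no cluster suffix in row key: {row_key!r}")
    return cluster


print(prefix_cluster("clusterA,rowKey1"))      # clusterA
print(suffix_cluster("row,xxx,yyy,clusterB"))  # clusterB
```

A key without the delimiter is rejected here rather than routed to a default cluster; the slides do not say how the real locators treat such keys.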
  • 18. Hierarchy
    • Could define the parent-child relationship for clusters
    • A node in the hierarchy could be either a physical cluster or a virtual one. A virtual node may represent a logical grouping of a set of physical clusters
    • The hierarchy is used while scanning
      – If a parent node is specified, all its descendants are also included
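As a rough sketch of how a parent node could expand to its descendants, the Python below walks a child map and keeps only physical clusters. The data structures and the name `physical_clusters` are assumptions for illustration; the node names reuse the California example from the hierarchy scan slide.

```python
def physical_clusters(node, children, physical):
    """Collect every physical cluster at or below `node` (hypothetical helper)."""
    found = []
    stack = [node]
    while stack:
        current = stack.pop()
        if current in physical:
            found.append(current)
        # Virtual nodes contribute only through their descendants.
        stack.extend(children.get(current, []))
    return sorted(found)


# Assumed layout: one virtual node grouping three physical clusters.
children = {"California": ["SFO", "LA", "San Diego"]}
physical = {"SFO", "LA", "San Diego"}

print(physical_clusters("California", children, physical))
```

Passing a physical cluster name returns just that cluster, so the same call works for both kinds of node.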
  • 19. Admin operations on Cross-site BigTable
  • 20. Admin Operations
    • Operations are performed using CrossSiteHBaseAdmin
    • Extends HBaseAdmin.
  • 21. Create Peers
    • Specifying peers creates the peers under the ‘peers’ node
    • The address of each peer is written as data in the peer znodes
    [Figure: znode tree with nodes cluster1, peers, cluster2, cluster3, cluster4]
  • 22. Create Table
    [Figure: CSBTAdmin, Global ZK, Cluster a01 (Table: T1_a01), Cluster a02 (Table: T1_a02), Peer1 (Table: T1_a01, backup), Peer2 (Table: T1_a02, backup); Cluster a01->Peer1, Cluster a02->Peer2]
    1. Create the table znode in ZK
    2. Create tables in clusters
    3. Create table in peers, if any
    4. Write the table-related data in the table’s znode and update the state in ZK
  • 23. Disable Table
    [Figure: CSBTAdmin, Global ZK, Cluster a01 (Table: T1_a01), Cluster a02 (Table: T1_a02), Peer1 (Table: T1_a01, backup), Peer2 (Table: T1_a02, backup); Cluster a01->Peer1, Cluster a02->Peer2]
    1. Update the state to DISABLING
    2. Disable tables in clusters
    3. Update the state to DISABLED
    • Do NOT disable the tables in the peers. Because replication is asynchronous, disabling a peer may stop the entire replication, and some unfinished WALs may still be waiting to be replicated
  • 24. Enable Table
    [Figure: CSBTAdmin, Global ZK, Cluster a01 (Table: T1_a01), Cluster a02 (Table: T1_a02), Peer1 (Table: T1_a01, backup), Peer2 (Table: T1_a02, backup); Cluster a01->Peer1, Cluster a02->Peer2]
    1. Update the state to ENABLING
    2. Enable tables in clusters
    3. Handle TableNotDisabledException, as the peers are already ENABLED
    4. Update the state to ENABLED
  • 25. Alter Schema
    [Figure: CSBTAdmin, Global ZK, Cluster a01 (Table: T1_a01), Cluster a02 (Table: T1_a02), Peer1 (Table: T1_a01, backup), Peer2 (Table: T1_a02, backup); Cluster a01->Peer1, Cluster a02->Peer2]
    1. Write the new HTD to PROPOSED_DESC znode
    2. Update the state to MODIFYING/ADDING xxx state
    3. Alter schema in clusters
    3. Update the table’s HTD znode
    4. Add/Modify column in peers by DISABLING. ENABLE after completion. If the table is not present, create the table with the new HTD.
    4. Update table state to DISABLED
    5. Delete the PROPOSED_DESC znode
  • 26. Delete Table
    [Figure: CSBTAdmin, Global ZK, Cluster a01 (Table: T1_a01), Cluster a02 (Table: T1_a02), Peer1 (Table: T1_a01, backup), Peer2 (Table: T1_a02, backup); Cluster a01->Peer1, Cluster a02->Peer2]
    1. Update the state to DELETING
    2. Delete tables in clusters
    3. Disable and delete the tables from the peers
    4. Remove the table from ZK
  • 27. Failure handling
    • Failures in the create/enable/disable/delete table operations are handled by using ZK states. On any failure, the entire operation has to be retried.
    • A tool helps to deduce and auto-correct inconsistencies in the CSBT cluster in terms of table state.
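The retry-the-whole-operation idea can be illustrated with the disable flow. The sketch below is Python with an in-memory dict standing in for the table state kept in the global ZK; `disable_table`, `retry`, and the `disable_in_cluster` callback are hypothetical names, and the sketch assumes the per-cluster disable call is safe to repeat.

```python
def disable_table(table, clusters, zk_state, disable_in_cluster):
    """Run the disable steps; on failure the state stays DISABLING."""
    zk_state[table] = "DISABLING"
    for cluster in clusters:
        # May raise; the transitional state then marks the operation unfinished.
        disable_in_cluster(cluster, f"{table}_{cluster}")
    zk_state[table] = "DISABLED"


def retry(operation, attempts=3):
    """Re-run the entire operation from its first step after a failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except RuntimeError as error:
            last_error = error
    raise last_error
```

Because the retry restarts from the first step, clusters that already succeeded are asked again, which is why the per-cluster call is assumed to be repeatable.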
  • 28. Read/Write operations on CrossSiteHTable
  • 29. Operations using CrossSiteHTable
    • Operations like put/get/scan/delete are performed using CrossSiteHTable
    • Extends HTable
  • 30. Get/Put/Delete
    • Get/Put/Delete “a01,row1” from table T1
    [Figure: CSBTHTable, Global ZK, Cluster a01 (Table: T1_a01), Cluster a02 (Table: T1_a02)]
    1. Retrieve the cluster locator for table “T1” (cached)
    2. Map “a01,row1” to cluster “a01”
    3. Find the address for cluster “a01” (cached)
    4. Do get/put/delete(“a01,row1”) on table “T1_a01” from cluster “a01”
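The four routing steps can be sketched as a single lookup. This Python sketch assumes a prefix locator and a plain dict of cluster addresses; the function name `route` and the address strings are made up for illustration and are not part of CSBT.

```python
def route(table, row_key, addresses, delimiter=","):
    """Return (cluster address, per-cluster table name) for a row key."""
    # Step 2: a prefix locator maps the row key to its cluster.
    cluster = row_key.partition(delimiter)[0]
    # Step 3: look up the cluster address (cached from the global ZK in CSBT).
    if cluster not in addresses:
        raise KeyError(f"unknown cluster {cluster!r} for row key {row_key!r}")
    # Step 4: the operation targets the partition table "<tableName>_<clustername>".
    return addresses[cluster], f"{table}_{cluster}"


# Hypothetical addresses, standing in for what the global ZK would hold.
addresses = {"a01": "zk-a01.example:2181", "a02": "zk-a02.example:2181"}
print(route("T1", "a01,row1", addresses))
```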
  • 31. Scan with Start/Stop row
    • New scan APIs added where cluster names could be passed while creating scans
    • Scan from table T1 [start – “row1”, end – “row6”], clusters – [cluster a01, cluster a02]
    [Figure: CSBTHTable, Global ZK, Cluster a01 (Table: T1_a01), Cluster a02 (Table: T1_a02), Cluster a03 (Table: T1_a03)]
    1. Retrieve the cluster info for table “T1” (cached)
    2. Find the addresses for clusters “a01” and “a02” (cached)
    3. Scan from (“a01,row1”) to (“a01,row6”) on table “T1_a01” from cluster “a01”
    4. Scan from (“a02,row1”) to (“a02,row6”) on table “T1_a02” from cluster “a02”
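To make the fan-out concrete, the sketch below expands one logical scan into one range per requested cluster, assuming a prefix locator with “,” as the delimiter. The function name `cluster_scans` and its tuple layout are hypothetical, not the CSBT scan API.

```python
def cluster_scans(table, start_row, stop_row, clusters, delimiter=","):
    """Return (partition table, start key, stop key) for each cluster."""
    return [
        (
            f"{table}_{cluster}",
            f"{cluster}{delimiter}{start_row}",
            f"{cluster}{delimiter}{stop_row}",
        )
        for cluster in clusters
    ]


for scan in cluster_scans("T1", "row1", "row6", ["a01", "a02"]):
    print(scan)
```

Cluster a03 is never contacted in this example because it was not named in the scan.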
  • 32. Scan with Hierarchy
    • Scan from table T1 [start – “row1”, end – “row6”], clusters – [California]
    • California – virtual node
    • SFO, LA, San Diego – physical nodes
  • 33. Scan
    • Uses a merge sort iterator to merge the results from different clusters
    [Figure: Client reading from a Merge(sort) Iterator over all clusters: Cluster A, Cluster B, …, Cluster Z]
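A merge iterator over per-cluster results can be sketched with Python's standard `heapq.merge`, which lazily merges inputs that are already sorted. The slides do not state the sort key; this sketch assumes each cluster returns (row key, value) pairs sorted by row key and merges on that key. The sample keys use the suffix layout so that results from different clusters interleave.

```python
import heapq


def merged_scan(*cluster_results):
    """Lazily merge already-sorted (row_key, value) streams by row key."""
    return heapq.merge(*cluster_results, key=lambda pair: pair[0])


# Hypothetical per-cluster scan results, each sorted by row key.
from_a01 = [("row1,a01", "x"), ("row3,a01", "y")]
from_a02 = [("row2,a02", "z")]

for row_key, value in merged_scan(from_a01, from_a02):
    print(row_key, value)
```

Because the merge is lazy, the client only pulls the next result from each cluster as it is needed, rather than collecting every cluster's full result first.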
  • 34. Operations on CSBT
    • The admin operations have shell and thrift support.
    • MapReduce is also supported for operations on CrossSiteBigTable.
  • 35. Data Replication and FailOver
  • 36. Data Center Relationship
    • Allows data centers to add peers
    • Apache HBase replication
      – Asynchronous data replication
      – Customized replication sink for CSBT
    • Read-only failover
      – Automatically redirects reads to the peer data center
    • Existing data is not replicated when a peer is added dynamically.
  • 37. Data Replication
    [Figure: CSBTHTable puts to three clusters, each of which replicates to a backup table on another cluster. Cluster “a01”: T1_a01, T1_a02’ (backup). Cluster “a02”: T1_a02, T1_a03’ (backup). Cluster “a03”: T1_a03, T1_a01’ (backup).]
  • 38. Read-only Failover
    [Figure: CSBTHTable get/scan fails over to the backup DC. Cluster “a01”: T1_a01, T1_a02’ (backup). Cluster “a02”: T1_a02, T1_a03’ (backup). Cluster “a03”: T1_a03, T1_a01’ (backup).]
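Read-only failover can be pictured as a read that falls back to the backup copy when the primary cluster cannot be reached. The Python sketch below is an assumption-laden illustration: `read_with_failover` and the two reader callbacks are hypothetical, and `ConnectionError` stands in for whatever error the real client sees. Writes are deliberately not covered, since the slides describe failover for reads only.

```python
def read_with_failover(row_key, read_primary, read_backup):
    """Try the primary cluster first; redirect the read to the peer on failure."""
    try:
        return read_primary(row_key)
    except ConnectionError:
        # The backup is fed by asynchronous replication, so it may lag behind.
        return read_backup(row_key)
```

Since replication is asynchronous, a read served by the backup may miss the most recent writes to the primary.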
  • 39. Future improvements
  • 40. Future improvements
    • Security – CSBT security and how user/group authentications interact
    • MR improvement – Currently the MR tasks run in one cluster and all the result computation happens in one cluster. We could improve this by dispatching the task to each cluster and then collecting the results from them.
    • Full-fledged CSBT HBCK.
  • 41. Q & A