SlideShare a Scribd company logo
Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
HBase Read High Availability Using
Timeline-Consistent Region Replicas
Enis Soztutar (enis@hortonworks.com)
Devaraj Das (ddas@hortonworks.com)
June 2014
Page2 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
About Us
Enis Soztutar
Committer and PMC member in Apache
HBase and Hadoop since 2007
HBase team @Hortonworks
Twitter @enissoz
Devaraj Das
Committer and PMC member in
Hadoop since 2006
Committer at HBase
Co-founder @Hortonworks
Twitter @ddraj
Page3 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Outline of the talk
PART I: Use case and semantics
 CAP recap
 Use case and motivation
 Region replicas
 Timeline consistency
 Semantics
PART II : Implementation and next steps
 Server side
 Client side
 Data replication
 Next steps & Summary
Page4 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Part I
Use case and semantics
Page5 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
CAP reCAP
Partition tolerance
Consistency Availability
Pick Two
HBase is CP
Page6 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Availability
CAP reCAP
• In a distributed system you cannot NOT have P
• C vs A is about what happens if there is a network
partition!
• A an C are NEVER binary values, always a range
• Different operations in the system can have
different A / C choices
• HBase cannot be simplified as CP
Partition tolerance
Consistency
Pick Two
HBase is CP
Page7 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
HBase consistency model
For a single row, HBase is strongly consistent within a data center
Across rows HBase is not strongly consistent (but available!).
When a RS goes down, only the regions on that server become
unavailable. Other regions are unaffected.
HBase multi-DC replication is “eventual consistent”
HBase applications should carefully design the schema for correct
semantics / performance tradeoff
Page8 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Use cases and motivation
More and more applications are looking for a “0 down time” platform
 30 seconds downtime (aggressive MTTR time) is too much
Certain classes of apps are willing to tolerate decreased consistency
guarantees in favor of availability
 Especially for READs
Some build wrappers around the native API to be able to handle failures of
destination servers
 Multi-DC: when one server is down in one DC, the client switches to a different one
Can we do something in HBase natively?
 Within the same cluster?
Page9 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Use cases and motivation
Designing the application requires careful tradeoff consideration
 In schema design since single-row is strong consistent, but no multi-row trx
 Multi-datacenter replication (active-passive, active-active, backups etc)
It is good to be able to give the application flexibility to pick-and-choose
 Higher availability vs stronger consistency
Read vs Write
 Different consistency models for read vs write
 Read-repair, latest ts-wins vs linearizable updates
Page10 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Initial goals
Support applications talking to a single cluster really well
 No perceived downtime
 Only for READs
If apps wants to tolerate cluster failures
 Use HBase replication
 Combine that with wrappers in the application
Page11 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Introducing….
Region Replicas in HBase
Timeline Consistency in HBase
Page12 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Region replicas
For every region of the table, there can be more than one replica
 Every region replica has an associated “replica_id”, starting from 0
 Each region replica is hosted by a different region server
Tables can be configured with a REGION_REPLICATION parameter
 Default is 1
 No change in the current behavior
One replica per region is the “default” or “primary”
 Only this can accepts WRITEs
 All reads from this region replica return the most recent data
Other replicas, also called “secondaries” follow the primary
 They see only committed updates
Page13 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Region replicas
Secondary region replicas are read-only
 No writes are routed to secondary replicas
 Data is replicated to secondary regions (more on this later)
 Serve data from the same data files are primary
 May not have received the recent data
 Reads and Scans can be performed, returning possibly stale data
Region replica placement is done to maximize availability of any particular
region
 Region replicas are not co-located on same region servers
 And same racks (if possible)
Page14 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
rowkey column:value column:value …
RegionServer
Region
memstore
DataNode
b2
b9 b1
DataNode
b2
b1
DataNode
b1
Client
Read and write
RegionServer
Page15 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Page 15
rowkey column:value column:value …
RegionServer
Region
DataNode
b2
b9 b1
DataNode
b2
b1
DataNode
b1
Client
Read and write
memstore
RegionServer
rowkey column:value column:value …
memstore
Region replica
Read only
Page16 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency
Introduced a Consistency enum
 STRONG
 TIMELINE
Consistency.STRONG is default
Consistency can be set per read operation (per-get or per-scan)
Timeline-consistent read RPCs sent to more than one replica
Semantics is a bit different than Eventual Consistency model
Page17 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency
public enum Consistency {
STRONG,
TIMELINE
}
Get get = new Get(row);
get.setConsistency(Consistency.TIMELINE);
...
Result result = table.get(get);
…
if (result.isStale()) {
...
}
Page18 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Semantics
Can be though of as in-cluster active-passive replication
Single homed and ordered updates
 All writes are handled and ordered by the primary region
 All writes are STRONG consistency
Secondaries apply the mutations in order
Only get/scan requests to secondaries
Get/Scan Result can be inspected to see whether the result was from
possibly stale data
Page19 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
Client1
X=1
Client2
WAL
Data:
Replica_id=0 (primary)
Replica_id=1
Replica_id=2
replication
replication
X=3
WAL
Data:
WAL
Data:
X=1X=1Write
Page20 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
Client1
X=1
Client2
WAL
Data:
Replica_id=0 (primary)
Replica_id=1
Replica_id=2
replication
replication
X=3
WAL
Data:
WAL
Data:
X=1
X=1
X=1
X=1
X=1
X=1Read
X=1Read
X=1Read
Page21 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
Client1
X=1
Client2
WAL
Data:
Replica_id=0 (primary)
Replica_id=1
Replica_id=2
replication
replication
WAL
Data:
WAL
Data:
Write
X=1
X=1
X=2 X=2
X=2
Page22 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
Client1
X=1
Client2
WAL
Data:
Replica_id=0 (primary)
Replica_id=1
Replica_id=2
replication
replication
WAL
Data:
WAL
Data:
X=2
X=1
X=2
X=2
X=2
X=2Read
X=2Read
X=1Read
Page23 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
Client1
X=1
Client2
WAL
Data:
Replica_id=0 (primary)
Replica_id=1
Replica_id=2
replication
replication
WAL
Data:
WAL
Data:
X=2
X=1
X=3
X=2
Write X=3
X=3
Page24 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
Client1
X=1
Client2
WAL
Data:
Replica_id=0 (primary)
Replica_id=1
Replica_id=2
replication
replication
WAL
Data:
WAL
Data:
X=2
X=1
X=3
X=2 X=3
X=3Read
X=2Read
X=1Read
Page25 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
PART II
Implementation and next steps
Page26 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Region replicas – recap
Every region replica has an associated “replica_id”, starting from 0
Each region replica is hosted by a different region server
 All replicas can serve READs
One replica per region is the “default” or “primary”
 Only this can accepts WRITEs
 All reads from this region replica return the most recent data
Page27 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Updates in the Master
Replica creation
 Created during table creation
No distinction between primary & secondary replicas
Meta table contain all information in one row
Load balancer improvements
 LB made aware of replicas
 Does best effort to place replicas in machines/racks to maximize availability
Alter table support
 For adjusting number of replicas
Page28 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Data Replication
Data should be replicated from primary regions to secondary regions
Data files MUST be shared. We do not want to store multiple copies
Do not cause more writes than necessary
Two solutions:
 Region snapshots : Share only data files
 Async WAL Replication : Share data files, every region replica has its own in-memory data
Page29 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Data Replication – Region Snapshots
Primary region works as usual
 Buffer up mutations in memstore
 Flush to disk when full
 Compact files when needed
 Deleted files are kept in archive directory for some time
Secondary regions
 Are opened in read-only mode
 Periodically look for new files in primary region
– When a new flushed file is seen, just open it and start serving data from there
– When a compaction is seen, open new file, close the files that are gone
 Good for read-only, bulk load data or less frequently updated data
RegionServerClient Send
mutation
Memstore
Memstore
Memstore
Memstore
Storefile
Flush
Page30 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Updates in the RegionServer
Treats secondary replicas as read-only
Storefile management
 Keeps itself up-to-date with the changes to do with store file creation/deletions
Page31 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
IPC layer high level flow
Client
YES
Response within
timeout (10 millis)?
NO Send READ to all
secondaries
Send READ to primary
Wait for response
Wait for response
Take the first
successful response;
cancel others
Similar flow for GET/Batch-
GET/Scan, except that Scan is
sticky to the server it sees
success from.
Page32 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Performance and Testing
No significant performance issues discovered
 Added interrupt handling in the RPCs to cancel unneeded replica RPCs
Deeper level of performance testing work is still in progress
Tested with ChaosMonkey tests
 fails if response is not received within a certain time
Page33 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Summary
Pros
 High-availability for read-only tables
 High-availability for stale reads
 Very low-latency for the above
 [No extra copies for the Storefiles]
Cons
 Increased blockcache usage
 Extra network traffic for the replica calls
 Increased number of regions to manage in the cluster
Page34 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Next steps
What has been described so far is in “Phase-1” of the project
Phase-2
 WAL replication
– Will reduce the updates-seen lag drastically
 Handling of Merges and Splits
 Latency guarantees
– Cancellation of RPCs server side
– Promotion of one Secondary to Primary, and recruiting a new Secondary
Use the infrastructure to implement consensus protocols for read/write
within a single datacenter
RegionServerClient
Send
mutation
WAL
[Critical for
recovery]
MemstoreMemstoreMemstoreMemstore
Edit for region1
Edit for region1
Edit for region2
Edit for region3
Edit for region10
……
Page35 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
References
Apache branch hbase-10070 (https://github.com/apache/hbase/tree/hbase-
10070)
Merge to mainline trunk is under discussion in the community
HDP-2.1 comes with experimental support for Phase-1
Page36 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Thanks
Q & A

More Related Content

What's hot

Linux Block Cache Practice on Ceph BlueStore - Junxin Zhang
Linux Block Cache Practice on Ceph BlueStore - Junxin ZhangLinux Block Cache Practice on Ceph BlueStore - Junxin Zhang
Linux Block Cache Practice on Ceph BlueStore - Junxin Zhang
Ceph Community
 
Redefining tables online without surprises
Redefining tables online without surprisesRedefining tables online without surprises
Redefining tables online without surprises
Nelson Calero
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
Yoshinori Matsunobu
 
MyRocks introduction and production deployment
MyRocks introduction and production deploymentMyRocks introduction and production deployment
MyRocks introduction and production deployment
Yoshinori Matsunobu
 
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureCeph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Danielle Womboldt
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
Sage Weil
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
Cloudera, Inc.
 
Less07 storage
Less07 storageLess07 storage
Less07 storage
Amit Bhalla
 
Storage as a service and OpenStack Cinder
Storage as a service and OpenStack CinderStorage as a service and OpenStack Cinder
Storage as a service and OpenStack Cinder
openstackindia
 
Transparent Hugepages in RHEL 6
Transparent Hugepages in RHEL 6 Transparent Hugepages in RHEL 6
Transparent Hugepages in RHEL 6
Raghu Udiyar
 
Vce vxrail-customer-presentation new
Vce vxrail-customer-presentation newVce vxrail-customer-presentation new
Vce vxrail-customer-presentation new
Jennifer Graham
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase
强 王
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
Sage Weil
 
Ceph - A distributed storage system
Ceph - A distributed storage systemCeph - A distributed storage system
Ceph - A distributed storage system
Italo Santos
 
BlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year InBlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year In
Sage Weil
 
Db2 hadr commands
Db2 hadr commandsDb2 hadr commands
Db2 hadr commands
vnanaji
 
[カタログ]Veeam backup & replication エディション比較
[カタログ]Veeam backup & replication エディション比較[カタログ]Veeam backup & replication エディション比較
[カタログ]Veeam backup & replication エディション比較
株式会社クライム
 
하둡 맵리듀스 훑어보기
하둡 맵리듀스 훑어보기하둡 맵리듀스 훑어보기
하둡 맵리듀스 훑어보기
beom kyun choi
 
Oracle db performance tuning
Oracle db performance tuningOracle db performance tuning
Oracle db performance tuning
Simon Huang
 
Introduction to Linux Kernel
Introduction to Linux KernelIntroduction to Linux Kernel
Introduction to Linux Kernel
Stryker King
 

What's hot (20)

Linux Block Cache Practice on Ceph BlueStore - Junxin Zhang
Linux Block Cache Practice on Ceph BlueStore - Junxin ZhangLinux Block Cache Practice on Ceph BlueStore - Junxin Zhang
Linux Block Cache Practice on Ceph BlueStore - Junxin Zhang
 
Redefining tables online without surprises
Redefining tables online without surprisesRedefining tables online without surprises
Redefining tables online without surprises
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
MyRocks introduction and production deployment
MyRocks introduction and production deploymentMyRocks introduction and production deployment
MyRocks introduction and production deployment
 
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureCeph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
 
Less07 storage
Less07 storageLess07 storage
Less07 storage
 
Storage as a service and OpenStack Cinder
Storage as a service and OpenStack CinderStorage as a service and OpenStack Cinder
Storage as a service and OpenStack Cinder
 
Transparent Hugepages in RHEL 6
Transparent Hugepages in RHEL 6 Transparent Hugepages in RHEL 6
Transparent Hugepages in RHEL 6
 
Vce vxrail-customer-presentation new
Vce vxrail-customer-presentation newVce vxrail-customer-presentation new
Vce vxrail-customer-presentation new
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
 
Ceph - A distributed storage system
Ceph - A distributed storage systemCeph - A distributed storage system
Ceph - A distributed storage system
 
BlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year InBlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year In
 
Db2 hadr commands
Db2 hadr commandsDb2 hadr commands
Db2 hadr commands
 
[カタログ]Veeam backup & replication エディション比較
[カタログ]Veeam backup & replication エディション比較[カタログ]Veeam backup & replication エディション比較
[カタログ]Veeam backup & replication エディション比較
 
하둡 맵리듀스 훑어보기
하둡 맵리듀스 훑어보기하둡 맵리듀스 훑어보기
하둡 맵리듀스 훑어보기
 
Oracle db performance tuning
Oracle db performance tuningOracle db performance tuning
Oracle db performance tuning
 
Introduction to Linux Kernel
Introduction to Linux KernelIntroduction to Linux Kernel
Introduction to Linux Kernel
 

Viewers also liked

The Future of Hbase
The Future of HbaseThe Future of Hbase
The Future of Hbase
Salesforce Engineering
 
Apache hbase overview (20160427)
Apache hbase overview (20160427)Apache hbase overview (20160427)
Apache hbase overview (20160427)
Steve Min
 
Apache phoenix: Past, Present and Future of SQL over HBAse
Apache phoenix: Past, Present and Future of SQL over HBAseApache phoenix: Past, Present and Future of SQL over HBAse
Apache phoenix: Past, Present and Future of SQL over HBAse
enissoz
 
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
DataWorks Summit/Hadoop Summit
 
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Josh Elser
 
HBase Storage Internals
HBase Storage InternalsHBase Storage Internals
HBase Storage Internals
DataWorks Summit
 
Intro to HBase Internals & Schema Design (for HBase users)
Intro to HBase Internals & Schema Design (for HBase users)Intro to HBase Internals & Schema Design (for HBase users)
Intro to HBase Internals & Schema Design (for HBase users)
alexbaranau
 
Kelas tahun 5 rajin 2014 for merge
Kelas tahun 5 rajin  2014   for mergeKelas tahun 5 rajin  2014   for merge
Kelas tahun 5 rajin 2014 for merge
Siti Norwati
 
Who Watches the Watchmen - Arup Chakrabarti, PagerDuty - DevOpsDays Tel Aviv ...
Who Watches the Watchmen - Arup Chakrabarti, PagerDuty - DevOpsDays Tel Aviv ...Who Watches the Watchmen - Arup Chakrabarti, PagerDuty - DevOpsDays Tel Aviv ...
Who Watches the Watchmen - Arup Chakrabarti, PagerDuty - DevOpsDays Tel Aviv ...
DevOpsDays Tel Aviv
 
Romanos 10 palavra
Romanos 10   palavraRomanos 10   palavra
Romanos 10 palavra
cantandocomahistoria
 
Evaluation
EvaluationEvaluation
Evaluation
Huntwah
 
Practica normalizacion
Practica normalizacionPractica normalizacion
Practica normalizacion
info162
 
Informativo de janeiro
Informativo de janeiroInformativo de janeiro
Informativo de janeiro
Lua Barros
 
JRuby on Rails Deployment: What They Didn't Tell You
JRuby on Rails Deployment: What They Didn't Tell YouJRuby on Rails Deployment: What They Didn't Tell You
JRuby on Rails Deployment: What They Didn't Tell You
elliando dias
 
Must taitung team
Must taitung teamMust taitung team
Must taitung team
MUSTHoover
 
Making a moodboard
Making a moodboardMaking a moodboard
Making a moodboard
chowders
 
รูปพื้นที่ผิว
รูปพื้นที่ผิวรูปพื้นที่ผิว
รูปพื้นที่ผิวKrueed Huaybong
 
Transactions introduction
Transactions introductionTransactions introduction
Ecommerce Trends to Watch Out for in 2015
Ecommerce Trends to Watch Out for in 2015Ecommerce Trends to Watch Out for in 2015
Ecommerce Trends to Watch Out for in 2015
Synotive - Software Development Company
 

Viewers also liked (20)

The Future of Hbase
The Future of HbaseThe Future of Hbase
The Future of Hbase
 
Apache hbase overview (20160427)
Apache hbase overview (20160427)Apache hbase overview (20160427)
Apache hbase overview (20160427)
 
Apache phoenix: Past, Present and Future of SQL over HBAse
Apache phoenix: Past, Present and Future of SQL over HBAseApache phoenix: Past, Present and Future of SQL over HBAse
Apache phoenix: Past, Present and Future of SQL over HBAse
 
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
 
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
 
HBase Storage Internals
HBase Storage InternalsHBase Storage Internals
HBase Storage Internals
 
Intro to HBase Internals & Schema Design (for HBase users)
Intro to HBase Internals & Schema Design (for HBase users)Intro to HBase Internals & Schema Design (for HBase users)
Intro to HBase Internals & Schema Design (for HBase users)
 
Kelas tahun 5 rajin 2014 for merge
Kelas tahun 5 rajin  2014   for mergeKelas tahun 5 rajin  2014   for merge
Kelas tahun 5 rajin 2014 for merge
 
Who Watches the Watchmen - Arup Chakrabarti, PagerDuty - DevOpsDays Tel Aviv ...
Who Watches the Watchmen - Arup Chakrabarti, PagerDuty - DevOpsDays Tel Aviv ...Who Watches the Watchmen - Arup Chakrabarti, PagerDuty - DevOpsDays Tel Aviv ...
Who Watches the Watchmen - Arup Chakrabarti, PagerDuty - DevOpsDays Tel Aviv ...
 
Romanos 10 palavra
Romanos 10   palavraRomanos 10   palavra
Romanos 10 palavra
 
Evaluation
EvaluationEvaluation
Evaluation
 
Practica normalizacion
Practica normalizacionPractica normalizacion
Practica normalizacion
 
Informativo de janeiro
Informativo de janeiroInformativo de janeiro
Informativo de janeiro
 
JRuby on Rails Deployment: What They Didn't Tell You
JRuby on Rails Deployment: What They Didn't Tell YouJRuby on Rails Deployment: What They Didn't Tell You
JRuby on Rails Deployment: What They Didn't Tell You
 
Must taitung team
Must taitung teamMust taitung team
Must taitung team
 
Making a moodboard
Making a moodboardMaking a moodboard
Making a moodboard
 
รูปพื้นที่ผิว
รูปพื้นที่ผิวรูปพื้นที่ผิว
รูปพื้นที่ผิว
 
Transactions introduction
Transactions introductionTransactions introduction
Transactions introduction
 
Ecommerce Trends to Watch Out for in 2015
Ecommerce Trends to Watch Out for in 2015Ecommerce Trends to Watch Out for in 2015
Ecommerce Trends to Watch Out for in 2015
 
Sensoplan
SensoplanSensoplan
Sensoplan
 

Similar to HBase Read High Availabilty using Timeline Consistent Region Replicas

HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicas
enissoz
 
HBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region ReplicasHBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region Replicas
HBaseCon
 
Apache Phoenix + Apache HBase
Apache Phoenix + Apache HBaseApache Phoenix + Apache HBase
Apache Phoenix + Apache HBase
DataWorks Summit/Hadoop Summit
 
Hadoop 3 in a Nutshell
Hadoop 3 in a NutshellHadoop 3 in a Nutshell
Hadoop 3 in a Nutshell
DataWorks Summit/Hadoop Summit
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
DataWorks Summit/Hadoop Summit
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
DataWorks Summit/Hadoop Summit
 
Apache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandApache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to Understand
Josh Elser
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
DataWorks Summit
 
Hadoop & cloud storage object store integration in production (final)
Hadoop & cloud storage  object store integration in production (final)Hadoop & cloud storage  object store integration in production (final)
Hadoop & cloud storage object store integration in production (final)
Chris Nauroth
 
Hadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in ProductionHadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in Production
DataWorks Summit/Hadoop Summit
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community Update
DataWorks Summit
 
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive WarehouseDisaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
Sankar H
 
Hadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in ProductionHadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in Production
DataWorks Summit/Hadoop Summit
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
DataWorks Summit
 
Running Services on YARN
Running Services on YARNRunning Services on YARN
Running Services on YARN
DataWorks Summit/Hadoop Summit
 
Cloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsCloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerations
DataWorks Summit
 
FOD Paris Meetup - Global Data Management with DataPlane Services (DPS)
FOD Paris Meetup -  Global Data Management with DataPlane Services (DPS)FOD Paris Meetup -  Global Data Management with DataPlane Services (DPS)
FOD Paris Meetup - Global Data Management with DataPlane Services (DPS)
Abdelkrim Hadjidj
 
Big data spain keynote nov 2016
Big data spain keynote nov 2016Big data spain keynote nov 2016
Big data spain keynote nov 2016
alanfgates
 
Hadoop and HBase in the Real World
Hadoop and HBase in the Real WorldHadoop and HBase in the Real World
Hadoop and HBase in the Real World
Cloudera, Inc.
 
Deep learning on yarn running distributed tensorflow etc on hadoop cluster v3
Deep learning on yarn  running distributed tensorflow etc on hadoop cluster v3Deep learning on yarn  running distributed tensorflow etc on hadoop cluster v3
Deep learning on yarn running distributed tensorflow etc on hadoop cluster v3
DataWorks Summit
 

Similar to HBase Read High Availabilty using Timeline Consistent Region Replicas (20)

HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicas
 
HBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region ReplicasHBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region Replicas
 
Apache Phoenix + Apache HBase
Apache Phoenix + Apache HBaseApache Phoenix + Apache HBase
Apache Phoenix + Apache HBase
 
Hadoop 3 in a Nutshell
Hadoop 3 in a NutshellHadoop 3 in a Nutshell
Hadoop 3 in a Nutshell
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
 
Apache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandApache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to Understand
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
Hadoop & cloud storage object store integration in production (final)
Hadoop & cloud storage  object store integration in production (final)Hadoop & cloud storage  object store integration in production (final)
Hadoop & cloud storage object store integration in production (final)
 
Hadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in ProductionHadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in Production
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community Update
 
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive WarehouseDisaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
 
Hadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in ProductionHadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in Production
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
 
Running Services on YARN
Running Services on YARNRunning Services on YARN
Running Services on YARN
 
Cloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsCloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerations
 
FOD Paris Meetup - Global Data Management with DataPlane Services (DPS)
FOD Paris Meetup -  Global Data Management with DataPlane Services (DPS)FOD Paris Meetup -  Global Data Management with DataPlane Services (DPS)
FOD Paris Meetup - Global Data Management with DataPlane Services (DPS)
 
Big data spain keynote nov 2016
Big data spain keynote nov 2016Big data spain keynote nov 2016
Big data spain keynote nov 2016
 
Hadoop and HBase in the Real World
Hadoop and HBase in the Real WorldHadoop and HBase in the Real World
Hadoop and HBase in the Real World
 
Deep learning on yarn running distributed tensorflow etc on hadoop cluster v3
Deep learning on yarn  running distributed tensorflow etc on hadoop cluster v3Deep learning on yarn  running distributed tensorflow etc on hadoop cluster v3
Deep learning on yarn running distributed tensorflow etc on hadoop cluster v3
 

More from DataWorks Summit

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
Antonios Katsarakis
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
christinelarrosa
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
Ivo Velitchkov
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
DanBrown980551
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
AstuteBusiness
 
"What does it really mean for your system to be available, or how to define w...
"What does it really mean for your system to be available, or how to define w..."What does it really mean for your system to be available, or how to define w...
"What does it really mean for your system to be available, or how to define w...
Fwdays
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
Pablo Gómez Abajo
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
Fwdays
 
Containers & AI - Beauty and the Beast!?!
Containers & AI - Beauty and the Beast!?!Containers & AI - Beauty and the Beast!?!
Containers & AI - Beauty and the Beast!?!
Tobias Schneck
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
christinelarrosa
 
Christine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptxChristine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptx
christinelarrosa
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
Fwdays
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
DianaGray10
 
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdfLee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
leebarnesutopia
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
Safe Software
 
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
"Scaling RAG Applications to serve millions of users",  Kevin Goedecke"Scaling RAG Applications to serve millions of users",  Kevin Goedecke
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
Fwdays
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
Javier Junquera
 
Session 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdfSession 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdf
UiPathCommunity
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
Neo4j
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Neo4j
 

Recently uploaded (20)

Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
 
"What does it really mean for your system to be available, or how to define w...
"What does it really mean for your system to be available, or how to define w..."What does it really mean for your system to be available, or how to define w...
"What does it really mean for your system to be available, or how to define w...
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
 
Containers & AI - Beauty and the Beast!?!
Containers & AI - Beauty and the Beast!?!Containers & AI - Beauty and the Beast!?!
Containers & AI - Beauty and the Beast!?!
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
 
Christine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptxChristine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptx
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
 
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdfLee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
 
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
"Scaling RAG Applications to serve millions of users",  Kevin Goedecke"Scaling RAG Applications to serve millions of users",  Kevin Goedecke
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
Session 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdfSession 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdf
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 

HBase Read High Availabilty using Timeline Consistent Region Replicas

  • 1. Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved HBase Read High Availability Using Timeline-Consistent Region Replicas Enis Soztutar (enis@hortonworks.com) Devaraj Das (ddas@hortonworks.com) June 2014
  • 2. Page2 © Hortonworks Inc. 2011 – 2014. All Rights Reserved About Us Enis Soztutar Committer and PMC member in Apache HBase and Hadoop since 2007 HBase team @Hortonworks Twitter @enissoz Devaraj Das Committer and PMC member in Hadoop since 2006 Committer at HBase Co-founder @Hortonworks Twitter @ddraj
  • 3. Page3 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Outline of the talk PART I: Use case and semantics  CAP recap  Use case and motivation  Region replicas  Timeline consistency  Semantics PART II : Implementation and next steps  Server side  Client side  Data replication  Next steps & Summary
  • 4. Page4 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Part I Use case and semantics
  • 5. Page5 © Hortonworks Inc. 2011 – 2014. All Rights Reserved CAP reCAP Partition tolerance Consistency Availability Pick Two HBase is CP
  • 6. Page6 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Availability CAP reCAP • In a distributed system you cannot NOT have P • C vs A is about what happens if there is a network partition! • A an C are NEVER binary values, always a range • Different operations in the system can have different A / C choices • HBase cannot be simplified as CP Partition tolerance Consistency Pick Two HBase is CP
  • 7. Page7 © Hortonworks Inc. 2011 – 2014. All Rights Reserved HBase consistency model For a single row, HBase is strongly consistent within a data center Across rows HBase is not strongly consistent (but available!). When a RS goes down, only the regions on that server become unavailable. Other regions are unaffected. HBase multi-DC replication is “eventual consistent” HBase applications should carefully design the schema for correct semantics / performance tradeoff
  • 8. Page8 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Use cases and motivation More and more applications are looking for a “0 down time” platform  30 seconds downtime (aggressive MTTR time) is too much Certain classes of apps are willing to tolerate decreased consistency guarantees in favor of availability  Especially for READs Some build wrappers around the native API to be able to handle failures of destination servers  Multi-DC: when one server is down in one DC, the client switches to a different one Can we do something in HBase natively?  Within the same cluster?
  • 9. Page9 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Use cases and motivation Designing the application requires careful tradeoff consideration  In schema design since single-row is strong consistent, but no multi-row trx  Multi-datacenter replication (active-passive, active-active, backups etc) It is good to be able to give the application flexibility to pick-and-choose  Higher availability vs stronger consistency Read vs Write  Different consistency models for read vs write  Read-repair, latest ts-wins vs linearizable updates
  • 10. Page10 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Initial goals Support applications talking to a single cluster really well  No perceived downtime  Only for READs If apps wants to tolerate cluster failures  Use HBase replication  Combine that with wrappers in the application
  • 11. Page11 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Introducing…. Region Replicas in HBase Timeline Consistency in HBase
  • 12. Page12 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Region replicas For every region of the table, there can be more than one replica  Every region replica has an associated “replica_id”, starting from 0  Each region replica is hosted by a different region server Tables can be configured with a REGION_REPLICATION parameter  Default is 1  No change in the current behavior One replica per region is the “default” or “primary”  Only this can accepts WRITEs  All reads from this region replica return the most recent data Other replicas, also called “secondaries” follow the primary  They see only committed updates
  • 13. Page13 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Region replicas Secondary region replicas are read-only  No writes are routed to secondary replicas  Data is replicated to secondary regions (more on this later)  Serve data from the same data files are primary  May not have received the recent data  Reads and Scans can be performed, returning possibly stale data Region replica placement is done to maximize availability of any particular region  Region replicas are not co-located on same region servers  And same racks (if possible)
  • 14. Page14 © Hortonworks Inc. 2011 – 2014. All Rights Reserved rowkey column:value column:value … RegionServer Region memstore DataNode b2 b9 b1 DataNode b2 b1 DataNode b1 Client Read and write RegionServer
  • 15. Page15 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Page 15 rowkey column:value column:value … RegionServer Region DataNode b2 b9 b1 DataNode b2 b1 DataNode b1 Client Read and write memstore RegionServer rowkey column:value column:value … memstore Region replica Read only
  • 16. Page16 © Hortonworks Inc. 2011 – 2014. All Rights Reserved TIMELINE Consistency Introduced a Consistency enum  STRONG  TIMELINE Consistency.STRONG is default Consistency can be set per read operation (per-get or per-scan) Timeline-consistent read RPCs sent to more than one replica Semantics is a bit different than Eventual Consistency model
  • 17. Page17 © Hortonworks Inc. 2011 – 2014. All Rights Reserved TIMELINE Consistency public enum Consistency { STRONG, TIMELINE } Get get = new Get(row); get.setConsistency(Consistency.TIMELINE); ... Result result = table.get(get); … if (result.isStale()) { ... }
  • 18. Page18 © Hortonworks Inc. 2011 – 2014. All Rights Reserved TIMELINE Consistency Semantics Can be though of as in-cluster active-passive replication Single homed and ordered updates  All writes are handled and ordered by the primary region  All writes are STRONG consistency Secondaries apply the mutations in order Only get/scan requests to secondaries Get/Scan Result can be inspected to see whether the result was from possibly stale data
  • 19. Page19 © Hortonworks Inc. 2011 – 2014. All Rights Reserved TIMELINE Consistency Example Client1 X=1 Client2 WAL Data: Replica_id=0 (primary) Replica_id=1 Replica_id=2 replication replication X=3 WAL Data: WAL Data: X=1X=1Write
  • 20. Page20 © Hortonworks Inc. 2011 – 2014. All Rights Reserved TIMELINE Consistency Example Client1 X=1 Client2 WAL Data: Replica_id=0 (primary) Replica_id=1 Replica_id=2 replication replication X=3 WAL Data: WAL Data: X=1 X=1 X=1 X=1 X=1 X=1Read X=1Read X=1Read
  • 21. Page21 © Hortonworks Inc. 2011 – 2014. All Rights Reserved TIMELINE Consistency Example Client1 X=1 Client2 WAL Data: Replica_id=0 (primary) Replica_id=1 Replica_id=2 replication replication WAL Data: WAL Data: Write X=1 X=1 X=2 X=2 X=2
  • 22. Page22 © Hortonworks Inc. 2011 – 2014. All Rights Reserved TIMELINE Consistency Example Client1 X=1 Client2 WAL Data: Replica_id=0 (primary) Replica_id=1 Replica_id=2 replication replication WAL Data: WAL Data: X=2 X=1 X=2 X=2 X=2 X=2Read X=2Read X=1Read
  • 23. Page23 © Hortonworks Inc. 2011 – 2014. All Rights Reserved TIMELINE Consistency Example Client1 X=1 Client2 WAL Data: Replica_id=0 (primary) Replica_id=1 Replica_id=2 replication replication WAL Data: WAL Data: X=2 X=1 X=3 X=2 Write X=3 X=3
  • 24. Page24 © Hortonworks Inc. 2011 – 2014. All Rights Reserved TIMELINE Consistency Example Client1 X=1 Client2 WAL Data: Replica_id=0 (primary) Replica_id=1 Replica_id=2 replication replication WAL Data: WAL Data: X=2 X=1 X=3 X=2 X=3 X=3Read X=2Read X=1Read
  • 25. Page25 © Hortonworks Inc. 2011 – 2014. All Rights Reserved PART II Implementation and next steps
  • 26. Page26 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Region replicas – recap Every region replica has an associated “replica_id”, starting from 0 Each region replica is hosted by a different region server  All replicas can serve READs One replica per region is the “default” or “primary”  Only this can accepts WRITEs  All reads from this region replica return the most recent data
  • 27. Page27 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Updates in the Master Replica creation  Created during table creation No distinction between primary & secondary replicas Meta table contain all information in one row Load balancer improvements  LB made aware of replicas  Does best effort to place replicas in machines/racks to maximize availability Alter table support  For adjusting number of replicas
  • 28. Page28 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Data Replication Data should be replicated from primary regions to secondary regions Data files MUST be shared. We do not want to store multiple copies Do not cause more writes than necessary Two solutions:  Region snapshots : Share only data files  Async WAL Replication : Share data files, every region replica has its own in-memory data
  • 29. Page29 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Data Replication – Region Snapshots Primary region works as usual  Buffer up mutations in memstore  Flush to disk when full  Compact files when needed  Deleted files are kept in archive directory for some time Secondary regions  Are opened in read-only mode  Periodically look for new files in primary region – When a new flushed file is seen, just open it and start serving data from there – When a compaction is seen, open new file, close the files that are gone  Good for read-only, bulk load data or less frequently updated data RegionServerClient Send mutation Memstore Memstore Memstore Memstore Storefile Flush
  • 30. Page30 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Updates in the RegionServer Treats secondary replicas as read-only Storefile management  Keeps itself up-to-date with the changes to do with store file creation/deletions
  • 31. Page31 © Hortonworks Inc. 2011 – 2014. All Rights Reserved IPC layer high level flow Client YES Response within timeout (10 millis)? NO Send READ to all secondaries Send READ to primary Wait for response Wait for response Take the first successful response; cancel others Similar flow for GET/Batch- GET/Scan, except that Scan is sticky to the server it sees success from.
  • 32. Page32 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Performance and Testing No significant performance issues discovered  Added interrupt handling in the RPCs to cancel unneeded replica RPCs Deeper level of performance testing work is still in progress Tested with ChaosMonkey tests  fails if response is not received within a certain time
  • 33. Page33 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Summary Pros  High-availability for read-only tables  High-availability for stale reads  Very low-latency for the above  [No extra copies for the Storefiles] Cons  Increased blockcache usage  Extra network traffic for the replica calls  Increased number of regions to manage in the cluster
  • 34. Page34 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Next steps What has been described so far is in “Phase-1” of the project Phase-2  WAL replication – Will reduce the updates-seen lag drastically  Handling of Merges and Splits  Latency guarantees – Cancellation of RPCs server side – Promotion of one Secondary to Primary, and recruiting a new Secondary Use the infrastructure to implement consensus protocols for read/write within a single datacenter RegionServerClient Send mutation WAL [Critical for recovery] MemstoreMemstoreMemstoreMemstore Edit for region1 Edit for region1 Edit for region2 Edit for region3 Edit for region10 ……
  • 35. Page35 © Hortonworks Inc. 2011 – 2014. All Rights Reserved References Apache branch hbase-10070 (https://github.com/apache/hbase/tree/hbase- 10070) Merge to mainline trunk is under discussion in the community HDP-2.1 comes with experimental support for Phase-1
  • 36. Page36 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Thanks Q & A

Editor's Notes

  1. - What is hbase? - What is it good at? - How do you use it in my applications? Context, first principals
  2. Understand the world it lives in and it’s building blocks