HBase: Where Online Meets Low Latency

HBaseCon
HBase Low Latency
Nick Dimiduk, Hortonworks (@xefyr)
Nicolas Liochon, Scaled Risk (@nkeywal)
HBaseCon May 5, 2014
Agenda
• Latency, what is it, how to measure it
• Write path
• Read path
• Next steps
What’s low latency
Latency is about percentiles
• Long tail issue
• There are often orders of magnitude between « average » and « 95th percentile »
• Post 99% = « magical 1% ». Work in progress here.
• Meaning ranges from microseconds (High Frequency Trading) to seconds (interactive queries)
• In this talk: milliseconds
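To make the gap between the average and the percentiles concrete, here is a minimal sketch that computes nearest-rank percentiles over a set of latency samples. The class name, the method names and the sample data are hypothetical; they are not taken from HBase or from the talk.

```java
import java.util.Arrays;

final class LatencyPercentiles {
    /** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    static double mean(double[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Hypothetical sample: 98 calls at 1 ms plus 2 stalled calls at 250 ms.
        double[] ms = new double[100];
        Arrays.fill(ms, 1.0);
        ms[98] = 250.0;
        ms[99] = 250.0;
        System.out.printf("mean=%.2f p50=%.1f p95=%.1f p99=%.1f%n",
                mean(ms), percentile(ms, 50), percentile(ms, 95), percentile(ms, 99));
    }
}
```

With this sample the two stalled calls pull the mean well above the median, while the 95th percentile does not see them at all. Only the 99th percentile exposes the tail.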
Measure latency – during test
bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation
• More options related to HBase: autoflush, replicas, …
• Latency measured in microseconds
• Easier for internal analysis
• YCSB
• Useful for comparison between tools
• Set of workloads already defined
Measure latency : Exposed by HBase
"QueueCallTime_num_ops" : 33044,
"QueueCallTime_min" : 0,
"QueueCallTime_max" : 86,
"QueueCallTime_mean" : 0.2525420651252875,
"QueueCallTime_median" : 0.0,
"QueueCallTime_75th_percentile" : 0.0,
"QueueCallTime_95th_percentile" : 1.0,
"QueueCallTime_99th_percentile" : 1.0,
"SyncTime_num_ops" : 379081,
"SyncTime_min" : 0,
"SyncTime_max" : 865,
"SyncTime_mean" : 3.0293341000999785,
"SyncTime_median" : 2.0,
"SyncTime_75th_percentile" : 3.0,
"SyncTime_95th_percentile" : 4.0,
"SyncTime_99th_percentile" : 253.5899999999999,
HBase write path – high level
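[Diagram: a RegionServer (HBase) running on top of a DataNode (Hadoop DFS). The RegionServer holds the HLog (WAL), the BlockCache and several HRegions; each HRegion holds HStores; each HStore holds a MemStore and StoreFiles (HFiles). Numbered steps 1 to 5 mark the write path.]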
Deeper in the write path
• Two parts
• Single put (WAL)
• The client just sends the put
• Multiple puts from the client (new behavior since 0.96)
• The client is much smarter
• Four stages to look at for latency
• Start (establish tcp connections, etc.)
• Steady: when expected conditions are met
• Machine failure: expected as well
• Overloaded system: you may need to add machines or tune your workload
Single put: communication
• Create a « Call » object, with an id, as queries are multiplexed
• protobuf it
• tcp write (in trunk it can be queued for a separate thread as well)
• Wait for the answer
• Separate thread, separate queue
• unprotobuf the answer
• Implies locks and multiple threads communicating with queues
Single put: server side scheduling
• Threads to receive « Call » objects
• Threads to handle the call execution
• Threads to write the answer on the wire
• Multiple threads, communicating with queues
Single put: real work
• The server must
• Take a row lock (HBase strong consistency)
• Write into the WAL queue
• Write into the memstore
• Sync the queue (HDFS flush)
• Free the lock
• The WAL queue is shared between all the regions/handlers
• Sync is avoided if another handler did the work
• You may flush more than expected
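The shared queue and the avoided sync can be pictured with a small model. This is a simplified sketch, not HBase's WAL implementation: the class `GroupCommitWal`, its transaction ids and its flush counter are all assumptions made for illustration.

```java
final class GroupCommitWal {
    private long appendedTxid = 0; // highest edit appended to the shared queue
    private long syncedTxid = 0;   // highest edit known to be durable
    private int flushCount = 0;    // number of expensive flushes actually performed

    /** Append an edit to the shared queue and return its transaction id. */
    synchronized long append(String edit) {
        return ++appendedTxid;
    }

    /** Make the edit durable, skipping the flush if another handler already covered it. */
    synchronized void sync(long txid) {
        if (txid <= syncedTxid) {
            return; // another handler's flush already included this edit
        }
        // A flush covers every edit appended so far, not only the caller's.
        flushCount++;
        syncedTxid = appendedTxid;
    }

    synchronized int flushCount() {
        return flushCount;
    }

    public static void main(String[] args) {
        GroupCommitWal wal = new GroupCommitWal();
        long a = wal.append("put row-a");
        long b = wal.append("put row-b");
        long c = wal.append("put row-c");
        wal.sync(c); // one flush covers a, b and c
        wal.sync(a); // already durable: no flush
        wal.sync(b); // already durable: no flush
        System.out.println("flushes = " + wal.flushCount());
    }
}
```

The model also shows why a handler may flush more than expected: its flush carries the edits of every other handler that appended before it.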
Latency sources
• Candidate one: network
• 0.5ms within a datacenter.
• Candidate two: HDFS Flush
• Millisecond world: everything can go wrong
• Network
• OS Scheduler
• All this goes into the post-99th percentile
Metric   Time in ms
Mean     0.33
50%      0.26
95%      0.59
99%      1.24
Latency sources
• Split (and presplits)
• Autosharding is great!
• Puts have to wait
• Impacts: seconds
• Balance
• Regions move
• Triggers a retry for the client
• hbase.client.pause = 100ms since HBase 0.96
• Garbage Collection
• Impacts: tens of ms, even with a good config
• Covered in the read path part of this talk
From steady to loaded and overloaded
• Number of concurrent tasks is a function of
• Number of cores
• Number of disks
• Number of remote machines used
• Difficult to estimate
• Queues are doomed to happen
• So for low latency
• Specific scheduler since HBase 0.98 (HBASE-8884). Requires specific code.
• Priorities: work in progress.
Loaded & overloaded
• Step 1: Loaded system
• Tasks are queued: creates latency
• Specific metric in HBase
• Step 2: Limit reached
• MemStore takes too much room: blocks until it’s flushed
• hbase.regionserver.global.memstore.size.lower.limit
• hbase.regionserver.global.memstore.size
• hbase.hregion.memstore.block.multiplier
• Too many HFiles: blocks until compactions catch up
• hbase.hstore.blockingStoreFiles
• Too many WAL files
• Don’t change this
Machine failure
• Failure
• Detect
• Reallocate
• Replay WAL
• Replaying WAL is NOT required for puts
• Failure = Detect + Reallocate + Retry
• That’s in the range of ~1s for simple failures
• Silent failures put you in the 10s range if the hardware does not help
Single puts
• Millisecond range
• Spikes do happen in steady mode
• 100ms
• Causes: GC, load, splits
Streaming puts
HTable#setAutoFlushTo(false)
HTable#put
HTable#flushCommits
Streaming puts
• Write into a buffer
• When the buffer is full, in the background
• Select the puts that match load conditions
• Send them
• Manage retries and delay
• The buffer is freed for other client operations
• Blocks only if there is a non-retryable error or if the buffer is full
Multiple puts
• hbase.client.max.total.tasks (default 100)
• hbase.client.max.perserver.tasks (default 5)
• hbase.client.max.perregion.tasks (default 1)
• Decouple the client from a latency peak of a region server
• Increase the throughput by 50%
• Does not solve the problem of an unbalanced cluster
• But makes split and GC more transparent
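One way to picture how the three limits interact is an admission check that runs before each batch of buffered puts is sent. The sketch below is an illustration only: `TaskLimiter` and its methods are hypothetical names rather than HBase client code, and it is not thread-safe. The three constructor arguments mirror the defaults listed above.

```java
import java.util.HashMap;
import java.util.Map;

final class TaskLimiter {
    private final int maxTotal;
    private final int maxPerServer;
    private final int maxPerRegion;
    private int total = 0;
    private final Map<String, Integer> perServer = new HashMap<>();
    private final Map<String, Integer> perRegion = new HashMap<>();

    TaskLimiter(int maxTotal, int maxPerServer, int maxPerRegion) {
        this.maxTotal = maxTotal;
        this.maxPerServer = maxPerServer;
        this.maxPerRegion = maxPerRegion;
    }

    /** Try to start a task; false means the puts stay in the client buffer for now. */
    boolean tryStart(String server, String region) {
        if (total >= maxTotal
                || perServer.getOrDefault(server, 0) >= maxPerServer
                || perRegion.getOrDefault(region, 0) >= maxPerRegion) {
            return false;
        }
        total++;
        perServer.merge(server, 1, Integer::sum);
        perRegion.merge(region, 1, Integer::sum);
        return true;
    }

    /** Record that a task finished, freeing its slots. */
    void finish(String server, String region) {
        total--;
        perServer.merge(server, -1, Integer::sum);
        perRegion.merge(region, -1, Integer::sum);
    }

    public static void main(String[] args) {
        TaskLimiter limiter = new TaskLimiter(100, 5, 1);
        System.out.println(limiter.tryStart("rs1", "regionA")); // region is free
        System.out.println(limiter.tryStart("rs1", "regionA")); // region already has a task
        System.out.println(limiter.tryStart("rs1", "regionB")); // other region, same server
    }
}
```

A slow region only holds back its own puts. Puts for other regions and servers keep flowing, which is how a latency peak on one region server is decoupled from the client.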
Conclusion on write path
• Single puts can be very fast
• It’s not a « hard real time » system: there are peaks
• Latency peaks can be hidden when streaming puts
• Including autosplits
And now for the read path
HBase read path – high level
[Diagram: a RegionServer (HBase) running on top of a DataNode (Hadoop DFS). The RegionServer holds the HLog (WAL), the BlockCache and several HRegions; each HRegion holds HStores; each HStore holds a MemStore and StoreFiles (HFiles). Numbered steps 1 to 5 mark the read path.]
Deeper in the read path
• Gets and short scans are assumed for low-latency operations
• Again, two APIs
• Single get: HTable#get(Get)
• Multi-get: HTable#get(List<Get>)
• Four stages, same as write path
• Start (tcp connection, …)
• Steady: when expected conditions are met
• Machine failure: expected as well
• Overloaded system: you may need to add machines or tune your workload
Multi get / Client
• Group Gets by RegionServer
• Execute them one by one
Multi get / Server
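The client-side grouping step can be sketched as a lookup in a sorted map of region start keys. This is a minimal illustration, assuming row keys are plain strings and region locations are already known; `MultiGetGrouping` and its map layout are hypothetical, not the HBase client's data structures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MultiGetGrouping {
    /** Group row keys by the server hosting the region that contains each key. */
    static Map<String, List<String>> groupByServer(TreeMap<String, String> regionStartToServer,
                                                   List<String> rowKeys) {
        Map<String, List<String>> byServer = new LinkedHashMap<>();
        for (String row : rowKeys) {
            // The region containing a row is the one with the greatest start key <= row.
            String server = regionStartToServer.floorEntry(row).getValue();
            byServer.computeIfAbsent(server, s -> new ArrayList<>()).add(row);
        }
        return byServer;
    }

    public static void main(String[] args) {
        TreeMap<String, String> regions = new TreeMap<>();
        regions.put("", "rs1");  // the first region starts at the empty key
        regions.put("g", "rs2");
        regions.put("p", "rs1");
        System.out.println(groupByServer(regions, Arrays.asList("apple", "kiwi", "zebra", "melon")));
    }
}
```

Each resulting list becomes one request to one server, so a multi-get costs one round trip per server involved rather than one per row.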
Access latency magnitudes
Storage hierarchy: a different view (Dean/2009)
• Memory is 100000x faster than disk!
• Disk seek = 10ms
Known unknowns
• For each candidate HFile
• Exclude by file metadata
• Timestamp
• Rowkey range
• Exclude by bloom filter
• StoreFileManager (0.96, HBASE-7678)
StoreFileScanner#shouldUseScanner()
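Exclusion by file metadata amounts to two range checks per candidate file. The sketch below assumes each file exposes its first and last row key and its minimum and maximum timestamp; the `FileMeta` type is hypothetical, and the bloom filter check is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class StoreFileSelection {
    /** Hypothetical summary of the metadata carried by one file. */
    static final class FileMeta {
        final String name;
        final String firstRow;
        final String lastRow;
        final long minTs;
        final long maxTs;

        FileMeta(String name, String firstRow, String lastRow, long minTs, long maxTs) {
            this.name = name;
            this.firstRow = firstRow;
            this.lastRow = lastRow;
            this.minTs = minTs;
            this.maxTs = maxTs;
        }

        /** True if the file cannot be ruled out for this row and time range. */
        boolean mayContain(String row, long fromTs, long toTs) {
            boolean rowInRange = firstRow.compareTo(row) <= 0 && row.compareTo(lastRow) <= 0;
            boolean timeOverlaps = minTs <= toTs && fromTs <= maxTs;
            return rowInRange && timeOverlaps;
        }
    }

    /** Names of the files that still have to be read for this request. */
    static List<String> candidates(List<FileMeta> files, String row, long fromTs, long toTs) {
        List<String> kept = new ArrayList<>();
        for (FileMeta file : files) {
            if (file.mayContain(row, fromTs, toTs)) {
                kept.add(file.name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<FileMeta> files = Arrays.asList(
                new FileMeta("f1", "a", "m", 100, 200),
                new FileMeta("f2", "n", "z", 100, 200),
                new FileMeta("f3", "a", "z", 300, 400));
        System.out.println(candidates(files, "k", 150, 250));
    }
}
```

Every file ruled out here is a disk seek that never happens, which is why these checks come before any data block is touched.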
Unknown knowns
• Merge sort results polled from Stores
• Seek each scanner to a reference KeyValue
• Retrieve candidate data from disk
• Multiple HFiles => multiple seeks
• hbase.storescanner.parallel.seek.enable=true
• Short Circuit Reads
• dfs.client.read.shortcircuit=true
• Block locality
• Happy clusters compact!
HFileBlock#readBlockData()
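Merging sorted results from several stores is a classic k-way merge over a heap. The sketch below works on plain sorted string lists to show the idea; `ScannerMerge` is a hypothetical name and the code does not model HBase's scanner classes or KeyValue ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class ScannerMerge {
    /** Merge several individually sorted sources into one sorted list. */
    static List<String> merge(List<List<String>> sortedSources) {
        // Heap entries are {sourceIndex, positionInSource}, ordered by the key they point at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                Comparator.comparing((int[] e) -> sortedSources.get(e[0]).get(e[1])));
        for (int i = 0; i < sortedSources.size(); i++) {
            if (!sortedSources.get(i).isEmpty()) {
                heap.add(new int[] {i, 0});
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            List<String> source = sortedSources.get(head[0]);
            merged.add(source.get(head[1]));
            if (head[1] + 1 < source.size()) {
                heap.add(new int[] {head[0], head[1] + 1}); // advance that source
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(merge(Arrays.asList(
                Arrays.asList("a", "d"),
                Arrays.asList("b", "c", "e"),
                new ArrayList<String>())));
    }
}
```

Every source that takes part in the merge has to be positioned first, which is where the one-seek-per-HFile cost comes from.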
Remembered knowns: BlockCache
• Reuse previously read data
• Smaller BLOCKSIZE => better utilization
• TODO: compression (HBASE-8894)
BlockCache#getBlock()
BlockCache Showdown
• LruBlockCache
• Quite good most of the time
• < 30 GB
• BucketCache
• Offheap alternative
• > 30 GB
http://www.n10k.com/blog/blockcache-showdown/
Latency enemies: Compactions
• Fewer HFiles => fewer seeks
• Evict data blocks!
• Evict Index blocks!!
• hfile.block.index.cacheonwrite
• Evict bloom blocks!!!
• hfile.block.bloom.cacheonwrite
• OS buffer cache to the rescue
• Compacted data is still fresh
• Better than going all the way back to disk
Latency enemies: Garbage Collection
• Use Heap. Not too much. With CMS.
• Max heap: 30GB, probably less
• Healthy cluster load
• regular, reliable collections
• 25-100ms pauses at regular intervals
• An overloaded RegionServer suffers far more from GC
Off-heap to the rescue?
• BucketCache (0.96, HBASE-7404)
• Network interfaces (HBASE-9535)
• MemStore et al (HBASE-10191)
Failure
• Machine failure
• Detect + Reallocate + Replay
• Strong consistency requires replay
• Cache starts from scratch
Read latency in summary
• Steady mode
• Cache hit: < 1 ms
• Cache miss: + 10 ms per seek
• Writing while reading: cache churn
• GC: 25-100ms pauses at regular intervals
Network request + (1 - P(cache hit)) * 10 ms
• Same long tail issues as write
• Overloaded: same scheduling issues as write
• Partial failures hurt a lot
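The steady-mode formula, network request + (1 - P(cache hit)) * 10 ms, is easy to evaluate for a few cache hit ratios. The sketch below uses the deck's own 0.5 ms datacenter network figure and 10 ms seek cost; the class and method names are illustrative, and the model ignores GC pauses and queueing.

```java
final class ReadLatencyModel {
    /** Expected read latency in ms: network cost plus the seek cost weighted by the miss ratio. */
    static double expectedMs(double networkMs, double cacheHitRatio, double seekMs) {
        return networkMs + (1.0 - cacheHitRatio) * seekMs;
    }

    public static void main(String[] args) {
        double networkMs = 0.5; // network round trip within a datacenter
        double seekMs = 10.0;   // one disk seek on a cache miss
        for (double hitRatio : new double[] {0.50, 0.90, 0.99}) {
            System.out.printf("hit ratio %.2f -> %.2f ms expected%n",
                    hitRatio, expectedMs(networkMs, hitRatio, seekMs));
        }
    }
}
```

This is an average, so it says nothing about the tail: a single miss still costs the full seek, whatever the hit ratio.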
Hedging our bets
• HDFS Hedged reads (since HDFS 2.4)
• Strongly consistent
• Works at the HDFS level
• Timeline consistency (HBASE-10070)
• Reads on secondary regions
• If a region does not answer quickly enough, go to another one
• Not strongly consistent
• Helps read-path latency a lot.
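Both techniques share one pattern: ask the primary, and if it stays silent past a threshold, ask a second replica and keep whichever answer arrives first. The sketch below shows that generic pattern with `CompletableFuture`. `HedgedRead`, its `replica` helper and the delays are assumptions for illustration; this is neither the HDFS nor the HBase implementation, and it does not model the staleness a secondary region may return.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

final class HedgedRead {
    /** Ask the primary; if it is silent for hedgeAfterMs, also ask the secondary and take the first answer. */
    static <T> T read(Supplier<T> primary, Supplier<T> secondary, long hedgeAfterMs, ExecutorService pool)
            throws InterruptedException, ExecutionException {
        CompletableFuture<T> first = CompletableFuture.supplyAsync(primary, pool);
        try {
            return first.get(hedgeAfterMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException tooSlow) {
            CompletableFuture<T> second = CompletableFuture.supplyAsync(secondary, pool);
            return first.applyToEither(second, answer -> answer).get();
        }
    }

    /** Demo helper: a fake replica that answers after the given delay. */
    static Supplier<String> replica(String answer, long delayMs) {
        return () -> {
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return answer;
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // The primary stalls for 2 s, so the hedge fires after 20 ms.
            System.out.println(read(replica("primary", 2000), replica("secondary", 0), 20, pool));
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The threshold is the tuning knob: too low and every read is sent twice, too high and the hedge no longer trims the tail.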
HBase ranges for 99% latency
          Put                    Streamed Multiput   Get                    Timeline get
Steady    milliseconds           milliseconds        milliseconds           milliseconds
Failure   seconds                seconds             seconds                milliseconds
GC        10’s of milliseconds   milliseconds        10’s of milliseconds   milliseconds
What’s next
• Less GC
• Use fewer objects
• Offheap
• Preferred location (HBASE-4755)
• The « magical 1% »
• Most tools stop at the 99th percentile latency
• YCSB for example
• What happens after is much more complex
• But key to improving the average
Thanks!
Nick Dimiduk, Hortonworks (@xefyr)
Nicolas Liochon, Scaled Risk (@nkeywal)
HBaseCon May 5, 2014
HBaseCon1.1K views
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBase by HBaseCon
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBaseHBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon729 views

Recently uploaded

Electronic AWB - Electronic Air Waybill by
Electronic AWB - Electronic Air Waybill Electronic AWB - Electronic Air Waybill
Electronic AWB - Electronic Air Waybill Freightoscope
6 views1 slide
Page Object Model by
Page Object ModelPage Object Model
Page Object Modelartembondar5
7 views5 slides
Techstack Ltd at Slush 2023, Ukrainian delegation by
Techstack Ltd at Slush 2023, Ukrainian delegationTechstack Ltd at Slush 2023, Ukrainian delegation
Techstack Ltd at Slush 2023, Ukrainian delegationViktoriiaOpanasenko
7 views4 slides
Top-5-production-devconMunich-2023-v2.pptx by
Top-5-production-devconMunich-2023-v2.pptxTop-5-production-devconMunich-2023-v2.pptx
Top-5-production-devconMunich-2023-v2.pptxTier1 app
9 views42 slides
Mobile App Development Company by
Mobile App Development CompanyMobile App Development Company
Mobile App Development CompanyRichestsoft
5 views6 slides
Introduction to Git Source Control by
Introduction to Git Source ControlIntroduction to Git Source Control
Introduction to Git Source ControlJohn Valentino
8 views18 slides

Recently uploaded(20)

Electronic AWB - Electronic Air Waybill by Freightoscope
Electronic AWB - Electronic Air Waybill Electronic AWB - Electronic Air Waybill
Electronic AWB - Electronic Air Waybill
Freightoscope 6 views
Top-5-production-devconMunich-2023-v2.pptx by Tier1 app
Top-5-production-devconMunich-2023-v2.pptxTop-5-production-devconMunich-2023-v2.pptx
Top-5-production-devconMunich-2023-v2.pptx
Tier1 app9 views
Mobile App Development Company by Richestsoft
Mobile App Development CompanyMobile App Development Company
Mobile App Development Company
Richestsoft 5 views
Introduction to Git Source Control by John Valentino
Introduction to Git Source ControlIntroduction to Git Source Control
Introduction to Git Source Control
John Valentino8 views
ADDO_2022_CICID_Tom_Halpin.pdf by TomHalpin9
ADDO_2022_CICID_Tom_Halpin.pdfADDO_2022_CICID_Tom_Halpin.pdf
ADDO_2022_CICID_Tom_Halpin.pdf
TomHalpin96 views
Transport Management System - Shipment & Container Tracking by Freightoscope
Transport Management System - Shipment & Container TrackingTransport Management System - Shipment & Container Tracking
Transport Management System - Shipment & Container Tracking
Freightoscope 6 views
Dapr Unleashed: Accelerating Microservice Development by Miroslav Janeski
Dapr Unleashed: Accelerating Microservice DevelopmentDapr Unleashed: Accelerating Microservice Development
Dapr Unleashed: Accelerating Microservice Development
Miroslav Janeski16 views
Automated Testing of Microsoft Power BI Reports by RTTS
Automated Testing of Microsoft Power BI ReportsAutomated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI Reports
RTTS11 views
Ports-and-Adapters Architecture for Embedded HMI by Burkhard Stubert
Ports-and-Adapters Architecture for Embedded HMIPorts-and-Adapters Architecture for Embedded HMI
Ports-and-Adapters Architecture for Embedded HMI
Burkhard Stubert35 views
Supercharging your Python Development Environment with VS Code and Dev Contai... by Dawn Wages
Supercharging your Python Development Environment with VS Code and Dev Contai...Supercharging your Python Development Environment with VS Code and Dev Contai...
Supercharging your Python Development Environment with VS Code and Dev Contai...
Dawn Wages5 views
Streamlining Your Business Operations with Enterprise Application Integration... by Flexsin
Streamlining Your Business Operations with Enterprise Application Integration...Streamlining Your Business Operations with Enterprise Application Integration...
Streamlining Your Business Operations with Enterprise Application Integration...
Flexsin 5 views

HBase: Where Online Meets Low Latency

  • 1. HBase Low Latency Nick Dimiduk, Hortonworks (@xefyr) Nicolas Liochon, Scaled Risk (@nkeywal) HBaseCon May 5, 2014
  • 2. Agenda • Latency, what is it, how to measure it • Write path • Read path • Next steps
  • 3. What’s low latency Latency is about percentiles • Long tail issue • There are often orders of magnitude between « average » and « 95th percentile » • Post 99% = « magical 1% ». Work in progress here. • Meaning from microseconds (High Frequency Trading) to seconds (interactive queries) • In this talk: milliseconds
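The gap between the average and the high percentiles can be illustrated with a small sketch. The sample values below are made up for illustration and are not measurements from the talk; `percentile` is a hypothetical helper using the nearest-rank definition.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


# Illustrative latencies in ms: most calls are fast, a few are slow, one is very slow.
latencies_ms = [1] * 94 + [20] * 5 + [250]

mean_ms = statistics.mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
worst = max(latencies_ms)

print(f"mean={mean_ms} p50={p50} p95={p95} p99={p99} max={worst}")
```

In this made-up sample the median says nothing about the slow calls, and even the 99th percentile hides the single worst request, which is the « magical 1% » the slide refers to.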
  • 4. Measure latency – during test bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation • More options related to HBase: autoflush, replicas, … • Latency measured in microseconds • Easier for internal analysis • YCSB • Useful for comparison between tools • Set of workloads already defined
  • 5. Measure latency: Exposed by HBase "QueueCallTime_num_ops" : 33044, "QueueCallTime_min" : 0, "QueueCallTime_max" : 86, "QueueCallTime_mean" : 0.2525420651252875, "QueueCallTime_median" : 0.0, "QueueCallTime_75th_percentile" : 0.0, "QueueCallTime_95th_percentile" : 1.0, "QueueCallTime_99th_percentile" : 1.0, "SyncTime_num_ops" : 379081, "SyncTime_min" : 0, "SyncTime_max" : 865, "SyncTime_mean" : 3.0293341000999785, "SyncTime_median" : 2.0, "SyncTime_75th_percentile" : 3.0, "SyncTime_95th_percentile" : 4.0, "SyncTime_99th_percentile" : 253.5899999999999,
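The percentile fields shown on this slide can be compared programmatically to spot a long tail. The sketch below assumes the metrics have already been loaded into a dictionary, however they were fetched; `tail_ratio` is an illustrative helper, not an HBase API. The values are the ones printed on the slide.

```python
# Metric values copied from the slide; a real dump would contain many more keys.
metrics = {
    "QueueCallTime_mean": 0.2525420651252875,
    "QueueCallTime_99th_percentile": 1.0,
    "SyncTime_mean": 3.0293341000999785,
    "SyncTime_99th_percentile": 253.5899999999999,
}


def tail_ratio(metrics, prefix):
    """Return 99th percentile divided by mean, or None when the mean is zero."""
    mean = metrics[f"{prefix}_mean"]
    p99 = metrics[f"{prefix}_99th_percentile"]
    if mean == 0:
        return None
    return p99 / mean


for name in ("QueueCallTime", "SyncTime"):
    print(name, round(tail_ratio(metrics, name), 1))
```

With these numbers the SyncTime 99th percentile is far above its mean, while QueueCallTime stays within a small factor, which points at the WAL sync rather than the call queue as the source of the tail in this sample.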
  • 6. HBase write path – high level [Diagram labels: RegionServer (HBase); DataNode (Hadoop DFS); HLog (WAL); BlockCache; HRegion; HStore; MemStore; StoreFile; HFile; steps numbered 1 to 5]
  • 7. Deeper in the write path • Two parts • Single put (WAL) • The client just sends the put • Multiple puts from the client (new behavior since 0.96) • The client is much smarter • Four stages to look at for latency • Start (establish tcp connections, etc.) • Steady: when expected conditions are met • Machine failure: expected as well • Overloaded system: you may need to add machines or tune your workload
  • 8. Single put: communication • Create a « Call » object, with an id, as queries are multiplexed • protobuf it • tcp write (in trunk it can be queued for a separate thread as well) • Wait for the answer • Separate thread, separate queue • unprotobuf the answer • Implies locks and multiple threads communicating with queues
  • 9. Single put: server side scheduling • Threads to receive « Call » • Threads to handle the call execution • Threads to write the answer on the wire • Multiple threads, communicating with queues
  • 10. Single put: real work • The server must • Take a row lock (HBase strong consistency) • Write into the WAL queue • Write into the memstore • Sync the queue (HDFS flush) • Free the lock • The WAL queue is shared between all the regions/handlers • Sync is avoided if another handler did the work • You may flush more than expected
  • 11. Latency sources • Candidate one: network • 0.5ms within a datacenter. • Candidate two: HDFS Flush • Millisecond world: everything can go wrong • Network • OS Scheduler • All this goes into the post-99th percentile • Table (metric: time in ms): Mean 0.33; 50% 0.26; 95% 0.59; 99% 1.24
  • 12. Latency sources • Split (and presplits) • Autosharding is great! • Puts have to wait • Impacts: seconds • Balance • Regions move • Triggers a retry for the client • hbase.client.pause = 100ms since HBase 0.96 • Garbage Collection • Impacts: 10’s of ms, even with a good config • Covered with the read path of this talk
  • 13. From steady to loaded and overloaded • Number of concurrent tasks is a factor of • Number of cores • Number of disks • Number of remote machines used • Difficult to estimate • Queues are doomed to happen • So for low latency • Specific Scheduler since HBase 0.98 (HBASE-8884). Requires specific code. • Priorities: work in progress.
  • 14. Loaded & overloaded • Step 1: Loaded system • Tasks are queued: creates latency • Specific metric in HBase • Step 2: Limit reached • MemStore takes too much room: blocks until it’s flushed • hbase.regionserver.global.memstore.size.lower.limit • hbase.regionserver.global.memstore.size • hbase.hregion.memstore.block.multiplier • Too many HFiles: blocks until compactions keep up • hbase.hstore.blockingStoreFiles • Too many WAL files • Don’t change this
  • 15. Machine failure • Failure • Detect • Reallocate • Replay WAL • Replaying WAL is NOT required for puts • Failure = Detect + Reallocate + Retry • That’s in the range of ~1s for simple failures • Silent failures put you in the 10s range if the hardware does not help
  • 16. Single puts • Millisecond range • Spikes do happen in steady mode • 100ms • Causes: GC, load, splits
  • 18. Streaming puts • Write into a buffer • When the buffer is full, in the background • Select the puts that match load conditions • Send them • Manage retries and delay • The buffer is freed for other client operations • Blocks only if there is a non-retryable error or if the buffer is full
  • 19. Multiple puts • hbase.client.max.total.tasks (default 100) • hbase.client.max.perserver.tasks (default 5) • hbase.client.max.perregion.tasks (default 1) • Decouple the client from a latency peak of a region server • Increase the throughput by 50% • Does not solve the problem of an unbalanced cluster • But makes split and GC more transparent
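The three task limits above can be pictured with a toy admission model. This is not the HBase client code: `select_puts`, the `(server, region, row)` tuples, and the rule that each admitted region counts as one task on its server are all simplifying assumptions made for illustration. Only the default values (100, 5, 1) come from the slide.

```python
from collections import Counter


def select_puts(buffered, region_tasks, server_tasks,
                max_total=100, max_per_server=5, max_per_region=1):
    """Split buffered puts into (to_send, kept) under total, per-server and per-region limits.

    buffered: list of (server, region, row) tuples (hypothetical shape).
    region_tasks / server_tasks: tasks already in flight, keyed by name.
    """
    region_tasks = Counter(region_tasks)
    server_tasks = Counter(server_tasks)
    total = sum(server_tasks.values())
    admitted = set()  # regions that received a new task in this pass
    to_send, kept = [], []
    for put in buffered:
        server, region, _row = put
        if region in admitted:
            # Rides along with the task already opened for this region.
            to_send.append(put)
        elif (total < max_total
              and server_tasks[server] < max_per_server
              and region_tasks[region] < max_per_region):
            admitted.add(region)
            region_tasks[region] += 1
            server_tasks[server] += 1
            total += 1
            to_send.append(put)
        else:
            # Region or server is busy: keep the put in the buffer for later.
            kept.append(put)
    return to_send, kept


buffered = [("rs1", "r1", "a"), ("rs1", "r1", "b"), ("rs1", "r2", "c"), ("rs2", "r3", "d")]
to_send, kept = select_puts(buffered, region_tasks={"r2": 1}, server_tasks={"rs1": 1})
print(to_send, kept)
```

In this model the put for the busy region `r2` stays in the buffer while the others go out, which is the sense in which the client is decoupled from a latency peak on one region.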
  • 20. Conclusion on write path • Single puts can be very fast • It’s not a « hard real time » system: there are peaks • Latency peaks can be hidden when streaming puts • Including autosplits
  • 21. And now for the read path
  • 22. HBase read path – high level [Diagram labels: RegionServer (HBase); DataNode (Hadoop DFS); HLog (WAL); BlockCache; HRegion; HStore; MemStore; StoreFile; HFile; steps numbered 1 to 5]
  • 23. Deeper in the read path • Get/short scan are assumed for low-latency operations • Again, two APIs • Single get: HTable#get(Get) • Multi-get: HTable#get(List<Get>) • Four stages, same as write path • Start (tcp connection, …) • Steady: when expected conditions are met • Machine failure: expected as well • Overloaded system: you may need to add machines or tune your workload
  • 24. Multi get / Client
  • 25. Multi get / Client Group Gets by RegionServer
  • 26. Multi get / Client Execute them one by one
  • 27. Multi get / Server
  • 28. Multi get / Server
  • 29. Access latency magnitudes • Storage hierarchy: a different view (Dean, 2009) • Memory is 100000x faster than disk! • Disk seek = 10ms
  • 30. Known unknowns • For each candidate HFile • Exclude by file metadata • Timestamp • Rowkey range • Exclude by bloom filter • StoreFileManager (0.96, HBASE-7678) StoreFileScanner# shouldUseScanner()
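The exclusion steps on this slide can be sketched as a filter over per-file metadata. The `HFileMeta` class and `should_use` function are hypothetical stand-ins, not the `StoreFileScanner#shouldUseScanner()` implementation, and a plain Python set plays the role of the bloom filter (a real bloom filter can return false positives).

```python
from dataclasses import dataclass, field


@dataclass
class HFileMeta:
    name: str
    first_key: str
    last_key: str
    min_ts: int
    max_ts: int
    bloom: set = field(default_factory=set)  # stand-in for a row bloom filter


def should_use(meta, row, ts_min, ts_max):
    """Return True only if the file may contain the row within the time range."""
    if not (meta.first_key <= row <= meta.last_key):
        return False  # excluded by rowkey range
    if meta.max_ts < ts_min or meta.min_ts > ts_max:
        return False  # excluded by timestamp range
    return row in meta.bloom  # excluded if the bloom filter says "absent"


files = [
    HFileMeta("f1", "a", "m", 100, 200, {"apple", "kiwi"}),
    HFileMeta("f2", "n", "z", 100, 200, {"pear"}),
    HFileMeta("f3", "a", "z", 10, 50, {"kiwi"}),
    HFileMeta("f4", "a", "z", 150, 300, {"apple"}),
]
candidates = [f.name for f in files if should_use(f, "kiwi", 100, 400)]
print(candidates)
```

Every file dropped here is a disk seek that never happens, which is why the cheap metadata checks run before any data block is read.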
  • 31. Unknown knowns • Merge sort results polled from Stores • Seek each scanner to a reference KeyValue • Retrieve candidate data from disk • Multiple HFiles => multiple seeks • hbase.storescanner.parallel.seek.enable=true • Short Circuit Reads • dfs.client.read.shortcircuit=true • Block locality • Happy clusters compact! HFileBlock# readBlockData()
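The merge of results polled from several sorted scanners can be sketched with a heap-based merge. This is a simplified model: each scanner is a list of `(row, timestamp, value)` tuples already sorted by row and then by newest timestamp first, and `merged_scan` (a hypothetical name) keeps only the newest version of each row.

```python
import heapq
from itertools import groupby


def merged_scan(*scanners):
    """Merge sorted scanners and yield the newest (row, ts, value) for each row."""
    # Order by row ascending, then timestamp descending, as the inputs are sorted.
    merged = heapq.merge(*scanners, key=lambda kv: (kv[0], -kv[1]))
    for _row, versions in groupby(merged, key=lambda kv: kv[0]):
        yield next(versions)  # first entry of each group is the newest version


memstore = [("a", 5, "a5"), ("c", 7, "c7")]
hfile1 = [("a", 3, "a3"), ("b", 2, "b2")]
hfile2 = [("b", 4, "b4"), ("c", 1, "c1")]

result = list(merged_scan(memstore, hfile1, hfile2))
print(result)
```

Each additional input to the merge corresponds to one more file to seek, which is why fewer HFiles after compaction translates into lower read latency.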
  • 32. Remembered knowns: BlockCache • Reuse previously read data • Smaller BLOCKSIZE => better utilization • TODO: compression (HBASE-8894) BlockCache#getBlock()
  • 33. BlockCache Showdown • LruBlockCache • Quite good most of the time • < 30 GB • BucketCache • Offheap alternative • > 30 GB http://www.n10k.com/blog/blockcache-showdown/
  • 34. Latency enemies: Compactions • Fewer HFiles => fewer seeks • Evict data blocks! • Evict Index blocks!! • hfile.block.index.cacheonwrite • Evict bloom blocks!!! • hfile.block.bloom.cacheonwrite • OS buffer cache to the rescue • Compacted data is still fresh • Better than going all the way back to disk
  • 35. Latency enemies: Garbage Collection • Use Heap. Not too much. With CMS. • Max heap: 30GB, probably less • Healthy cluster load • regular, reliable collections • 25-100ms pause on regular interval • Overloaded RegionServer suffers GC overmuch
  • 36. Off-heap to the rescue? • BucketCache (0.96, HBASE-7404) • Network interfaces (HBASE-9535) • MemStore et al (HBASE-10191)
  • 37. Failure • Machine failure • Detect + Reallocate + Replay • Strong consistency requires replay • Cache starts from scratch
  • 38. Read latency in summary • Steady mode • Cache hit: < 1 ms • Cache miss: + 10 ms per seek • Writing while reading: cache churn • GC: 25-100ms pause on regular interval Network request + (1 - P(cache hit)) * 10 ms • Same long tail issues as write • Overloaded: same scheduling issues as write • Partial failures hurt a lot
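The steady-mode estimate on this slide, network request + (1 - P(cache hit)) * 10 ms, can be written as a small function. The function name and the example inputs are illustrative; the 0.5 ms network figure and the 10 ms seek cost are the values quoted earlier in the talk.

```python
def expected_read_latency_ms(network_ms, cache_hit_ratio, seek_ms=10.0):
    """Expected latency: network cost plus the seek cost weighted by the miss probability."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return network_ms + (1.0 - cache_hit_ratio) * seek_ms


for ratio in (1.0, 0.9, 0.5):
    print(ratio, round(expected_read_latency_ms(0.5, ratio), 2))
```

Under this model a drop in hit ratio from 90% to 50% moves the expected read from about 1.5 ms to about 5.5 ms, which shows why cache churn from compactions and writes matters so much.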
  • 39. Hedging our bets • HDFS Hedged reads (since HDFS 2.4) • Strongly consistent • Works at the HDFS level • Timeline consistency (HBASE-10070) • Reads on secondary regions • If a region does not answer quickly enough, go to another one • Not strongly consistent • Helps read-path latency a lot.
  • 40. HBase ranges for 99% latency • Steady: Put milliseconds; Streamed Multiput milliseconds; Get milliseconds; Timeline get milliseconds • Failure: Put seconds; Streamed Multiput seconds; Get seconds; Timeline get milliseconds • GC: Put 10’s of milliseconds; Streamed Multiput milliseconds; Get 10’s of milliseconds; Timeline get milliseconds
  • 41. What’s next • Less GC • Use fewer objects • Offheap • Preferred location (HBASE-4755) • The « magical 1% » • Most tools stop at the 99% latency • YCSB for example • What happens after is much more complex • But key to improve average
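The idea shared by hedged reads and timeline-consistent gets, "if one replica does not answer quickly enough, ask another", can be sketched generically. This is not the HDFS or HBase implementation; `hedged_call`, the two replica functions, and the timings are hypothetical and only illustrate the pattern.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def hedged_call(primary, backup, hedge_after_s):
    """Call primary; if it has not answered after hedge_after_s, also call backup.

    Returns the result of whichever call finishes first.
    """
    pool = ThreadPoolExecutor(max_workers=2)
    try:
        first = pool.submit(primary)
        done, _ = wait([first], timeout=hedge_after_s)
        if done:
            return first.result()
        second = pool.submit(backup)
        done, _ = wait([first, second], return_when=FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        pool.shutdown(wait=False)  # do not block on the slower call


def slow_replica():
    time.sleep(0.3)  # simulates a replica stuck in a GC pause
    return "primary"


def fast_replica():
    return "backup"


print(hedged_call(slow_replica, fast_replica, hedge_after_s=0.05))
```

The trade-off is extra load: every hedged request that fires doubles the work for that read, so the threshold is usually set near a high percentile of normal latency.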
  • 42. Thanks! Nick Dimiduk, Hortonworks (@xefyr) Nicolas Liochon, Scaled Risk (@nkeywal) HBaseCon May 5, 2014

Editor's Notes

  1. Recap – refresher on the read path: request received; scanners created over memstore and store files; data blocks read from cache or disk as appropriate; results merged by the region; response sent back to client.
  2. Goal: avoid disk at all cost!
  3. Goal: don’t go to disk unless absolutely necessary. Tactic: candidate HFile elimination. Regular compactions => only 3 files to seek. Alternative: StoreFileManager cleverness.
  4. Necessary for fewer HFiles and fewer seeks. IO resource contention. Buffer cache to the rescue.