HBase: Extreme makeover
Vladimir Rodionov
Hadoop/HBase architect
Founder of BigBase.org
HBaseCon 2014
Features & Internal Track
Agenda
About myself
• Principal Platform Engineer @Carrier IQ, Sunnyvale, CA
• Prior to Carrier IQ, I worked at GE, eBay, and Plumtree/BEA.
• HBase user since 2009.
• HBase hacker since 2013.
• Areas of expertise include (but are not limited to) Java,
HBase, Hadoop, Hive, large-scale OLAP/analytics, and
in-memory data processing.
• Founder of BigBase.org
What?
BigBase = EM(HBase)
EM(*) = ?
Seriously?
It’s a Multi-Level Caching solution for HBase
Real Agenda
• Why BigBase?
• Brief history of BigBase.org project
• BigBase MLC high level architecture (L1/L2/L3)
• Level 1 - Row Cache.
• Level 2/3 - Block Cache RAM/SSD.
• YCSB benchmark results
• Upcoming features in R1.5, 2.0, 3.0.
• Q&A
HBase
• Still lacks some of the original BigTable features.
• Still cannot use all available RAM efficiently.
• No good mixed storage (SSD/HDD) support.
• Single-level caching only. Simple.
• HBase + Large JVM Heap (MemStore) = ?
BigBase
• Adds Row Cache and block cache compression.
• Uses all available RAM efficiently (TBs).
• Supports mixed storage (SSD/HDD).
• Has Multi Level Caching. Not that simple.
• Will move MemStore off heap in R2.
BigBase History
Koda (2010)
• Koda: a Java off-heap object cache, similar to
Terracotta’s BigMemory.
• Delivers 4x more transactions …
• 10x better latencies than BigMemory 4.
• Compression (Snappy, LZ4, LZ4HC, Deflate).
• Disk persistence and periodic cache snapshots.
• Tested up to 240GB.
Karma (2011-12)
• Karma: a Java off-heap BTree implementation to
support fast in-memory queries.
• Supports extra-large heaps: hundreds of millions to
billions of objects.
• Stores 300M objects in less than 10GB of RAM.
• Block Compression.
• Tested up to 240GB.
• Off Heap MemStore in R2.
Yamm (2013)
• Yet Another Memory Manager.
– Pure 100% Java memory allocator.
– Replaced jemalloc in Koda.
– Now Koda is 100% Java.
– Karma is next (still on jemalloc).
– Similar to memcached slab allocator.
• BigBase project started (Summer 2013).
BigBase Architecture
MLC – Multi-Level Caching
[Diagram, built up over several slides: cache levels per release. Labels: JVM, RAM, Disk.]
HBase 0.94
• LRUBlockCache
HBase 0.96
• Bucket cache
• One level of caching: RAM (L2) or DISK (L3)
BigBase 1.0
• Row Cache L1
• Block Cache L2
• Block Cache L3: SSD, Network, memcached, or DynamoDB
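A minimal sketch of how a lookup across the three levels might behave. It assumes a read probes L1, then L2, then L3 before going to disk, and that a hit in a slower level is promoted into the faster levels above it; the slides do not state the promotion policy, so that part is an assumption. The class and method names are hypothetical and are not BigBase APIs, plain dicts stand in for the real caches, and the sketch ignores the difference between cached rows and cached blocks.

```python
from typing import Callable, Dict, List, Tuple


class MultiLevelCache:
    """Toy multi-level cache: levels are probed in order, fastest first."""

    def __init__(self, level_names: List[str], load_from_disk: Callable[[str], bytes]):
        # One dict per level; real levels would be off-heap RAM, SSD, and so on.
        self.levels: List[Tuple[str, Dict[str, bytes]]] = [(n, {}) for n in level_names]
        self.load_from_disk = load_from_disk

    def get(self, key: str) -> Tuple[bytes, str]:
        """Return (value, source), where source names the level that served the read."""
        for i, (name, store) in enumerate(self.levels):
            if key in store:
                value = store[key]
                # Assumption: promote a hit into every faster level above it.
                for _, upper in self.levels[:i]:
                    upper[key] = value
                return value, name
        # Miss in every level: read from disk and populate all levels.
        value = self.load_from_disk(key)
        for _, store in self.levels:
            store[key] = value
        return value, "disk"


if __name__ == "__main__":
    disk = {"row1": b"v1"}
    cache = MultiLevelCache(["L1", "L2", "L3"], disk.__getitem__)
    print(cache.get("row1"))  # first read is served from disk
    print(cache.get("row1"))  # second read is served from L1
```

Populating every level on a miss is the simplest policy to show; a real implementation would more likely fill slower levels on eviction from faster ones.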
BigBase Row Cache (L1)
Where is BigTable’s Scan Cache?
• Scan Cache caches hot row data.
• Complementary to Block Cache.
• Still missing in HBase (as of 0.98).
• It’s very hard to implement in Java (off heap).
• Max GC pause is ~0.5-2 sec per 1GB of heap.
• G1 GC in Java 7 does not resolve the problem.
• We call it Row Cache in BigBase.
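To see why an on-heap cache of this size is impractical, the rule of thumb above can be extrapolated to larger heaps. This is only a rough sketch: it assumes the pause grows linearly with heap size, which is how the slide states it, and the function name is made up for illustration.

```python
from typing import Tuple


def gc_pause_range_sec(heap_gb: float,
                       low_per_gb: float = 0.5,
                       high_per_gb: float = 2.0) -> Tuple[float, float]:
    """Linear extrapolation of the 0.5-2 sec per 1GB rule of thumb."""
    return heap_gb * low_per_gb, heap_gb * high_per_gb


if __name__ == "__main__":
    for heap_gb in (4, 16, 100):
        low, high = gc_pause_range_sec(heap_gb)
        print(f"{heap_gb} GB heap: max pause roughly {low:.0f}-{high:.0f} sec")
```

Under that assumption a 100GB heap would pause for minutes, which is the motivation for keeping the cache off heap.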
Row Cache vs. Block Cache
[Diagram, built up over several slides: a sequence of five HFile blocks, with BLOCK CACHE and ROW CACHE regions marked.]
BigBase Row Cache
• Off Heap Scan Cache for HBase.
• Cache size: 100s of GBs to TBs.
• Eviction policies: LRU, LFU, FIFO, Random.
• Pure, 100% compatible Java.
• Sub-millisecond latencies, zero GC.
• Implemented as RegionObserver
coprocessor.
[Diagram: Row Cache component stack. Labels: Row Cache, YAMM, Codecs, Kryo SerDe, KODA.]
BigBase Row Cache
• Read-through cache.
• Caches rowkey:CF.
• Invalidates the key on every mutation.
• Can be enabled/disabled per table
and per table:CF.
• New ROWCACHE attribute.
• Best for small rows (< block size).
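The behaviour described above (read-through, keyed by rowkey:CF, invalidated on every mutation) can be sketched as follows. This is an illustration only, not the BigBase coprocessor: the class and its methods are hypothetical, a dict stands in for the HBase table, and there is no eviction or off-heap storage.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str]            # (rowkey, column family)
Cells = Dict[str, bytes]         # qualifier -> value


class ReadThroughRowCache:
    """Toy read-through cache keyed by (rowkey, column family)."""

    def __init__(self, store: Dict[Key, Cells]):
        self.store = store       # stands in for the HBase table
        self.cache: Dict[Key, Cells] = {}
        self.hits = 0
        self.misses = 0

    def get(self, row: str, family: str) -> Optional[Cells]:
        key = (row, family)
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Miss: read through to the store and remember the result.
        self.misses += 1
        cells = self.store.get(key)
        if cells is not None:
            self.cache[key] = dict(cells)
        return cells

    def put(self, row: str, family: str, qualifier: str, value: bytes) -> None:
        # Apply the mutation to the store, then invalidate the cached key.
        self.store.setdefault((row, family), {})[qualifier] = value
        self.cache.pop((row, family), None)


if __name__ == "__main__":
    table = {("r1", "cf"): {"q": b"a"}}
    rc = ReadThroughRowCache(table)
    rc.get("r1", "cf")           # miss, loads from the store
    rc.get("r1", "cf")           # hit
    rc.put("r1", "cf", "q", b"b")
    print(rc.get("r1", "cf"), rc.hits, rc.misses)
```

Invalidating rather than updating on a mutation keeps the write path cheap, at the cost of one extra miss on the next read of that key.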
Performance-Scalability
• GET (small rows < 100 bytes): 175K operations per second
per Region Server (from cache).
• MULTI-GET (small rows < 100 bytes): > 1M records per
second (network limited) per Region Server.
• Latency: 99% < 1 ms (for GETs) at 100K ops/sec.
• Vertical scalability: tested up to 240GB (the maximum
available in Amazon EC2).
• Horizontal scalability: limited by HBase scalability.
• No more memcached farms in front of HBase clusters.
BigBase Block Cache (L2, L3)
What is wrong with Bucket Cache?
Scalability | LIMITED
Multi-Level Caching (MLC) | NOT SUPPORTED
Persistence (‘offheap’ mode) | NOT SUPPORTED
Low latency apps | NOT SUPPORTED
SSD friendliness (‘file’ mode) | NOT FRIENDLY
Compression | NOT SUPPORTED
Here comes BigBase
Scalability | HIGH
Multi-Level Caching (MLC) | SUPPORTED
Persistence (‘offheap’ mode) | SUPPORTED
Low latency apps | SUPPORTED
SSD friendliness (‘file’ mode) | SSD-FRIENDLY
Compression | SNAPPY, LZ4, LZ4HC, DEFLATE
Wait, there is more …
Non-disk-based L3 cache | SUPPORTED
RAM Cache optimization | IBCO
BigBase 1.0 vs. HBase 0.98
Feature | BigBase | HBase 0.98
Row Cache (L1) | YES | NO
Block Cache RAM (L2) | YES (fully off heap) | YES (partially off heap)
Block Cache (L3) DISK | YES (SSD-friendly) | YES (not SSD-friendly)
Block Cache (L3) NON DISK | YES | NO
Compression | YES | NO
RAM Cache persistence | YES (both L1 and L2) | NO
Low Latency optimized | YES | NO
MLC support | YES (L1, L2, L3) | NO (either L2 or L3)
Scalability | HIGH | MEDIUM (limited by JVM heap)
YCSB Benchmark
Test setup (AWS)
• Common – Whirr 0.8.2; 1 (Master + ZK) + 5 RS; m1.xlarge: 15GB RAM, 4 vCPU, 4x420GB HDD.
• BigBase 1.0 (0.94.15) – RS: 4GB heap (6GB off-heap cache); Master: 4GB heap.
• HBase 0.96.2 – RS: 4GB heap (6GB Bucket Cache, off heap); Master: 4GB heap.
• HBase 0.94.15 – RS: 11.5GB heap (6GB LruBlockCache, on heap); Master: 4GB heap.
• Clients: 5 (30 threads each), collocated with Region Servers.
• Data sets: 100M and 200M records (approximately 120GB and 240GB). Only 25% fits in a cache.
• Workloads: 100% read (read100, read200, hotspot100) and 100% scan (scan100, scan200) – zipfian.
• YCSB 0.1.4 (modified to generate compressible data). Compressible data (factor of 2.5x) was
generated only for the scan workloads, to evaluate the effect of compression in the BigBase block
cache implementation.
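The "only 25% fits in a cache" figure can be reproduced from the setup: 5 Region Servers with 6GB of block cache each against the roughly 120GB data set. The sketch below does that arithmetic. The `compression` parameter is an assumption added for illustration: it treats the 2.5x compressibility of the scan data as if it multiplied the effective cache capacity uniformly, which the slides do not claim.

```python
def cache_fit_fraction(servers: int,
                       cache_gb_per_server: float,
                       data_set_gb: float,
                       compression: float = 1.0) -> float:
    """Fraction of the data set that fits in the cluster-wide cache (capped at 1.0)."""
    effective_cache_gb = servers * cache_gb_per_server * compression
    return min(1.0, effective_cache_gb / data_set_gb)


if __name__ == "__main__":
    print(cache_fit_fraction(5, 6, 120))                   # 100M data set, uncompressed
    print(cache_fit_fraction(5, 6, 240))                   # 200M data set, uncompressed
    print(cache_fit_fraction(5, 6, 120, compression=2.5))  # assumed effect of 2.5x compression
```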
Benchmark results (RPS)
Workload | BigBase R1.0 | HBase 0.96.2 | HBase 0.94.15
read100 | 11405 | 6123 | 5553
read200 | 6265 | 4086 | 3850
hotspot100 | 15150 | 3512 | 2855
scan100 | 3224 | 1500 | 709
scan200 | 820 | 434 | 228
Average latency (ms)
Workload | BigBase R1.0 | HBase 0.96.2 | HBase 0.94.15
read100 | 13 | 24 | 27
read200 | 23 | 36 | 39
hotspot100 | 10 | 44 | 52
scan100 | 48 | 102 | 223
scan200 | 187 | 375 | 700
95% latency (ms)
Workload | BigBase R1.0 | HBase 0.96.2 | HBase 0.94.15
read100 | 51 | 91 | 100
read200 | 88 | 124 | 138
hotspot100 | 38 | 152 | 197
scan100 | 175 | 405 | 950
scan200 | 729 | (no label) | (no label)
99% latency (ms)
Workload | BigBase R1.0 | HBase 0.96.2 | HBase 0.94.15
read100 | 133 | 190 | 213
read200 | 225 | 304 | 338
hotspot100 | 111 | 554 | 632
scan100 | 367 | 811 | (no label)
scan200 | (no label) | (no label) | (no label)
YCSB 100% Read
Throughput per server (RPS)
Data set | BigBase R1.0 | HBase 0.94.15 | Speedup | Fits in cache
50M | 3621 | 1308 | 2.77X | 40%
100M | 2281 | 1111 | 2.05X | 20%
200M | 1253 | 770 | 1.63X | 10%
• What is the maximum?
• ~75X (hotspot 2.5/100)
• 56K (BigBase) vs. 750 (HBase)
• 100% in cache
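The speedup figures on this slide are simply the ratio of the per-server throughput numbers. A short sketch that recomputes them from the values shown on the chart (the dictionary and function names are just for illustration):

```python
# Per-server RPS from the chart: data set -> (BigBase R1.0, HBase 0.94.15)
PER_SERVER_RPS = {
    "50M": (3621, 1308),
    "100M": (2281, 1111),
    "200M": (1253, 770),
}


def speedup(bigbase_rps: float, hbase_rps: float) -> float:
    """Throughput ratio of BigBase over HBase."""
    return bigbase_rps / hbase_rps


if __name__ == "__main__":
    for data_set, (bigbase, hbase) in PER_SERVER_RPS.items():
        print(f"{data_set}: {speedup(bigbase, hbase):.2f}X")
    # Hotspot case with all data in cache: 56K vs. 750
    print(f"hotspot: ~{speedup(56_000, 750):.0f}X")
```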
All data in cache
• Setup: BigBase 1.0, 48GB RAM, (8/16) CPU cores, 5 nodes (1 + 4)
• Data set: 200M (300GB)
• Test: 100% read, hotspot (2.5/100, 200M data)
• YCSB 0.1.4, 4 clients
Latency (ms) by throughput
Threads | Throughput | avg | 95% | 99%
40 | 100,000 | 0.4 | 1 | 1
100 | 168,000 | 0.6 | 1 | 2
200 | 224,000 | 0.9 | 2 | 3
400 | 262,000 | 1.5 | 3 | 7
100K ops: 99% < 1ms
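The throughput and average-latency figures on this slide are consistent with Little's law (throughput = concurrency / latency). The check below is a sketch under two assumptions that the slides do not state: that the thread counts are the total number of concurrent client threads, and that each thread issues one request at a time, as a closed-loop YCSB client does.

```python
def implied_throughput(threads: int, avg_latency_ms: float) -> float:
    """Little's law: requests per second sustained by `threads` closed-loop clients."""
    return threads * 1000.0 / avg_latency_ms


# (threads, average latency in ms, reported throughput) from the slide
REPORTED = [
    (40, 0.4, 100_000),
    (100, 0.6, 168_000),
    (200, 0.9, 224_000),
    (400, 1.5, 262_000),
]

if __name__ == "__main__":
    for threads, latency_ms, reported in REPORTED:
        implied = implied_throughput(threads, latency_ms)
        print(f"{threads} threads: implied {implied:,.0f}, reported {reported:,}")
```

The implied values land within a few percent of the reported ones, which suggests the added threads are buying throughput mainly at the cost of higher average latency.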
What is next?
• Release 1.1 (2014 Q2)
– Support for HBase 0.96, 0.98, and trunk.
– Fully tested L3 cache (SSD).
• Release 1.5 (2014 Q3)
– YAMM: memory allocator compacting mode.
– Integration with Hadoop metrics.
– Row Cache: merge rows on update (good for counters).
– Block Cache: new eviction policy (LRU-2Q).
– File reads: posix_fadvise (bypass OS page cache).
– Row Cache: make it available to server-side apps.
What is next?
• Release 2.0 (2014 Q3)
– HBASE-5263: preserve cache data on compaction.
– Cache data blocks on MemStore flush (configurable).
– HBASE-10648: pluggable MemStore. Off-heap
implementation based on Karma (off-heap BTree library).
• Release 3.0 (2014 Q4)
– Real Scan Cache: caches results of Scan operations on
immutable store files.
– Scan Cache integration with Phoenix and other third-party
libraries that provide a rich query API for HBase.
Download/Install/Uninstall
• Download BigBase 1.0 from www.bigbase.org
• Installation/upgrade takes 10-20 minutes.
• The beautification operator EM(*) is invertible:
HBase = EM⁻¹(BigBase) (the same 10-20 minutes).
Q & A

Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 

Recently uploaded (20)

chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 

HBase: Extreme makeover - BigBase multi-level caching architecture and performance

  • 1. HBase: Extreme makeover Vladimir Rodionov Hadoop/HBase architect Founder of BigBase.org HBaseCon 2014 Features & Internal Track
  • 3. About myself • Principal Platform Engineer @Carrier IQ, Sunnyvale, CA • Prior to Carrier IQ, I worked @ GE, eBay, Plumtree/BEA. • HBase user since 2009. • HBase hacker since 2013. • Areas of expertise include (but are not limited to) Java, HBase, Hadoop, Hive, large-scale OLAP/Analytics, and in-memory data processing. • Founder of BigBase.org
  • 10. BigBase = EM(HBase) EM(*) = Seriously? for HBase It’s a Multi-Level Caching solution
  • 11. Real Agenda • Why BigBase? • Brief history of BigBase.org project • BigBase MLC high level architecture (L1/L2/L3) • Level 1 - Row Cache. • Level 2/3 - Block Cache RAM/SSD. • YCSB benchmark results • Upcoming features in R1.5, 2.0, 3.0. • Q&A
  • 13. HBase • Still lacks some of the original BigTable’s features. • Still unable to use all RAM efficiently. • No good mixed storage (SSD/HDD) support. • Single Level Caching only. Simple. • HBase + Large JVM Heap (MemStore) = ?
  • 14. BigBase • Adds Row Cache and block cache compression. • Uses all RAM efficiently (TBs). • Supports mixed storage (SSD/HDD). • Has Multi Level Caching. Not that simple. • Will move MemStore off heap in R2.
  • 16. Koda (2010) • Koda - Java off heap object cache, similar to Terracotta’s BigMemory. • Delivers 4x more transactions … • 10x better latencies than BigMemory 4. • Compression (Snappy, LZ4, LZ4HC, Deflate). • Disk persistence and periodic cache snapshots. • Tested up to 240GB.
  • 17. Karma (2011-12) • Karma - Java off heap BTree implementation to support fast in-memory queries. • Supports extra large heaps, hundreds of millions to billions of objects. • Stores 300M objects in less than 10G of RAM. • Block Compression. • Tested up to 240GB. • Off Heap MemStore in R2.
  • 18. Yamm (2013) • Yet Another Memory Manager. – Pure 100% Java memory allocator. – Replaced jemalloc in Koda. – Now Koda is 100% Java. – Karma is the next (still on jemalloc). – Similar to memcached slab allocator. • BigBase project started (Summer 2013).
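The slide describes Yamm only as "similar to memcached slab allocator". As a rough illustration of that general technique (not Yamm's actual design), the sketch below carves a preallocated arena into fixed-size slabs, splits each slab into equal chunks of one size class, and serves requests from per-class free lists. The class name `SlabAllocator`, the slab size, the minimum chunk size, and the growth factor are all hypothetical values chosen for the example.

```python
class SlabAllocator:
    """Toy slab allocator over one preallocated bytearray (illustrative only)."""

    def __init__(self, arena_size, slab_size=1024, min_chunk=64, factor=2.0):
        self.arena = bytearray(arena_size)  # stands in for off-heap memory
        self.slab_size = slab_size
        self.next_slab = 0  # offset of the next unused slab in the arena
        # Chunk sizes grow geometrically up to one full slab (assumed policy).
        self.classes = []
        size = min_chunk
        while size < slab_size:
            self.classes.append(size)
            size = int(size * factor)
        self.classes.append(slab_size)
        # One free list of chunk offsets per size class.
        self.free = {c: [] for c in self.classes}

    def _class_for(self, nbytes):
        # Smallest size class that can hold the request.
        for c in self.classes:
            if nbytes <= c:
                return c
        raise ValueError("request larger than a slab")

    def alloc(self, nbytes):
        """Return (offset, chunk_size) for a chunk that fits nbytes."""
        c = self._class_for(nbytes)
        if not self.free[c]:
            # No free chunk of this class: carve a fresh slab into chunks.
            if self.next_slab + self.slab_size > len(self.arena):
                raise MemoryError("arena exhausted")
            base = self.next_slab
            self.next_slab += self.slab_size
            count = self.slab_size // c
            self.free[c].extend(base + i * c for i in reversed(range(count)))
        return self.free[c].pop(), c

    def release(self, offset, chunk_size):
        """Return a chunk to the free list of its size class."""
        self.free[chunk_size].append(offset)
```

Because every chunk in a class has the same size, a released chunk can be reused by any later request of that class, which avoids external fragmentation at the cost of some wasted space inside each chunk.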
  • 20-26. MLC – Multi-Level Caching (diagram build-up). HBase 0.94: LRUBlockCache (JVM RAM) over disk. HBase 0.96: Bucket cache (JVM RAM) over disk; one level of caching: RAM (L2) or DISK (L3). BigBase 1.0: Row Cache L1 and Block Cache L2 (JVM RAM), with Block Cache L3 on SSD, network, memcached, or DynamoDB.
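To make the L1/L2/L3 layering concrete, here is a minimal sketch of a lookup that falls through a row cache, a RAM block cache, and a third-level block cache before touching disk. It is an illustration under assumptions, not BigBase code: the class `MultiLevelCache`, its fields, and the promotion policy (a block read from disk is placed in both L3 and L2) are hypothetical, and capacity limits and eviction are left out.

```python
class MultiLevelCache:
    """Toy three-level read path: row cache -> RAM blocks -> L3 blocks -> disk."""

    def __init__(self, disk_blocks, block_of):
        self.disk = disk_blocks   # block_id -> {row_key: value}; stands in for HFiles
        self.block_of = block_of  # row_key -> block_id; stands in for the block index
        self.l1_rows = {}         # L1: row cache
        self.l2_blocks = {}       # L2: block cache in RAM
        self.l3_blocks = {}       # L3: block cache on SSD or an external store
        self.hits = {"L1": 0, "L2": 0, "L3": 0, "disk": 0}

    def _load_block(self, block_id):
        if block_id in self.l2_blocks:
            self.hits["L2"] += 1
            return self.l2_blocks[block_id]
        if block_id in self.l3_blocks:
            self.hits["L3"] += 1
            block = self.l3_blocks[block_id]
        else:
            self.hits["disk"] += 1
            block = self.disk[block_id]
            self.l3_blocks[block_id] = block  # assumed policy: fill L3 on disk read
        self.l2_blocks[block_id] = block      # promote to RAM
        return block

    def get(self, row_key):
        if row_key in self.l1_rows:
            self.hits["L1"] += 1
            return self.l1_rows[row_key]
        value = self._load_block(self.block_of[row_key])[row_key]
        self.l1_rows[row_key] = value         # read-through fill of the row cache
        return value

    def evict_l2(self, block_id):
        """Drop a block from RAM; it may still be served from L3."""
        self.l2_blocks.pop(block_id, None)
```

The point of the extra levels is visible in the hit counters: a row that misses L1 can still be answered from a cached block, and a block evicted from RAM can still be answered without a disk read.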
  • 28. Where is BigTable’s Scan Cache? • Scan Cache caches hot rows data. • Complementary to Block Cache. • Still missing in HBase (as of 0.98). • It’s very hard to implement in Java (off heap). • Max GC pause is ~ 0.5-2 sec per 1GB of heap • G1 GC in Java 7 does not resolve the problem. • We call it Row Cache in BigBase.
  • 29-33. Row Cache vs. Block Cache (figure sequence; labels: HFile Block, ROW CACHE, BLOCK CACHE)
  • 34. BigBase Row Cache • Off Heap Scan Cache for HBase. • Cache size: 100’s of GBs to TBs. • Eviction policies: LRU, LFU, FIFO, Random. • Pure 100% - compatible Java. • Sub-millisecond latencies, zero GC. • Implemented as RegionObserver coprocessor. Row Cache YAMM Codecs Kryo SerDe KODA
  • 35. BigBase Row Cache • Read through cache. • It caches rowkey:CF. • Invalidates key on every mutation. • Can be enabled/disabled per table and per table:CF. • New ROWCACHE attribute. • Best for small rows (< block size) Row Cache YAMM Codecs Kryo SerDe KODA
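The behaviour listed on this slide (read-through, keyed by rowkey:CF, invalidated on every mutation, switchable per column family) can be sketched in a few lines. This is a simplified model, not the BigBase coprocessor: the class `RowCache`, the `enabled_families` set standing in for the ROWCACHE attribute, the dict-backed `store`, and the choice of LRU eviction are all assumptions made for the example.

```python
from collections import OrderedDict


class RowCache:
    """Toy read-through cache keyed by (row_key, family) with LRU eviction."""

    def __init__(self, store, capacity, enabled_families):
        self.store = store                # (row_key, family) -> {qualifier: value}
        self.capacity = capacity          # max number of cached entries
        self.enabled = set(enabled_families)  # families with caching switched on
        self.cache = OrderedDict()        # least recently used entry first

    def get(self, row_key, family):
        key = (row_key, family)
        if family not in self.enabled:
            return self.store.get(key)    # caching disabled: always read the store
        if key in self.cache:
            self.cache.move_to_end(key)   # mark as most recently used
            return dict(self.cache[key])
        value = self.store.get(key)       # miss: read through to the store
        if value is not None:
            self.cache[key] = dict(value) # cache a copy of the row's family
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)  # evict the LRU entry
        return value

    def put(self, row_key, family, columns):
        key = (row_key, family)
        self.store.setdefault(key, {}).update(columns)
        self.cache.pop(key, None)         # invalidate on every mutation
```

Invalidating instead of updating keeps the write path cheap: the next read repopulates the entry from the store, so the cache never has to merge partial updates.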
  • 36. Performance-Scalability • GET (small rows < 100 bytes): 175K operations per sec per one Region Server (from cache). • MULTI-GET (small rows < 100 bytes): > 1M records per second (network limited) per one Region Server. • LATENCY : 99% < 1ms (for GETs) with 100K ops. • Vertical scalability: tested up to 240GB (the maximum available in Amazon EC2). • Horizontal scalability: limited by HBase scalability. • No more memcached farms in front of HBase clusters.
  • 38. What is wrong with Bucket Cache? • Scalability: LIMITED • Multi-Level Caching (MLC): NOT SUPPORTED • Persistence (‘offheap’ mode): NOT SUPPORTED • Low latency apps: NOT SUPPORTED • SSD friendliness (‘file’ mode): NOT FRIENDLY • Compression: NOT SUPPORTED
  • 44. Here comes BigBase • Scalability: HIGH • Multi-Level Caching (MLC): SUPPORTED • Persistence (‘offheap’ mode): SUPPORTED • Low latency apps: SUPPORTED • SSD friendliness (‘file’ mode): SSD-FRIENDLY • Compression: SNAPPY, LZ4, LZ4HC, DEFLATE
  • 50. Wait, there is more … • Scalability: HIGH • Multi-Level Caching (MLC): SUPPORTED • Persistence (‘offheap’ mode): SUPPORTED • Low latency apps: SUPPORTED • SSD friendliness (‘file’ mode): SSD-FRIENDLY • Compression: SNAPPY, LZ4, LZ4HC, DEFLATE • Non disk-based L3 cache: SUPPORTED • RAM Cache optimization: IBCO
  • 52. BigBase 1.0 vs. HBase 0.98 (BigBase / HBase 0.98) • Row Cache (L1): YES / NO • Block Cache RAM (L2): YES (fully off heap) / YES (partially off heap) • Block Cache (L3) DISK: YES (SSD-friendly) / YES (not SSD-friendly) • Block Cache (L3) NON DISK: YES / NO • Compression: YES / NO • RAM Cache persistence: YES (both L1 and L2) / NO • Low Latency optimized: YES / NO • MLC support: YES (L1, L2, L3) / NO (either L2 or L3) • Scalability: HIGH / MEDIUM (limited by JVM heap)
  • 54. Test setup (AWS) • HBase 0.94.15 – RS: 11.5GB heap (6GB LruBlockCache on heap); Master: 4GB heap. • Clients: 5 (30 threads each), collocated with Region Servers. • Data sets: 100M and 200M. 120GB / 240GB approximately. Only 25% fits in a cache. • Workloads: 100% read (read100, read200, hotspot100), 100% scan (scan100, scan200) –zipfian. • YCSB 0.1.4 (modified to generate compressible data). We generated compressible data (with factor of 2.5x) only for scan workloads to evaluate effect of compression in BigBase block cache implementation. • Common – Whirr 0.8.2; 1 (Master + Zk) + 5 RS; m1.xlarge: 15GB RAM, 4 vCPU, 4x420 HDD • BigBase 1.0 (0.94.15) – RS: 4GB heap (6GB off heap cache); Master: 4GB heap. • HBase 0.96.2 – RS: 4GB heap (6GB Bucket Cache off heap); Master: 4GB heap.
  • 56. Benchmark results (RPS), BigBase R1.0 / HBase 0.96.2 / HBase 0.94.15 • read100: 11405 / 6123 / 5553 • read200: 6265 / 4086 / 3850 • hotspot100: 15150 / 3512 / 2855 • scan100: 3224 / 1500 / 709 • scan200: 820 / 434 / 228
  • 57. Average latency (ms), BigBase R1.0 / HBase 0.96.2 / HBase 0.94.15 • read100: 13 / 24 / 27 • read200: 23 / 36 / 39 • hotspot100: 10 / 44 / 52 • scan100: 48 / 102 / 223 • scan200: 187 / 375 / 700
  • 58. 95% latency (ms), BigBase R1.0 / HBase 0.96.2 / HBase 0.94.15 • read100: 51 / 91 / 100 • read200: 88 / 124 / 138 • hotspot100: 38 / 152 / 197 • scan100: 175 / 405 / 950 • scan200: 729 / (no value in source) / (no value in source)
  • 59. 99% latency (ms), BigBase R1.0 / HBase 0.96.2 / HBase 0.94.15 • read100: 133 / 190 / 213 • read200: 225 / 304 / 338 • hotspot100: 111 / 554 / 632 • scan100: 367 / 811 / (no value in source) • scan200: (no values in source)
  • 61. YCSB 100% Read, per-server throughput (BigBase R1.0 vs. HBase 0.94.15) • 50M: 3621 vs. 1308 = 2.77X • 100M: 2281 vs. 1111 = 2.05X • 200M: 1253 vs. 770 = 1.63X • 50M = 40% fits cache • 100M = 20% fits cache • 200M = 10% fits cache • What is the maximum? • ~ 75X (hotspot 2.5/100) • 56K (BB) vs. 750 (HBase) • 100% in cache
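The speedup factors on this slide follow directly from the per-server throughput numbers. The short calculation below reproduces them; the variable names are chosen for the example, and the figures are the ones quoted on the slide.

```python
# Per-server throughput from the slide: (BigBase R1.0, HBase 0.94.15).
per_server_rps = {
    "50M": (3621, 1308),
    "100M": (2281, 1111),
    "200M": (1253, 770),
}

# Speedup = BigBase throughput divided by HBase throughput.
speedup = {
    dataset: round(bigbase / hbase, 2)
    for dataset, (bigbase, hbase) in per_server_rps.items()
}

# Fully cached hotspot case quoted on the slide: 56K (BigBase) vs. 750 (HBase).
hotspot_speedup = round(56_000 / 750)

for dataset, factor in speedup.items():
    print(f"{dataset}: {factor}X")
print(f"hotspot, 100% in cache: ~{hotspot_speedup}X")
```

The trend matches the cache-fit percentages on the slide: the smaller the share of the data set that fits in cache, the smaller the advantage.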
  • 63. All data in cache • Setup: BigBase 1.0, 48G RAM, (8/16) CPU cores – 5 nodes (1 + 4) • Data set: 200M (300GB) • Test: Read 100%, hotspot (2.5/100 – 200M data) • YCSB 0.1.4 – 4 clients • Throughput and latency in ms (avg / 95% / 99%): 40 threads – 100K: 0.4 / 1 / 1; 100 threads – 168K: 0.6 / 1 / 2; 200 threads – 224K: 0.9 / 2 / 3; 400 threads – 262K: 1.5 / 3 / 7 • 100K ops: 99% < 1ms
  • 64. What is next? • Release 1.1 (2014 Q2) – Support HBase 0.96, 0.98, trunk – Fully tested L3 cache (SSD) • Release 1.5 (2014 Q3) – YAMM: memory allocator compacting mode. – Integration with Hadoop metrics. – Row Cache: merge rows on update (good for counters). – Block Cache: new eviction policy (LRU-2Q). – File read posix_fadvise (bypass OS page cache). – Row Cache: make it available for server-side apps.
  • 65. What is next? • Release 2.0 (2014 Q3) – HBASE-5263: Preserving cache data on compaction – Cache data blocks on memstore flush (configurable). – HBASE-10648: Pluggable Memstore. Off heap implementation, based on Karma (off heap BTree lib). • Release 3.0 (2014 Q4) – Real Scan Cache – caches results of Scan operations on immutable store files. – Scan Cache integration with Phoenix and with other 3rd party libs that provide a rich query API for HBase.
  • 66. Download/Install/Uninstall • Download BigBase 1.0 from www.bigbase.org • Installation/upgrade takes 10-20 minutes • Beautification operator EM(*) is invertible: HBase = EM⁻¹(BigBase) (the same 10-20 min)
  • 67. Q & A Vladimir Rodionov Hadoop/HBase architect Founder of BigBase.org HBase: Extreme makeover Features & Internal Track