HBase: Extreme Makeover

Vladimir Rodionov
Hadoop/HBase architect
Founder of BigBase.org

HBaseCon 2014
Features & Internals Track
Agenda

About myself
•  Principal Platform Engineer @ Carrier IQ, Sunnyvale, CA
•  Prior to Carrier IQ, I worked @ GE, eBay, Plumtree/BEA.
•  HBase user since 2009.
•  HBase hacker since 2013.
•  Areas of expertise include (but not limited to) Java, HBase, Hadoop, Hive, large-scale OLAP/analytics, and in-memory data processing.
•  Founder of BigBase.org
What?

BigBase = EM(HBase)
EM(*) = ?
[EM = Extreme Makeover]

Seriously?

It's a Multi-Level Caching solution for HBase.
Real Agenda
•  Why BigBase?
•  Brief history of the BigBase.org project
•  BigBase MLC high-level architecture (L1/L2/L3)
•  Level 1 – Row Cache
•  Level 2/3 – Block Cache RAM/SSD
•  YCSB benchmark results
•  Upcoming features in R1.5, 2.0, 3.0
•  Q&A
HBase
•  Still lacks some of the original BigTable features.
•  Still not able to utilize all RAM efficiently.
•  No good mixed-storage (SSD/HDD) support.
•  Single-level caching only. Simple.
•  HBase + large JVM heap (MemStore) = ?
BigBase
•  Adds Row Cache and block cache compression.
•  Utilizes all RAM efficiently (TBs).
•  Supports mixed storage (SSD/HDD).
•  Has multi-level caching. Not that simple.
•  Will move MemStore off heap in R2.
BigBase History
Koda (2010)
•  Koda – Java off-heap object cache, similar to Terracotta's BigMemory.
•  Delivers 4x more transactions …
•  10x better latencies than BigMemory 4.
•  Compression (Snappy, LZ4, LZ4HC, Deflate).
•  Disk persistence and periodic cache snapshots.
•  Tested up to 240GB.
Karma (2011-12)
•  Karma – Java off-heap BTree implementation to support fast in-memory queries.
•  Supports extra-large heaps: 100s of millions to billions of objects.
•  Stores 300M objects in less than 10GB of RAM.
•  Block compression.
•  Tested up to 240GB.
•  Off-heap MemStore in R2.
Yamm (2013)
•  Yet Another Memory Manager.
   –  Pure 100% Java memory allocator.
   –  Replaced jemalloc in Koda.
   –  Now Koda is 100% Java.
   –  Karma is next (still on jemalloc).
   –  Similar to the memcached slab allocator (see the sketch below).
•  BigBase project started (Summer 2013).
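YAMM itself is not shown in the slides beyond the bullets above, so the snippet below is only a generic illustration of the memcached-style slab idea it is compared to: memory is carved into fixed-size direct ByteBuffer slabs, each slab feeds one size class, and freed chunks are recycled rather than returned to the OS. Class and constant names (SlabAllocator, SLAB_SIZE, SIZE_CLASSES) are invented for the example.

    import java.nio.ByteBuffer;
    import java.util.ArrayDeque;
    import java.util.Deque;

    // Illustrative slab allocator: one free list of fixed-size chunks per size class.
    public class SlabAllocator {
        private static final int SLAB_SIZE = 1 << 20;               // 1MB slabs (assumption)
        private static final int[] SIZE_CLASSES = {64, 128, 256, 512, 1024, 2048};

        // One stack of free chunks per size class.
        private final Deque<ByteBuffer>[] freeLists;

        @SuppressWarnings("unchecked")
        public SlabAllocator() {
            freeLists = new Deque[SIZE_CLASSES.length];
            for (int i = 0; i < SIZE_CLASSES.length; i++) {
                freeLists[i] = new ArrayDeque<ByteBuffer>();
            }
        }

        // Returns a chunk whose capacity is the smallest size class >= len.
        public ByteBuffer allocate(int len) {
            int cls = sizeClass(len);
            if (freeLists[cls].isEmpty()) {
                addSlab(cls);                                        // grow by one off-heap slab
            }
            ByteBuffer chunk = freeLists[cls].pop();
            chunk.clear();
            return chunk;
        }

        public void free(ByteBuffer chunk) {
            freeLists[sizeClass(chunk.capacity())].push(chunk);      // recycle, never return to the OS
        }

        private void addSlab(int cls) {
            ByteBuffer slab = ByteBuffer.allocateDirect(SLAB_SIZE);  // off heap, invisible to GC
            int chunkSize = SIZE_CLASSES[cls];
            for (int off = 0; off + chunkSize <= SLAB_SIZE; off += chunkSize) {
                slab.position(off);
                slab.limit(off + chunkSize);
                freeLists[cls].push(slab.slice());                   // fixed-size chunk view
            }
        }

        private int sizeClass(int len) {
            for (int i = 0; i < SIZE_CLASSES.length; i++) {
                if (len <= SIZE_CLASSES[i]) return i;
            }
            throw new IllegalArgumentException("object too large for slab allocation: " + len);
        }
    }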
BigBase Architecture

MLC – Multi-Level Caching

[Architecture diagrams, built up across several slides:]
•  HBase 0.94 – LruBlockCache in JVM RAM, with disk below it.
•  HBase 0.96 – Bucket Cache: one level of caching, either RAM (L2) or disk (L3).
•  BigBase 1.0 – Row Cache (L1) and Block Cache (L2) in JVM RAM, plus a Block Cache (L3) backed by SSD, by the network, by memcached, or by DynamoDB (the lookup path is sketched below).
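The slides only show the cache hierarchy as boxes, so here is a minimal sketch of the read path such a hierarchy implies, assuming each level exposes a plain get/put interface. The CacheLevel type and the promotion policy below are illustrative, not BigBase internals: probe L1, then L2, then L3, and copy a hit back into the faster levels.

    import java.util.Arrays;
    import java.util.List;

    // Illustrative multi-level cache lookup: L1 (row cache, RAM), L2 (block cache, RAM), L3 (SSD/remote).
    public class MultiLevelCache {

        // Hypothetical per-level interface; BigBase's real internals are not shown in the slides.
        public interface CacheLevel {
            byte[] get(String key);
            void put(String key, byte[] value);
        }

        private final List<CacheLevel> levels;   // ordered fastest (L1) to slowest (L3)

        public MultiLevelCache(CacheLevel l1, CacheLevel l2, CacheLevel l3) {
            this.levels = Arrays.asList(l1, l2, l3);
        }

        public byte[] get(String key) {
            for (int i = 0; i < levels.size(); i++) {
                byte[] value = levels.get(i).get(key);
                if (value != null) {
                    // Promote the entry into every faster level so the next read is cheaper.
                    for (int j = 0; j < i; j++) {
                        levels.get(j).put(key, value);
                    }
                    return value;
                }
            }
            return null;   // full miss: caller reads from HDFS and may populate the caches
        }
    }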
BigBase Row Cache (L1)
Where is BigTable's Scan Cache?
•  The Scan Cache caches hot row data.
•  Complementary to the Block Cache.
•  Still missing in HBase (as of 0.98).
•  It's very hard to implement in Java (off heap; see the sketch below).
•  Max GC pause is ~0.5-2 sec per 1GB of heap.
•  G1 GC in Java 7 does not solve the problem.
•  We call it Row Cache in BigBase.
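The GC figures above are the reason for going off heap: data held in direct ByteBuffers is never walked by the collector, so pause times stop growing with cache size. A minimal sketch of that idea, assuming the row is already serialized to bytes (RowCacheSlot is an invented name):

    import java.nio.ByteBuffer;

    // Illustrative only: keep serialized rows outside the Java heap so GC never walks them.
    public class RowCacheSlot {
        private final ByteBuffer slot;   // direct buffer: allocated off heap

        public RowCacheSlot(byte[] serializedRow) {
            slot = ByteBuffer.allocateDirect(serializedRow.length);
            slot.put(serializedRow);
            slot.flip();
        }

        public byte[] read() {
            byte[] copy = new byte[slot.remaining()];
            slot.duplicate().get(copy);  // duplicate() keeps the cached position untouched
            return copy;
        }
    }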
Row Cache vs. Block Cache

[Diagram: a run of HFile blocks, with the same data shown held as whole blocks in the BLOCK CACHE vs. as individual rows in the ROW CACHE.]
BigBase Row Cache
•  Off-heap Scan Cache for HBase.
•  Cache size: 100s of GBs to TBs.
•  Eviction policies: LRU, LFU, FIFO, Random.
•  Pure, 100%-compatible Java.
•  Sub-millisecond latencies, zero GC.
•  Implemented as a RegionObserver coprocessor (sketch below).

[Component stack: Row Cache on top of KODA (off-heap object cache) and YAMM (memory allocator), with compression codecs and a Kryo SerDe.]
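A minimal sketch of how a read-through row cache can be wired in as a RegionObserver, using the 0.94-era coprocessor hook signatures. This is an illustration, not BigBase's actual code, and RowCache below is a hypothetical stand-in for the off-heap KODA store: preGet answers from the cache and bypasses the normal read path on a hit, postGet fills the cache after a miss, and prePut invalidates the row.

    import java.io.IOException;
    import java.util.List;

    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

    // Illustrative read-through row cache as a RegionObserver (HBase 0.94-era hook signatures).
    public class RowCacheObserver extends BaseRegionObserver {

        // Hypothetical off-heap cache keyed by rowkey; not a real BigBase class.
        private final RowCache cache = new RowCache();

        @Override
        public void preGet(ObserverContext<RegionCoprocessorEnvironment> c,
                           Get get, List<KeyValue> result) throws IOException {
            List<KeyValue> cached = cache.lookup(get.getRow());
            if (cached != null) {
                result.addAll(cached);   // serve from cache ...
                c.bypass();              // ... and skip the normal MemStore/HFile read
            }
        }

        @Override
        public void postGet(ObserverContext<RegionCoprocessorEnvironment> c,
                            Get get, List<KeyValue> result) throws IOException {
            cache.store(get.getRow(), result);   // populate the cache after a miss
        }

        @Override
        public void prePut(ObserverContext<RegionCoprocessorEnvironment> c,
                           Put put, WALEdit edit, boolean writeToWAL) throws IOException {
            cache.invalidate(put.getRow());      // every mutation invalidates the cached row
        }

        // Stand-in for the real off-heap store (YAMM/KODA); an on-heap map shown only for shape.
        static class RowCache {
            private final java.util.concurrent.ConcurrentMap<String, List<KeyValue>> map =
                    new java.util.concurrent.ConcurrentHashMap<String, List<KeyValue>>();

            List<KeyValue> lookup(byte[] row) { return map.get(new String(row)); }       // simplified keying
            void store(byte[] row, List<KeyValue> kvs) { map.put(new String(row), kvs); }
            void invalidate(byte[] row) { map.remove(new String(row)); }
        }
    }

In HBase 0.96 and later the equivalent hooks are preGetOp/postGetOp and operate on Cell rather than KeyValue.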
BigBase Row Cache
•  Read-through cache.
•  Caches rowkey:CF.
•  Invalidates the key on every mutation.
•  Can be enabled/disabled per table and per table:CF.
•  New ROWCACHE attribute (example below).
•  Best for small rows (< block size).

[Component stack: Row Cache on top of KODA and YAMM, with compression codecs and a Kryo SerDe.]
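The ROWCACHE attribute itself is BigBase-specific, so the exact key below is an assumption taken straight from the bullet; the sketch only shows the standard 0.94 admin API for attaching a custom attribute to a table and to a column family, which is how a per-table / per-table:CF switch would typically be surfaced.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class EnableRowCache {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);

            HTableDescriptor table = new HTableDescriptor("usertable");
            table.setValue("ROWCACHE", "true");    // assumed table-level switch from the slide

            HColumnDescriptor cf = new HColumnDescriptor("f1");
            cf.setValue("ROWCACHE", "false");      // assumed per-CF override (table:CF granularity)
            table.addFamily(cf);

            admin.createTable(table);
            admin.close();
        }
    }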
Performance-Scalability
•  GET (small rows < 100 bytes): 175K operations per second per Region Server (from cache).
•  MULTI-GET (small rows < 100 bytes): > 1M records per second (network-limited) per Region Server (see the batched-Get example below).
•  LATENCY: 99% < 1ms (for GETs) at 100K ops.
•  Vertical scalability: tested up to 240GB (the maximum available in Amazon EC2).
•  Horizontal scalability: limited by HBase scalability.
•  No more memcached farms in front of HBase clusters.
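The MULTI-GET figure comes from batching many small Gets into one client call; the stock 0.94 client API for that is HTable.get(List<Get>), shown below with placeholder table and row names.

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MultiGetExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "usertable");

            // One batch of small Gets instead of thousands of single-row RPCs.
            List<Get> batch = new ArrayList<Get>();
            for (int i = 0; i < 1000; i++) {
                batch.add(new Get(Bytes.toBytes("user" + i)));
            }

            Result[] results = table.get(batch);   // batched multi-get call
            System.out.println("rows returned: " + results.length);
            table.close();
        }
    }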
BigBase Block Cache (L2, L3)
What is wrong with Bucket Cache?

Scalability                        LIMITED
Multi-Level Caching (MLC)          NOT SUPPORTED
Persistence ('offheap' mode)       NOT SUPPORTED
Low-latency apps                   NOT SUPPORTED
SSD friendliness ('file' mode)     NOT FRIENDLY
Compression                        NOT SUPPORTED
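For comparison with the table above, this is roughly how the stock HBase 0.96/0.98 Bucket Cache is switched between its 'offheap' and 'file' modes. The property names are real HBase keys, but they belong in the region server's hbase-site.xml; setting them through the Java Configuration API here is just for illustration, and the size value is an assumption.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    // Stock HBase 0.96/0.98 Bucket Cache settings (normally placed in hbase-site.xml on the
    // region servers); values are illustrative only.
    public class BucketCacheSettings {
        public static void main(String[] args) {
            Configuration conf = HBaseConfiguration.create();

            // Off-heap mode: also requires -XX:MaxDirectMemorySize on the region server JVM.
            conf.set("hbase.bucketcache.ioengine", "offheap");
            conf.set("hbase.bucketcache.size", "6144");      // capacity; interpreted as MB here

            // File mode instead: point the cache at an SSD-backed path.
            // conf.set("hbase.bucketcache.ioengine", "file:/mnt/ssd/bucketcache.data");

            System.out.println("ioengine = " + conf.get("hbase.bucketcache.ioengine"));
        }
    }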
Here comes BigBase

Scalability                        HIGH
Multi-Level Caching (MLC)          SUPPORTED
Persistence ('offheap' mode)       SUPPORTED
Low-latency apps                   SUPPORTED
SSD friendliness ('file' mode)     SSD-FRIENDLY
Compression                        SNAPPY, LZ4, LZ4HC, DEFLATE
Wait, there are more …

In addition to the above:
Non disk-based L3 cache            SUPPORTED
RAM Cache optimization             IBCO
BigBase 1.0 vs. HBase 0.98

                               BigBase                    HBase 0.98
Row Cache (L1)                 YES                        NO
Block Cache RAM (L2)           YES (fully off heap)       YES (partially off heap)
Block Cache (L3) DISK          YES (SSD-friendly)         YES (not SSD-friendly)
Block Cache (L3) NON-DISK      YES                        NO
Compression                    YES                        NO
RAM Cache persistence          YES (both L1 and L2)       NO
Low-latency optimized          YES                        NO
MLC support                    YES (L1, L2, L3)           NO (either L2 or L3)
Scalability                    HIGH                       MEDIUM (limited by JVM heap)
YCSB Benchmark
Test setup (AWS)
•  Common – Whirr 0.8.2; 1 (Master + ZK) + 5 RS; m1.xlarge: 15GB RAM, 4 vCPU, 4 x 420GB HDD.
•  HBase 0.94.15 – RS: 11.5GB heap (6GB LruBlockCache on heap); Master: 4GB heap.
•  BigBase 1.0 (0.94.15) – RS: 4GB heap (6GB off-heap cache); Master: 4GB heap.
•  HBase 0.96.2 – RS: 4GB heap (6GB Bucket Cache off heap); Master: 4GB heap.
•  Clients: 5 (30 threads each), collocated with the Region Servers.
•  Data sets: 100M and 200M rows, approximately 120GB / 240GB. Only 25% fits in a cache.
•  Workloads: 100% read (read100, read200, hotspot100), 100% scan (scan100, scan200) – zipfian.
•  YCSB 0.1.4 (modified to generate compressible data). Compressible data (factor of 2.5x) was generated only for the scan workloads, to evaluate the effect of compression in the BigBase block cache implementation.
Benchmark results (RPS)

               BigBase R1.0    HBase 0.96.2    HBase 0.94.15
read100        11405           6123            5553
read200        6265            4086            3850
hotspot100     15150           3512            2855
scan100        3224            1500            709
scan200        820             434             228
Average latency (ms)

               BigBase R1.0    HBase 0.96.2    HBase 0.94.15
read100        13              24              27
read200        23              36              39
hotspot100     10              44              52
scan100        48              102             223
scan200        187             375             700
95% latency (ms)

               BigBase R1.0    HBase 0.96.2    HBase 0.94.15
read100        51              91              100
read200        88              124             138
hotspot100     38              152             197
scan100        175             405             950
scan200        729             –               –
99% latency (ms)

               BigBase R1.0    HBase 0.96.2    HBase 0.94.15
read100        133             190             213
read200        225             304             338
hotspot100     111             554             632
scan100        367             811             –
scan200        –               –               –
YCSB 100% Read

Per server (ops/sec):

            BigBase R1.0    HBase 0.94.15
50M         3621            1308
100M        2281            1111
200M        1253            770

•  50M   = 2.77x
•  100M  = 2.05x
•  200M  = 1.63x
•  50M   = 40% fits in cache
•  100M  = 20% fits in cache
•  200M  = 10% fits in cache
•  What is the maximum?
•  ~75x (hotspot 2.5/100)
•  56K (BB) vs. 750 (HBase)
•  100% in cache
All data in cache
•  Setup: BigBase 1.0, 48GB RAM, (8/16) CPU cores – 5 nodes (1 + 4).
•  Data set: 200M (300GB).
•  Test: Read 100%, hotspot.
•  YCSB 0.1.4 – 4 clients.
•  40 threads  – 100K ops/sec
•  100 threads – 168K ops/sec
•  200 threads – 224K ops/sec
•  400 threads – 262K ops/sec

Latency (ms), hotspot (2.5/100 – 200M data):

Throughput (ops/sec)    100,000    168,000    224,000    262,000
99%                     1          2          3          7
95%                     1          1          2          3
avg                     0.4        0.6        0.9        1.5

100K ops: 99% < 1ms
What is next?
•  Release 1.1 (2014 Q2)
   –  Support for HBase 0.96, 0.98, trunk.
   –  Fully tested L3 cache (SSD).
•  Release 1.5 (2014 Q3)
   –  YAMM: memory allocator compacting mode.
   –  Integration with Hadoop metrics.
   –  Row Cache: merge rows on update (good for counters).
   –  Block Cache: new eviction policy (LRU-2Q), sketched below.
   –  File reads with posix_fadvise (bypass the OS page cache).
   –  Row Cache: make it available to server-side apps.
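LRU-2Q is a published eviction policy rather than anything BigBase-specific, so the sketch below is the textbook simplified 2Q, not BigBase's implementation: new keys enter a probationary FIFO (A1in), keys evicted from it are remembered in a ghost list (A1out), and only a key seen again while in the ghost list is promoted to the protected LRU (Am). The capacity split is an arbitrary choice for the example.

    import java.util.Iterator;
    import java.util.LinkedHashMap;
    import java.util.LinkedHashSet;

    // Simplified 2Q cache sketch: A1in (probationary FIFO), A1out (ghost keys), Am (protected LRU).
    public class TwoQueueCache<K, V> {
        private final int capacity;          // total number of resident entries (A1in + Am)
        private final int inCapacity;        // share reserved for freshly admitted entries
        private final int outCapacity;       // how many evicted keys to remember

        private final LinkedHashMap<K, V> a1in = new LinkedHashMap<K, V>();              // insertion order
        private final LinkedHashSet<K> a1out = new LinkedHashSet<K>();                   // keys only
        private final LinkedHashMap<K, V> am = new LinkedHashMap<K, V>(16, 0.75f, true); // access order

        public TwoQueueCache(int capacity) {
            this.capacity = capacity;
            this.inCapacity = Math.max(1, capacity / 4);
            this.outCapacity = Math.max(1, capacity / 2);
        }

        public V get(K key) {
            if (am.containsKey(key)) return am.get(key);   // access-ordered map moves it to MRU
            return a1in.get(key);                          // stays in FIFO order on re-access
        }

        public void put(K key, V value) {
            if (am.containsKey(key)) { am.put(key, value); return; }
            if (a1in.containsKey(key)) { a1in.put(key, value); return; }
            if (a1out.contains(key)) {                     // second life: promote to protected LRU
                a1out.remove(key);
                am.put(key, value);
            } else {
                a1in.put(key, value);                      // first touch: probationary queue
            }
            evictIfNeeded();
        }

        private void evictIfNeeded() {
            while (a1in.size() + am.size() > capacity) {
                if (a1in.size() > inCapacity) {
                    K victim = oldestKey(a1in);            // drop from the FIFO, remember the key
                    a1in.remove(victim);
                    a1out.add(victim);
                    if (a1out.size() > outCapacity) {
                        a1out.remove(a1out.iterator().next());
                    }
                } else {
                    am.remove(oldestKey(am));              // evict the least-recently-used protected entry
                }
            }
        }

        private K oldestKey(LinkedHashMap<K, V> map) {
            Iterator<K> it = map.keySet().iterator();
            return it.next();
        }
    }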
What is next?
•  Release 2.0 (2014 Q3)
   –  HBASE-5263: preserving cache data on compaction.
   –  Cache data blocks on memstore flush (configurable).
   –  HBASE-10648: Pluggable MemStore. Off-heap implementation, based on Karma (off-heap BTree lib).
•  Release 3.0 (2014 Q4)
   –  Real Scan Cache – caches results of Scan operations on immutable store files.
   –  Scan Cache integration with Phoenix and other 3rd-party libs that provide rich query APIs for HBase.
Download/Install/Uninstall
•  Download BigBase 1.0 from www.bigbase.org
•  Installation/upgrade takes 10-20 minutes.
•  The beautification operator EM(*) is invertible:

   HBase = EM⁻¹(BigBase)   (the same 10-20 min)
Q & A

Vladimir Rodionov
Hadoop/HBase architect
Founder of BigBase.org

HBase: Extreme Makeover
HBaseCon 2014 – Features & Internals Track

How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 

HBase: Extreme Makeover

  • 1. HBase:  Extreme  makeover   Vladimir  Rodionov   Hadoop/HBase  architect   Founder  of  BigBase.org   HBaseCon  2014   Features  &  Internal  Track  
  • 3. About myself • Principal Platform Engineer @ Carrier IQ, Sunnyvale, CA • Prior to Carrier IQ, I worked @ GE, EBay, Plumtree/BEA. • HBase user since 2009. • HBase hacker since 2013. • Areas of expertise include (but are not limited to) Java, HBase, Hadoop, Hive, large-scale OLAP/Analytics, and in-memory data processing. • Founder of BigBase.org
  • 5.
  • 7. BigBase  =  EM(HBase)   EM(*)  =  ?  
  • 8. BigBase  =  EM(HBase)   EM(*)  =  
  • 9. BigBase  =  EM(HBase)   EM(*)  =   Seriously?  
  • 10. BigBase = EM(HBase) EM(*) = Seriously? It's a Multi-Level Caching solution for HBase
  • 11. Real Agenda • Why BigBase? • Brief history of the BigBase.org project • BigBase MLC high-level architecture (L1/L2/L3) • Level 1 – Row Cache. • Level 2/3 – Block Cache RAM/SSD. • YCSB benchmark results • Upcoming features in R1.5, 2.0, 3.0. • Q&A
  • 12.
  • 13. HBase • Still lacks some of the original BigTable features. • Still not able to utilize all RAM efficiently. • No good mixed-storage (SSD/HDD) support. • Single-level caching only. Simple. • HBase + large JVM heap (MemStore) = ?
  • 14. BigBase • Adds a Row Cache and block cache compression. • Utilizes all RAM efficiently (TBs). • Supports mixed storage (SSD/HDD). • Has Multi-Level Caching. Not that simple. • Will move the MemStore off heap in R2.
  • 16. Koda (2010) • Koda – a Java off-heap object cache, similar to Terracotta's BigMemory. • Delivers 4x more transactions … • 10x better latencies than BigMemory 4. • Compression (Snappy, LZ4, LZ4HC, Deflate). • Disk persistence and periodic cache snapshots. • Tested up to 240GB.
  • 17. Karma (2011-12) • Karma – a Java off-heap BTree implementation to support fast in-memory queries. • Supports extra-large heaps: hundreds of millions to billions of objects. • Stores 300M objects in less than 10GB of RAM. • Block compression. • Tested up to 240GB. • Off-heap MemStore in R2.
  • 18. Yamm (2013) • Yet Another Memory Manager. – A pure, 100% Java memory allocator. – Replaced jemalloc in Koda. – Now Koda is 100% Java. – Karma is next (still on jemalloc). – Similar to the memcached slab allocator. • BigBase project started (Summer 2013).
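Slide 18 likens YAMM to the memcached slab allocator. As a rough illustration of that allocation scheme only (a sketch of the general technique, not YAMM's actual code; every class and method name below is invented for the example), a slab allocator pre-carves large off-heap regions into fixed-size chunks per size class, so allocation and free become free-list operations and the allocator never has to compact:

    import java.nio.ByteBuffer;
    import java.util.ArrayDeque;
    import java.util.Deque;

    // Illustrative slab allocator in the spirit of memcached's: each size class owns
    // whole off-heap slabs carved into equal-sized chunks, so free() is just a push
    // back onto the free list and the slab itself never fragments.
    public class SlabAllocator {
        private static final int SLAB_SIZE = 1 << 20;              // 1 MB per slab
        private final int[] sizeClasses = {64, 128, 256, 512, 1024, 2048};
        private final Deque<ByteBuffer>[] freeChunks;

        @SuppressWarnings("unchecked")
        public SlabAllocator() {
            freeChunks = new Deque[sizeClasses.length];
            for (int i = 0; i < sizeClasses.length; i++) {
                freeChunks[i] = new ArrayDeque<ByteBuffer>();
            }
        }

        // Returns a chunk at least 'size' bytes long, allocating a new slab if needed.
        public ByteBuffer allocate(int size) {
            int cls = sizeClassFor(size);
            if (freeChunks[cls].isEmpty()) {
                addSlab(cls);
            }
            return freeChunks[cls].pop();
        }

        // Returns a chunk to its size class; the off-heap memory stays reserved.
        public void free(ByteBuffer chunk, int size) {
            chunk.clear();
            freeChunks[sizeClassFor(size)].push(chunk);
        }

        private int sizeClassFor(int size) {
            for (int i = 0; i < sizeClasses.length; i++) {
                if (size <= sizeClasses[i]) return i;
            }
            throw new IllegalArgumentException("object too large for any size class: " + size);
        }

        private void addSlab(int cls) {
            ByteBuffer slab = ByteBuffer.allocateDirect(SLAB_SIZE);   // off the Java heap
            int chunkSize = sizeClasses[cls];
            for (int off = 0; off + chunkSize <= SLAB_SIZE; off += chunkSize) {
                slab.position(off).limit(off + chunkSize);
                freeChunks[cls].push(slab.slice());                   // one fixed-size chunk
            }
            slab.clear();
        }
    }

The trade-off is the same one memcached makes: freeing is O(1) and there is no compaction, at the cost of internal fragmentation when an object is smaller than its size class.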
  • 20. MLC – Multi-Level Caching. HBase 0.94: Disk, JVM RAM, LRUBlockCache
  • 21. MLC – Multi-Level Caching. HBase 0.94: Disk, JVM RAM, LRUBlockCache | HBase 0.96: Disk, JVM RAM, Bucket cache. One level of caching: RAM (L2)
  • 22. MLC – Multi-Level Caching. HBase 0.94: Disk, JVM RAM, LRUBlockCache | HBase 0.96: Bucket cache, JVM RAM. One level of caching: RAM (L2) or DISK (L3)
  • 23. MLC – Multi-Level Caching. HBase 0.94: Disk, JVM RAM, LRUBlockCache | HBase 0.96: Disk, JVM RAM, Bucket cache | BigBase 1.0: JVM RAM, Row Cache L1, Block Cache L2, Block Cache L3 on SSD
  • 24. MLC – Multi-Level Caching. HBase 0.94: Disk, JVM RAM, LRUBlockCache | HBase 0.96: Disk, JVM RAM, Bucket cache | BigBase 1.0: JVM RAM, Row Cache L1, Block Cache L2, Block Cache L3 over the network
  • 25. MLC – Multi-Level Caching. HBase 0.94: Disk, JVM RAM, LRUBlockCache | HBase 0.96: Disk, JVM RAM, Bucket cache | BigBase 1.0: JVM RAM, Row Cache L1, Block Cache L2, Block Cache L3 in memcached
  • 26. MLC – Multi-Level Caching. HBase 0.94: Disk, JVM RAM, LRUBlockCache | HBase 0.96: Disk, JVM RAM, Bucket cache | BigBase 1.0: JVM RAM, Row Cache L1, Block Cache L2, Block Cache L3 in DynamoDB
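Slides 20-26 show the read path falling through the Row Cache (L1), the off-heap Block Cache (L2) and an optional SSD/network Block Cache (L3) before touching HDFS. The sketch below illustrates only that control flow; the Cache interface and the class names are hypothetical, invented for this example, and are not BigBase APIs:

    import java.util.Optional;
    import java.util.function.Function;

    // Multi-level cache lookup: try L1, then L2, then L3, then the backing store,
    // promoting the value to the faster tiers on the way back up.
    public class MultiLevelCache<K, V> {
        public interface Cache<K, V> {
            Optional<V> get(K key);
            void put(K key, V value);
        }

        private final Cache<K, V> l1;               // row cache in RAM
        private final Cache<K, V> l2;               // off-heap block cache in RAM
        private final Cache<K, V> l3;               // SSD, memcached or other remote tier
        private final Function<K, V> backingStore;  // e.g. a read from HDFS

        public MultiLevelCache(Cache<K, V> l1, Cache<K, V> l2, Cache<K, V> l3,
                               Function<K, V> backingStore) {
            this.l1 = l1;
            this.l2 = l2;
            this.l3 = l3;
            this.backingStore = backingStore;
        }

        public V get(K key) {
            Optional<V> v = l1.get(key);
            if (v.isPresent()) return v.get();

            v = l2.get(key);
            if (v.isPresent()) {
                l1.put(key, v.get());               // promote to the faster tier
                return v.get();
            }

            v = l3.get(key);
            if (v.isPresent()) {
                l2.put(key, v.get());
                l1.put(key, v.get());
                return v.get();
            }

            V loaded = backingStore.apply(key);     // full miss: go to disk
            l3.put(key, loaded);
            l2.put(key, loaded);
            l1.put(key, loaded);
            return loaded;
        }
    }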
  • 28. Where is BigTable's Scan Cache? • The Scan Cache caches hot row data. • Complementary to the Block Cache. • Still missing in HBase (as of 0.98). • It's very hard to implement in Java (off heap): the max GC pause is ~0.5-2 sec per 1GB of heap, and G1 GC in Java 7 does not solve the problem. • We call it the Row Cache in BigBase.
  • 29. Row  Cache  vs.  Block  Cache   HFile  Block   HFile  Block  HFile  Block  HFile  Block  HFile  Block  
  • 30. Row  Cache  vs.  Block  Cache  
  • 31. Row  Cache  vs.  Block  Cache   BLOCK  CACHE   ROW  CACHE  
  • 32. Row  Cache  vs.  Block  Cache   ROW  CACHE   BLOCK  CACHE  
  • 33. Row  Cache  vs.  Block  Cache   ROW  CACHE   BLOCK  CACHE  
  • 34. BigBase Row Cache • Off-heap Scan Cache for HBase. • Cache size: 100s of GBs to TBs. • Eviction policies: LRU, LFU, FIFO, Random. • Pure, 100%-compatible Java. • Sub-millisecond latencies, zero GC. • Implemented as a RegionObserver coprocessor. Components: Row Cache, YAMM, Codecs, Kryo SerDe, KODA
  • 35. BigBase Row Cache • Read-through cache. • It caches rowkey:CF. • Invalidates the key on every mutation. • Can be enabled/disabled per table and per table:CF. • New ROWCACHE attribute. • Best for small rows (< block size). Components: Row Cache, YAMM, Codecs, Kryo SerDe, KODA
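Slides 34-35 describe the Row Cache as a read-through RegionObserver coprocessor keyed on rowkey:CF that invalidates on every mutation. Below is a minimal sketch of that shape against the 0.98-era coprocessor API; it uses an on-heap map for brevity and is not the actual BigBase implementation (which keeps rows off heap via KODA/YAMM):

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.client.Durability;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
    import org.apache.hadoop.hbase.util.Bytes;

    // Sketch of a read-through row cache as a RegionObserver coprocessor.
    // On a hit, preGetOp fills the result list and bypasses the normal read path;
    // any mutation invalidates the cached row. A real implementation would key on
    // rowkey:CF and store the data off heap; this sketch keys on the row only.
    public class RowCacheObserver extends BaseRegionObserver {

      private final Map<String, List<Cell>> cache =
          new ConcurrentHashMap<String, List<Cell>>();

      @Override
      public void preGetOp(ObserverContext<RegionCoprocessorEnvironment> ctx,
                           Get get, List<Cell> results) throws IOException {
        List<Cell> cached = cache.get(Bytes.toString(get.getRow()));
        if (cached != null) {
          results.addAll(cached);   // serve the row from the cache ...
          ctx.bypass();             // ... and skip the MemStore/HFile read entirely
        }
      }

      @Override
      public void postGetOp(ObserverContext<RegionCoprocessorEnvironment> ctx,
                            Get get, List<Cell> results) throws IOException {
        // Read-through: populate the cache after a miss has been served from disk.
        cache.put(Bytes.toString(get.getRow()), new ArrayList<Cell>(results));
      }

      @Override
      public void prePut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                         Put put, WALEdit edit, Durability durability) throws IOException {
        // Invalidate on every mutation, as slide 35 describes.
        cache.remove(Bytes.toString(put.getRow()));
      }
    }

Per slide 35, the cache is switched on per table or per table:CF through the new ROWCACHE attribute; in stock HBase terms that would presumably be a table/column-family descriptor value (for example HTableDescriptor.setValue("ROWCACHE", "true")), though the exact mechanism is BigBase-specific.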
  • 36. Performance-Scalability • GET (small rows < 100 bytes): 175K operations per second per Region Server (from cache). • MULTI-GET (small rows < 100 bytes): > 1M records per second (network limited) per Region Server. • LATENCY: 99% < 1ms (for GETs) at 100K ops. • Vertical scalability: tested up to 240GB (the maximum available in Amazon EC2). • Horizontal scalability: limited by HBase scalability. • No more memcached farms in front of HBase clusters.
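The MULTI-GET figure above refers to batched reads: the stock HBase client groups a list of Gets by region server and sends them as batched RPCs, which is why throughput becomes network-limited rather than cache-limited. For reference, a batched read with the standard 0.94/0.98-era client API looks roughly like this (the table and column-family names are placeholders):

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    // Batched (multi-) GET with the standard HBase client API.
    public class MultiGetExample {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "usertable");        // placeholder table name
        try {
          List<Get> batch = new ArrayList<Get>();
          for (int i = 0; i < 1000; i++) {
            Get get = new Get(Bytes.toBytes("user" + i));
            get.addFamily(Bytes.toBytes("f1"));              // placeholder column family
            batch.add(get);
          }
          Result[] results = table.get(batch);               // one batched round trip
          System.out.println("fetched " + results.length + " rows");
        } finally {
          table.close();
        }
      }
    }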
  • 37. BigBase  Block  Cache  (L2,  L3)  
  • 38. What is wrong with Bucket Cache? Scalability: LIMITED; Multi-Level Caching (MLC): NOT SUPPORTED; Persistence ('offheap' mode): NOT SUPPORTED; Low latency apps: NOT SUPPORTED; SSD friendliness ('file' mode): NOT FRIENDLY; Compression: NOT SUPPORTED
  • 39. What is wrong with Bucket Cache? Scalability: LIMITED; Multi-Level Caching (MLC): NOT SUPPORTED; Persistence ('offheap' mode): NOT SUPPORTED; Low latency apps: NOT SUPPORTED; SSD friendliness ('file' mode): NOT FRIENDLY; Compression: NOT SUPPORTED
  • 40. What is wrong with Bucket Cache? Scalability: LIMITED; Multi-Level Caching (MLC): NOT SUPPORTED; Persistence ('offheap' mode): NOT SUPPORTED; Low latency apps: NOT SUPPORTED; SSD friendliness ('file' mode): NOT FRIENDLY; Compression: NOT SUPPORTED
  • 41. What is wrong with Bucket Cache? Scalability: LIMITED; Multi-Level Caching (MLC): NOT SUPPORTED; Persistence ('offheap' mode): NOT SUPPORTED; Low latency apps: ?; SSD friendliness ('file' mode): NOT FRIENDLY; Compression: NOT SUPPORTED
  • 42. What is wrong with Bucket Cache? Scalability: LIMITED; Multi-Level Caching (MLC): NOT SUPPORTED; Persistence ('offheap' mode): NOT SUPPORTED; Low latency apps: NOT SUPPORTED; SSD friendliness ('file' mode): NOT FRIENDLY; Compression: NOT SUPPORTED
  • 43. What is wrong with Bucket Cache? Scalability: LIMITED; Multi-Level Caching (MLC): NOT SUPPORTED; Persistence ('offheap' mode): NOT SUPPORTED; Low latency apps: NOT SUPPORTED; SSD friendliness ('file' mode): NOT FRIENDLY; Compression: NOT SUPPORTED
  • 44. Here comes BigBase. Scalability: HIGH; Multi-Level Caching (MLC): SUPPORTED; Persistence ('offheap' mode): SUPPORTED; Low latency apps: SUPPORTED; SSD friendliness ('file' mode): SSD-FRIENDLY; Compression: SNAPPY, LZ4, LZ4HC, DEFLATE
  • 45. Here comes BigBase. Scalability: HIGH; Multi-Level Caching (MLC): SUPPORTED; Persistence ('offheap' mode): SUPPORTED; Low latency apps: SUPPORTED; SSD friendliness ('file' mode): SSD-FRIENDLY; Compression: SNAPPY, LZ4, LZ4HC, DEFLATE
  • 46. Here comes BigBase. Scalability: HIGH; Multi-Level Caching (MLC): SUPPORTED; Persistence ('offheap' mode): SUPPORTED; Low latency apps: SUPPORTED; SSD friendliness ('file' mode): SSD-FRIENDLY; Compression: SNAPPY, LZ4, LZ4HC, DEFLATE
  • 47. Here comes BigBase. Scalability: HIGH; Multi-Level Caching (MLC): SUPPORTED; Persistence ('offheap' mode): SUPPORTED; Low latency apps: SUPPORTED; SSD friendliness ('file' mode): SSD-FRIENDLY; Compression: SNAPPY, LZ4, LZ4HC, DEFLATE
  • 48. Here comes BigBase. Scalability: HIGH; Multi-Level Caching (MLC): SUPPORTED; Persistence ('offheap' mode): SUPPORTED; Low latency apps: SUPPORTED; SSD friendliness ('file' mode): SSD-FRIENDLY; Compression: SNAPPY, LZ4, LZ4HC, DEFLATE
  • 49. Here comes BigBase. Scalability: HIGH; Multi-Level Caching (MLC): SUPPORTED; Persistence ('offheap' mode): SUPPORTED; Low latency apps: SUPPORTED; SSD friendliness ('file' mode): SSD-FRIENDLY; Compression: SNAPPY, LZ4, LZ4HC, DEFLATE
  • 50. Wait, there is more … Scalability: HIGH; Multi-Level Caching (MLC): SUPPORTED; Persistence ('offheap' mode): SUPPORTED; Low latency apps: SUPPORTED; SSD friendliness ('file' mode): SSD-FRIENDLY; Compression: SNAPPY, LZ4, LZ4HC, DEFLATE; Non disk-based L3 cache: SUPPORTED; RAM Cache optimization: IBCO
  • 51. Wait, there is more … Scalability: HIGH; Multi-Level Caching (MLC): SUPPORTED; Persistence ('offheap' mode): SUPPORTED; Low latency apps: SUPPORTED; SSD friendliness ('file' mode): SSD-FRIENDLY; Compression: SNAPPY, LZ4, LZ4HC, DEFLATE; Non disk-based L3 cache: SUPPORTED; RAM Cache optimization: IBCO
  • 52. BigBase 1.0 vs. HBase 0.98: Row Cache (L1) – YES vs. NO; Block Cache RAM (L2) – YES (fully off heap) vs. YES (partially off heap); Block Cache (L3) DISK – YES (SSD-friendly) vs. YES (not SSD-friendly); Block Cache (L3) NON DISK – YES vs. NO; Compression – YES vs. NO; RAM Cache persistence – YES (both L1 and L2) vs. NO; Low latency optimized – YES vs. NO; MLC support – YES (L1, L2, L3) vs. NO (either L2 or L3); Scalability – HIGH vs. MEDIUM (limited by JVM heap)
  • 54. Test setup (AWS) • HBase 0.94.15 – RS: 11.5GB heap (6GB LruBlockCache on heap); Master: 4GB heap. • Clients: 5 (30 threads each), collocated with Region Servers. • Data sets: 100M and 200M rows, approximately 120GB / 240GB. Only 25% fits in a cache. • Workloads: 100% read (read100, read200, hotspot100), 100% scan (scan100, scan200) – zipfian. • YCSB 0.1.4 (modified to generate compressible data). We generated compressible data (with a factor of 2.5x) only for scan workloads, to evaluate the effect of compression in the BigBase block cache implementation. • Common – Whirr 0.8.2; 1 (Master + ZK) + 5 RS; m1.xlarge: 15GB RAM, 4 vCPU, 4x420 HDD • BigBase 1.0 (0.94.15) – RS: 4GB heap (6GB off-heap cache); Master: 4GB heap. • HBase 0.96.2 – RS: 4GB heap (6GB Bucket Cache off heap); Master: 4GB heap.
  • 55. Test setup (AWS) • HBase 0.94.15 – RS: 11.5GB heap (6GB LruBlockCache on heap); Master: 4GB heap. • Clients: 5 (30 threads each), collocated with Region Servers. • Data sets: 100M and 200M rows, approximately 120GB / 240GB. Only 25% fits in a cache. • Workloads: 100% read (read100, read200, hotspot100), 100% scan (scan100, scan200) – zipfian. • YCSB 0.1.4 (modified to generate compressible data). We generated compressible data (with a factor of 2.5x) only for scan workloads, to evaluate the effect of compression in the BigBase block cache implementation. • Common – Whirr 0.8.2; 1 (Master + ZK) + 5 RS; m1.xlarge: 15GB RAM, 4 vCPU, 4x420 HDD • BigBase 1.0 (0.94.15) – RS: 4GB heap (6GB off-heap cache); Master: 4GB heap. • HBase 0.96.2 – RS: 4GB heap (6GB Bucket Cache off heap); Master: 4GB heap.
  • 56. Benchmark results (requests per second):
        workload      BigBase R1.0   HBase 0.96.2   HBase 0.94.15
        read100          11405           6123            5553
        read200           6265           4086            3850
        hotspot100       15150           3512            2855
        scan100           3224           1500             709
        scan200            820            434             228
  • 57. Average latency (ms):
        workload      BigBase R1.0   HBase 0.96.2   HBase 0.94.15
        read100             13             24              27
        read200             23             36              39
        hotspot100          10             44              52
        scan100             48            102             223
        scan200            187            375             700
  • 58. 95% latency (ms):
        workload      BigBase R1.0   HBase 0.96.2   HBase 0.94.15
        read100             51             91             100
        read200             88            124             138
        hotspot100          38            152             197
        scan100            175            405             950
        scan200            729              –               –    (values not labeled on the slide)
  • 59. 99% latency (ms):
        workload      BigBase R1.0   HBase 0.96.2   HBase 0.94.15
        read100            133            190             213
        read200            225            304             338
        hotspot100         111            554             632
        scan100            367            811               –
        scan200              –              –               –    (remaining values not labeled on the slide)
  • 60. YCSB 100% read, throughput per server: 50M rows – BigBase R1.0 3621 vs. HBase 0.94.15 1308 (2.77x); 100M – 2281 vs. 1111 (2.05x); 200M – 1253 vs. 770 (1.63x). Cache coverage: 50M = 40% fits cache, 100M = 20%, 200M = 10%. What is the maximum?
  • 61. YCSB 100% read, throughput per server: 50M rows – BigBase R1.0 3621 vs. HBase 0.94.15 1308 (2.77x); 100M – 2281 vs. 1111 (2.05x); 200M – 1253 vs. 770 (1.63x). Cache coverage: 50M = 40% fits cache, 100M = 20%, 200M = 10%. What is the maximum? ~75x (hotspot 2.5/100): 56K (BigBase) vs. 750 (HBase), 100% in cache.
  • 62. All data in cache • Setup: BigBase 1.0, 48GB RAM, (8/16) CPU cores – 5 nodes (1 + 4) • Data set: 200M (300GB) • Test: Read 100%, hotspot • YCSB 0.1.4 – 4 clients • Throughput: 40 threads – 100K; 100 threads – 168K; 200 threads – 224K; 400 threads – 262K • Latency at 100,000 / 168,000 / 224,000 / 262,000 ops: 99% – 1 / 2 / 3 / 7 ms; 95% – 1 / 1 / 2 / 3 ms; avg – 0.4 / 0.6 / 0.9 / 1.5 ms
  • 63. All data in cache (as above) – 100K ops: 99% < 1ms
  • 64. What is next? • Release 1.1 (2014 Q2) – Support for HBase 0.96, 0.98, trunk – Fully tested L3 cache (SSD) • Release 1.5 (2014 Q3) – YAMM: memory allocator compacting mode. – Integration with Hadoop metrics. – Row Cache: merge rows on update (good for counters). – Block Cache: new eviction policy (LRU-2Q). – File reads with posix_fadvise (bypass the OS page cache). – Row Cache: make it available to server-side apps
  • 65. What is next? • Release 2.0 (2014 Q3) – HBASE-5263: preserving cache data on compaction – Cache data blocks on memstore flush (configurable). – HBASE-10648: Pluggable MemStore; off-heap implementation based on Karma (off-heap BTree lib). • Release 3.0 (2014 Q4) – Real Scan Cache – caches results of Scan operations on immutable store files. – Scan Cache integration with Phoenix and other 3rd-party libs providing a rich query API for HBase.
  • 66. Download/Install/Uninstall • Download BigBase 1.0 from www.bigbase.org • Installation/upgrade takes 10-20 minutes • The beautification operator EM(*) is invertible: HBase = EM⁻¹(BigBase) (the same 10-20 min)
  • 67. Q  &  A     Vladimir  Rodionov   Hadoop/HBase  architect   Founder  of  BigBase.org   HBase:  Extreme  makeover   Features  &  Internal  Track