HBase Cache & Performance
Biju Nair
Boston Hadoop User Group Meet-up
28 May 2015
  
HBase Overview
• Key value store
• Column family oriented
• Data stored as byte[]
• Data indexed by key value
• Data stored in sorted order by key
• Data model doesn’t have to be pre-defined
• Scales horizontally
  
HBase Overview

create 'stock', 'company', 'financials'

put 'stock', 'msft', 'company:name', 'microsoft'

get 'stock', 'msft'
company:loc,ts1,Seattle
company:name,ts1,Microsoft
financials:cap,ts1,379B
financials:PE,ts1,20

Physical Storage
…
msft,company,loc,ts1,Seattle
msft,company,name,ts1,Microsoft
…
orcl,company,loc,ts1,Redwood
orcl,company,name,ts1,Oracle
…
…
msft,financials,cap,ts1,379B
msft,financials,pe,ts1,20
…
orcl,financials,cap,ts1,190B
orcl,financials,pe,ts1,18
…
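
The storage layout on this slide can be illustrated with a minimal in-memory sketch. It is not HBase code: the `TinyTable` class and its methods are hypothetical names, timestamps are assumed to be plain integers, and the only point is that each column family keeps its own list of cells sorted by row key.

```python
class TinyTable:
    """Hypothetical toy model of per-column-family sorted storage."""

    def __init__(self, families):
        # One sorted cell list per column family.
        self.stores = {family: [] for family in families}

    def put(self, row, column, value, ts):
        family, qualifier = column.split(":", 1)
        cells = self.stores[family]
        cells.append((row, family, qualifier, ts, value))
        # Keep cells ordered by row key, then qualifier, newest version first.
        cells.sort(key=lambda c: (c[0], c[2], -c[3]))

    def get(self, row):
        result = {}
        for family in sorted(self.stores):
            for r, fam, qual, ts, value in self.stores[family]:
                if r == row:
                    # The first cell seen for a column is the newest one.
                    result.setdefault(f"{fam}:{qual}", value)
        return result


table = TinyTable(["company", "financials"])
table.put("orcl", "company:name", "Oracle", 1)
table.put("msft", "company:name", "Microsoft", 1)
table.put("msft", "company:loc", "Seattle", 1)
table.put("msft", "financials:pe", "20", 1)
print(table.get("msft"))
```

Rows are inserted out of order on purpose; the `company` store still ends up ordered msft, msft, orcl, which is the property the physical storage listing shows.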

HBase Overview
[Diagram: sorted rows of the company and financials column families (appl, ge, ibm, msft, orcl), partitioned into Regions]
  
HBase Overview
[Diagram: five Region Servers serving the regions (appl, ge, ibm, msft, orcl), alongside the HBase Master, ZooKeeper, and the Client]
  
Use Case: Data and Query
• Time series data
  – Tickers and attributes
  – Monthly data stored in a column; 256 bytes
  – Up to 20 years worth of data
• Queries
  – “get”s for up to 1 year data; 3072 bytes
  
Use Case: Requirements
• Meet “get” query performance requirements
  – Under 10 ms for 99% of queries
  – Median latency 2 to 3 ms
  – 99.99% latency under 50 ms
• Efficient HBase cluster capacity utilization
  – 32 cores per node
  – 128 GB of memory per node
  – SSD storage in all nodes
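
The latency targets above can be checked against a set of measured "get" latencies with a small helper. This is a sketch under assumptions: the function names are hypothetical, the nearest-rank percentile definition is one of several in common use, and percentiles are passed in hundredths of a percent (9999 means 99.99%) so the rank is computed with integer arithmetic.

```python
def percentile(sorted_ms, per_10k):
    """Nearest-rank percentile; per_10k=9900 means the 99th percentile."""
    n = len(sorted_ms)
    rank = max(1, -(-per_10k * n // 10000))  # ceiling division on integers
    return sorted_ms[rank - 1]


def meets_requirements(latencies_ms):
    """Compare measured latencies (ms) with the targets listed on the slide."""
    data = sorted(latencies_ms)
    return {
        "median_at_most_3_ms": percentile(data, 5000) <= 3,
        "p99_under_10_ms": percentile(data, 9900) < 10,
        "p99_99_under_50_ms": percentile(data, 9999) < 50,
    }


# Made-up sample: 10,000 requests with a small slow tail.
sample = [2] * 9900 + [8] * 99 + [40]
print(meets_requirements(sample))
```

The sample values are invented for illustration and are not measurements from the talk.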
  
Baseline Test Observations
• Spikes in read response times
• Less than 10% utilization of RS node CPUs
• Less than 15% utilization of RS node memory
• Block cache utilization was inefficient
  – Low hit ratio and high eviction rates
  
HBase Internals (Simplified)
[Diagram: HBase Memory (RS) holds the MemStore and the block cache; HBase Storage holds the WAL and HFiles]
  
HBase Write Path (Simplified)
[Diagram: the same components (MemStore and block cache in HBase Memory (RS); WAL and HFiles in HBase Storage), with the write steps numbered 1 to 3]

HBase Read Path (Simplified)
[Diagram: the same components (MemStore and block cache in HBase Memory (RS); WAL and HFiles in HBase Storage), with the read steps numbered 1 and 2]
  
Cache Utilization
• Low hit ratio and high eviction rates
• Frequently read data size
  – ~ 3 K
• Table block size
  – 65 K
• Proposed change
  – Reduce block size
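
A rough calculation shows why a large block hurts cache efficiency for small reads. The sketch below assumes each "get" touches a single data block, that the frequently read data is 3072 bytes (the one-year query size from the use case), and that the table block size is 64 KiB (65,536 bytes); the function name is hypothetical.

```python
def useful_fraction(read_bytes, block_bytes):
    """Share of a cached block that a single read actually needs."""
    return min(1.0, read_bytes / block_bytes)


READ_BYTES = 3072
for block_kib in (64, 16):
    fraction = useful_fraction(READ_BYTES, block_kib * 1024)
    print(f"{block_kib} KiB block: {fraction:.1%} of each cached block is useful")
```

Under these assumptions less than 5% of every cached 64 KiB block is data the query asked for, which is consistent with the low hit ratio and high eviction rates observed.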
  
Impact of Table Block Size Change
  
Avg 3.002 5.362 5.361 5.357 6.419 6.369 6.405 6.383 6.188 6.196 6.182 6.174 6.246 6.264 6.268 6.253 5.194 5.207 5.219 3.031
Median 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
95% 10 15 15 15 18 18 18 18 18 18 17 17 18 18 18 18 15 15 15 10
99% 15 26 26 26 30 30 30 30 28 28 28 28 29 29 29 29 25 24 25 15
99.90% 26 41 41 41 45 45 45 45 43 43 43 43 44 44 44 44 41 41 41 26
Max 2261 127 185 102 90 106 92 102 93 106 119 114 89 140 132 82 81 150 93 1910
BAvg 16.731 16.728 16.761 16.763 16.418 16.371 16.37 16.431 16.152 16.14 16.169 16.158 16.308 16.29 16.325 16.307 16.34 16.381 16.391 16.352
BMedian 14 14 14 14 13 13 13 13 15 15 15 15 13 13 13 13 13 13 13 13
B95% 41 41 41 41 41 41 41 41 43 43 43 43 40 40 40 40 41 41 41 41
B99% 55 55 55 55 54 54 54 54 55 55 55 55 54 54 54 54 54 54 55 54
B99.9% 71 71 71 71 70 70 70 70 67 67 67 67 71 70 70 71 71 71 71 70
BMax 545 1062 559 567 1075 1027 561 567 564 541 558 1062 1062 561 1075 1072 1067 563 1035 1032
Get Performance (ms) – 64 K Blk
Get Performance (ms) – 16 K Blk
Note: A smaller block size means more data blocks per file, which increases the number of index blocks and their overhead
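
The index overhead mentioned in the note can be estimated with simple arithmetic. This sketch assumes one index entry per data block, which simplifies the real multi-level HFile index, and uses a hypothetical 1 GiB file size; `data_blocks` is an illustrative name.

```python
def data_blocks(hfile_bytes, block_bytes):
    """Number of data blocks needed to hold a file, rounded up."""
    return -(-hfile_bytes // block_bytes)  # ceiling division


ONE_GIB = 2 ** 30
blocks_64k = data_blocks(ONE_GIB, 64 * 1024)
blocks_16k = data_blocks(ONE_GIB, 16 * 1024)
print(blocks_64k, blocks_16k, blocks_16k // blocks_64k)
```

Going from 64 K to 16 K blocks quadruples the number of blocks to index for the same amount of data, so the block size should be reduced only as far as the read pattern justifies.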
  
Memory Utilization/Latency Spikes
• JVM GC contributed to latency spikes
• Increase in heap size increased GC time
  – Prevented using all the available memory
• Proposed change: Use off-heap caching
  – Minimize spikes in response time due to GC
  – Increased utilization of node memory
  
HBase Off-Heap Caching
[Diagram: HBase Memory (RS) holds the MemStore and the block cache (L1) for index and Bloom filter data; the off-heap cache (L2, Bucket Cache) holds table data; HBase Storage holds the WAL and HFiles]
  
HBase Read Path (Simplified)
[Diagram: MemStore and block cache in HBase Memory (RS), the L2 Cache, and the WAL and HFile in HBase Storage, with the read steps numbered 1 to 4]
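
The two-tier lookup can be sketched as a pair of caches in front of the HFile read. This is a simplified model, not the BucketCache implementation: both tiers are plain LRU caches sized in block counts, index and Bloom filter blocks are assumed to go to L1 and data blocks to L2 (as the off-heap caching slide labels them), and all class and parameter names are hypothetical.

```python
from collections import OrderedDict


class LruCache:
    """Small LRU cache keyed by block id; capacity is a block count."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def get(self, key):
        if key not in self.blocks:
            return None
        self.blocks.move_to_end(key)  # mark as most recently used
        return self.blocks[key]

    def put(self, key, block):
        self.blocks[key] = block
        self.blocks.move_to_end(key)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used


class CombinedCache:
    """L1 for index/Bloom blocks, L2 for data blocks, HFile read on a miss."""

    def __init__(self, l1_capacity, l2_capacity, read_from_hfile):
        self.l1 = LruCache(l1_capacity)
        self.l2 = LruCache(l2_capacity)
        self.read_from_hfile = read_from_hfile
        self.hfile_reads = 0

    def read_block(self, key, is_data):
        tier = self.l2 if is_data else self.l1
        block = tier.get(key)
        if block is None:
            block = self.read_from_hfile(key)
            self.hfile_reads += 1
            tier.put(key, block)
        return block


cache = CombinedCache(2, 2, lambda key: f"block-{key}")
for key, is_data in [("idx", False), ("d1", True), ("d1", True),
                     ("d2", True), ("d3", True), ("d1", True), ("idx", False)]:
    cache.read_block(key, is_data)
print(cache.hfile_reads)
```

Keeping the small, frequently used index blocks in their own tier means data block churn in L2 never evicts them, which is the behaviour the example exercises.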
  
Bucket Cache Configuration
• hbase-env.sh HBASE_REGIONSERVER_OPTS parameters
  – -Xmx
  – -XX:MaxDirectMemorySize
• hbase-site.xml properties
  – hbase.regionserver.global.memstore.upperLimit
  – hfile.block.cache.size
  – hbase.bucketcache.size
  – hbase.bucketcache.ioengine
  – hbase.bucketcache.percentage.in.combinedcache
  
Bucket Cache Configuration
Item                                           | id    | Values
Total RS memory                                | Tot   |
Memstore size                                  | MSz   |
L1 (LRU) Cache                                 | L1Sz  |
Heap for JVM                                   | JHSz  |
XX:MaxDirectMemorySize                         | DMem  | Tot-MSz-L1Sz-JHSz
Xmx                                            | Xmx   | MSz+L1Sz+JHSz
hbase.regionserver.global.memstore.upperLimit  | ULim  | MSz/Xmx
hfile.block.cache.size                         | blksz | 0.8-ULim
hbase.bucketcache.size                         | bucsz | DMem+(blksz*Xmx)
hbase.bucketcache.percentage.in.combinedcache  | ccsz  | 1-((blksz*Xmx)/bucsz)
hbase.bucketcache.ioengine                     |       | offheap/”file:/localfile”
  
Bucket Cache Configuration
Item                                           | id    | Values
Total RS memory                                | Tot   | 96000
Memstore size                                  | MSz   | 2000
L1 (LRU) Cache                                 | L1Sz  | 2000
Heap for JVM                                   | JHSz  | 1000
XX:MaxDirectMemorySize                         | DMem  | 91000
Xmx                                            | Xmx   | 5000
hbase.regionserver.global.memstore.upperLimit  | ULim  | 0.4
hfile.block.cache.size                         | blksz | 0.4
hbase.bucketcache.size                         | bucsz | 93000
hbase.bucketcache.percentage.in.combinedcache  | ccsz  | 0.97849
hbase.bucketcache.ioengine                     |       | ”file:/localfile”
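
The sizing formulas from the configuration table can be written as one function and checked against the worked values. The function name is hypothetical, and the slides do not state a unit for the memory figures (MB seems likely but is an assumption). The property names are the ones shown on the slides; newer HBase releases may name or handle some of them differently.

```python
def bucket_cache_settings(total, memstore, l1_cache, jvm_heap):
    """Apply the slide formulas: Tot, MSz, L1Sz, JHSz in, settings out."""
    xmx = memstore + l1_cache + jvm_heap           # Xmx = MSz+L1Sz+JHSz
    direct = total - memstore - l1_cache - jvm_heap  # DMem = Tot-MSz-L1Sz-JHSz
    upper_limit = memstore / xmx                   # ULim = MSz/Xmx
    block_cache = 0.8 - upper_limit                # blksz = 0.8-ULim
    bucket_size = direct + block_cache * xmx       # bucsz = DMem+(blksz*Xmx)
    combined = 1 - (block_cache * xmx) / bucket_size  # ccsz
    return {
        "-Xmx": xmx,
        "-XX:MaxDirectMemorySize": direct,
        "hbase.regionserver.global.memstore.upperLimit": upper_limit,
        "hfile.block.cache.size": block_cache,
        "hbase.bucketcache.size": bucket_size,
        "hbase.bucketcache.percentage.in.combinedcache": combined,
    }


settings = bucket_cache_settings(total=96000, memstore=2000,
                                 l1_cache=2000, jvm_heap=1000)
for name, value in settings.items():
    print(name, round(value, 5))
```

With the inputs from the table (96000, 2000, 2000, 1000) the function reproduces the values on the slide, including 0.97849 for the combined cache percentage.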
  
Impact of Using Off-Heap Cache
Get Performance with L1 cache
Get Performance with L1 & L2 cache
Note: L1 cache test used 38 GB of data, L1+L2 test used 3 TB of data
  
Avg 3.872 3.995 3.936 4.007 4.052
Median 1 1 1 1 1
95% 14 14 14 15 15
99% 20 20 20 20 20
99.90% 27 27 27 28 28
99.99% 36 36 36 37 37
99.999% 208 310 332 207 232
Max 1360 1906 1736 1359 1363
807Mil797107Mil7Requests
BAvg 3.429 2.552 3.447 3.502 3.554
BMedian 2 2 2 2 2
B95% 10 8 10 10 10
B99% 18 14 18 18 18
B99.9% 30 23 30 30 31
BMax 78 1135 58 77 67
18Mil8Rows8>818Mil8Requests
  
Maximize CPU & Memory Utilization
• Run additional RS per node
• Throughput increased 50% when RS increased to 2
  – Throughput reduced on AWS cluster
  – There was no degradation in the response time
  – Throughput increase tapered after 3 RS per node
• Note: Maintenance overhead when using multi-RS
  
Known Issues
• Using “offheap” option of BucketCache prevents RS start
  – [HBASE-10643]
  – Can be mitigated using tmpfs
• LoadIncrementalHFiles doesn’t work with BucketCache
  – [HBASE-10500]
• BucketCache for different block sizes is not configurable
  – [HBASE-10641] Fixed
  
Key Takeaways
• Store what is really required
  – Understand the query pattern
  – Leverage column family (CF) to group data
• Choose appropriate block size for table/CF
• Use off-heap cache to minimize latency spikes
• Test all assumptions
  
Further Reading
• http://blog.asquareb.com/blog/2014/11/21/leverage-hbase-cache-and-improve-read-performance
• http://blog.asquareb.com/blog/2014/11/24/how-to-leverage-large-physical-memory-to-improve-hbase-read-performance
• https://issues.apache.org/jira/browse/HBASE-7404
• http://www.n10k.com/blog/blockcache-101/
• http://www.n10k.com/blog/blockcache-showdown/
  
bnair@asquareb.com
blog.asquareb.com
https://github.com/bijugs
@gsbiju

More Related Content

What's hot

Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data AnalyticsApache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
DataWorks Summit
 
Apache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingApache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data Processing
DataWorks Summit
 
Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
Benjamin Leonhardi
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
enissoz
 
Maxscale switchover, failover, and auto rejoin
Maxscale switchover, failover, and auto rejoinMaxscale switchover, failover, and auto rejoin
Maxscale switchover, failover, and auto rejoin
Wagner Bianchi
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Databricks
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
DataWorks Summit
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
Chandler Huang
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Databricks
 
Comparing Accumulo, Cassandra, and HBase
Comparing Accumulo, Cassandra, and HBaseComparing Accumulo, Cassandra, and HBase
Comparing Accumulo, Cassandra, and HBase
Accumulo Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path
HBaseCon
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
Cloudera, Inc.
 
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodRadical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Databricks
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Dremio Corporation
 
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted MalaskaTop 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
Spark Summit
 
Apache spark 소개 및 실습
Apache spark 소개 및 실습Apache spark 소개 및 실습
Apache spark 소개 및 실습
동현 강
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
MIJIN AN
 
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaHadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Cloudera, Inc.
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
Saurav Haloi
 

What's hot (20)

Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data AnalyticsApache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
 
Apache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingApache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data Processing
 
Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
Maxscale switchover, failover, and auto rejoin
Maxscale switchover, failover, and auto rejoinMaxscale switchover, failover, and auto rejoin
Maxscale switchover, failover, and auto rejoin
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
Comparing Accumulo, Cassandra, and HBase
Comparing Accumulo, Cassandra, and HBaseComparing Accumulo, Cassandra, and HBase
Comparing Accumulo, Cassandra, and HBase
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodRadical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
 
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted MalaskaTop 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
 
Apache spark 소개 및 실습
Apache spark 소개 및 실습Apache spark 소개 및 실습
Apache spark 소개 및 실습
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
 
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaHadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 

Viewers also liked

Apache Phoenix: Transforming HBase into a SQL Database
Apache Phoenix: Transforming HBase into a SQL DatabaseApache Phoenix: Transforming HBase into a SQL Database
Apache Phoenix: Transforming HBase into a SQL Database
DataWorks Summit
 
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
Cloudera, Inc.
 
HBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationHBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance Evaluation
Schubert Zhang
 
HBase 훑어보기
HBase 훑어보기HBase 훑어보기
HBase 훑어보기
beom kyun choi
 
HBaseCon 2013: A Developer’s Guide to Coprocessors
HBaseCon 2013: A Developer’s Guide to CoprocessorsHBaseCon 2013: A Developer’s Guide to Coprocessors
HBaseCon 2013: A Developer’s Guide to Coprocessors
Cloudera, Inc.
 
Using Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceUsing Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve Performace
Biju Nair
 
Websphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentalsWebsphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentals
Biju Nair
 
Project Risk Management
Project Risk ManagementProject Risk Management
Project Risk Management
Biju Nair
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar Database
Biju Nair
 
HDFS User Reference
HDFS User ReferenceHDFS User Reference
HDFS User Reference
Biju Nair
 
Netezza workload management
Netezza workload managementNetezza workload management
Netezza workload management
Biju Nair
 
Concurrency
ConcurrencyConcurrency
Concurrency
Biju Nair
 
Netezza fundamentals for developers
Netezza fundamentals for developersNetezza fundamentals for developers
Netezza fundamentals for developers
Biju Nair
 
NENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezzaNENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezza
Biju Nair
 
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
StampedeCon
 

Viewers also liked (15)

Apache Phoenix: Transforming HBase into a SQL Database
Apache Phoenix: Transforming HBase into a SQL DatabaseApache Phoenix: Transforming HBase into a SQL Database
Apache Phoenix: Transforming HBase into a SQL Database
 
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
 
HBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationHBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance Evaluation
 
HBase 훑어보기
HBase 훑어보기HBase 훑어보기
HBase 훑어보기
 
HBaseCon 2013: A Developer’s Guide to Coprocessors
HBaseCon 2013: A Developer’s Guide to CoprocessorsHBaseCon 2013: A Developer’s Guide to Coprocessors
HBaseCon 2013: A Developer’s Guide to Coprocessors
 
Using Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceUsing Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve Performace
 
Websphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentalsWebsphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentals
 
Project Risk Management
Project Risk ManagementProject Risk Management
Project Risk Management
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar Database
 
HDFS User Reference
HDFS User ReferenceHDFS User Reference
HDFS User Reference
 
Netezza workload management
Netezza workload managementNetezza workload management
Netezza workload management
 
Concurrency
ConcurrencyConcurrency
Concurrency
 
Netezza fundamentals for developers
Netezza fundamentals for developersNetezza fundamentals for developers
Netezza fundamentals for developers
 
NENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezzaNENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezza
 
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
 

Similar to HBase Application Performance Improvement

HBase Accelerated: In-Memory Flush and Compaction
HBase Accelerated: In-Memory Flush and CompactionHBase Accelerated: In-Memory Flush and Compaction
HBase Accelerated: In-Memory Flush and Compaction
DataWorks Summit/Hadoop Summit
 
Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction
HBaseCon
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme Makeover
HBaseCon
 
Build an High-Performance and High-Durable Block Storage Service Based on Ceph
Build an High-Performance and High-Durable Block Storage Service Based on CephBuild an High-Performance and High-Durable Block Storage Service Based on Ceph
Build an High-Performance and High-Durable Block Storage Service Based on Ceph
Rongze Zhu
 
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
Amazon Web Services
 
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
Joao Galdino Mello de Souza
 
HBase Operations and Best Practices
HBase Operations and Best PracticesHBase Operations and Best Practices
HBase Operations and Best Practices
Venu Anuganti
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS Scheduler
Yongseok Oh
 
Caching and tuning fun for high scalability
Caching and tuning fun for high scalabilityCaching and tuning fun for high scalability
Caching and tuning fun for high scalability
Wim Godden
 
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devicesHBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
Michael Stack
 
Sql server scalability fundamentals
Sql server scalability fundamentalsSql server scalability fundamentals
Sql server scalability fundamentals
Chris Adkin
 
Deploying ssd in the data center 2014
Deploying ssd in the data center 2014Deploying ssd in the data center 2014
Deploying ssd in the data center 2014
Howard Marks
 
HBase: Extreme makeover
HBase: Extreme makeoverHBase: Extreme makeover
HBase: Extreme makeover
bigbase
 
How Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterHow Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver Cluster
Aaron Joue
 
Optimizing columnar stores
Optimizing columnar storesOptimizing columnar stores
Optimizing columnar stores
Istvan Szukacs
 
Optimizing columnar stores
Optimizing columnar storesOptimizing columnar stores
Optimizing columnar stores
Istvan Szukacs
 
FlashSQL 소개 & TechTalk
FlashSQL 소개 & TechTalkFlashSQL 소개 & TechTalk
FlashSQL 소개 & TechTalk
I Goo Lee
 
S016827 pendulum-swings-nola-v1710d
S016827 pendulum-swings-nola-v1710dS016827 pendulum-swings-nola-v1710d
S016827 pendulum-swings-nola-v1710d
Tony Pearson
 
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Community
 
VMworld 2013: Extreme Performance Series: Storage in a Flash
VMworld 2013: Extreme Performance Series: Storage in a Flash VMworld 2013: Extreme Performance Series: Storage in a Flash
VMworld 2013: Extreme Performance Series: Storage in a Flash
VMworld
 

Similar to HBase Application Performance Improvement (20)

HBase Accelerated: In-Memory Flush and Compaction
HBase Accelerated: In-Memory Flush and CompactionHBase Accelerated: In-Memory Flush and Compaction
HBase Accelerated: In-Memory Flush and Compaction
 
Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme Makeover
 
Build an High-Performance and High-Durable Block Storage Service Based on Ceph
Build an High-Performance and High-Durable Block Storage Service Based on CephBuild an High-Performance and High-Durable Block Storage Service Based on Ceph
Build an High-Performance and High-Durable Block Storage Service Based on Ceph
 
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
 
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
 
HBase Operations and Best Practices
HBase Operations and Best PracticesHBase Operations and Best Practices
HBase Operations and Best Practices
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS Scheduler
 
Caching and tuning fun for high scalability
Caching and tuning fun for high scalabilityCaching and tuning fun for high scalability
Caching and tuning fun for high scalability
 
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devicesHBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
 
Sql server scalability fundamentals
Sql server scalability fundamentalsSql server scalability fundamentals
Sql server scalability fundamentals
 
Deploying ssd in the data center 2014
Deploying ssd in the data center 2014Deploying ssd in the data center 2014
Deploying ssd in the data center 2014
 
HBase: Extreme makeover
HBase: Extreme makeoverHBase: Extreme makeover
HBase: Extreme makeover
 
How Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterHow Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver Cluster
 
Optimizing columnar stores
Optimizing columnar storesOptimizing columnar stores
Optimizing columnar stores
 
Optimizing columnar stores
Optimizing columnar storesOptimizing columnar stores
Optimizing columnar stores
 
FlashSQL 소개 & TechTalk
FlashSQL 소개 & TechTalkFlashSQL 소개 & TechTalk
FlashSQL 소개 & TechTalk
 
S016827 pendulum-swings-nola-v1710d
S016827 pendulum-swings-nola-v1710dS016827 pendulum-swings-nola-v1710d
S016827 pendulum-swings-nola-v1710d
 
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
 
VMworld 2013: Extreme Performance Series: Storage in a Flash
VMworld 2013: Extreme Performance Series: Storage in a Flash VMworld 2013: Extreme Performance Series: Storage in a Flash
VMworld 2013: Extreme Performance Series: Storage in a Flash
 

HBase Application Performance Improvement

  • 1. HBase Cache & Performance
      Biju Nair
      Boston Hadoop User Group Meet-up
      28 May 2015
  • 2. HBase Overview
      • Key value store
      • Column family oriented
      • Data stored as byte[]
      • Data indexed by key value
      • Data stored in sorted order by key
      • Data model doesn't have to be pre-defined
      • Scales horizontally
  • 3. HBase Overview
      create 'stock', 'company', 'financials'
      Physical Storage
        … msft,company,loc,ts1,Seattle msft,company,name,ts1,Microsoft … orcl,company,loc,ts1,Redwood orcl,company,name,ts1,Oracle …
        … msft,financials,cap,ts1,379B msft,financials,pe,ts1,20 … orcl,financials,cap,ts1,190B orcl,financials,pe,ts1,18 …
      put 'stock', 'msft', 'company:name', 'microsoft'
      get 'stock', 'msft'
        company:loc,ts1,Seattle
        company:name,ts1,Microsoft
        financials:cap,ts1,379B
        financials:PE,ts1,20
  • 4. HBase Overview
      Column family files, sorted by key:
        … appl,company… … ge,company… … ibm,company… … msft,company… msft,company… … orcl,company… orcl,company… …
        … appl,financials… … ge,financials… … ibm,financials… … msft,financials… msft,financials… … orcl,financials… orcl,financials… …
      Regions:
        … appl,company… … | … ge,company… … | … ibm,company… … | … msft,company… … | … orcl,company… …
        … appl,financials… … | … ge,financials… … | … ibm,financials… … | … msft,financials… … | … orcl,financials… …
  • 5. HBase Overview
      Diagram components: Client, ZooKeeper, HBase Master, and five Region Servers
      Regions: … appl,company… … | … ge,company… … | … ibm,company… … | … msft,company… … | … orcl,company… …
  • 6. Use Case: Data and Query
      • Time series data
        – Tickers and attributes
        – Monthly data stored in a column; 256 bytes
        – Up to 20 years' worth of data
      • Queries
        – "get"s for up to 1 year of data; 3072 bytes
  • 7. Use Case: Requirements
      • Meet "get" query performance requirements
        – Under 10 ms for 99% of queries
        – Median latency 2 to 3 ms
        – 99.99% latency under 50 ms
      • Efficient HBase cluster capacity utilization
        – 32 cores per node
        – 128 GB of memory per node
        – SSD storage in all nodes
  • 8. Baseline Test Observations
      • Spikes in read response times
      • Less than 10% utilization of RS node CPUs
      • Less than 15% utilization of RS node memory
      • Block cache utilization was inefficient
        – Low hit ratio and high eviction rates
  • 9. HBase Internals (Simplified)
      HBase Memory (RS): Mem Store, Block cache
      HBase Storage: WAL, HFiles
  • 10. HBase Write Path (Simplified)
      HBase Memory (RS): Mem Store, Block cache
      HBase Storage: WAL, HFiles
      (Diagram with steps numbered 1 to 3)
  • 11. HBase Read Path (Simplified)
      HBase Memory (RS): Mem Store, Block cache
      HBase Storage: WAL, HFiles
      (Diagram with steps numbered 1 and 2)
  • 12. Cache Utilization
      • Low hit ratio and high eviction rates
      • Frequently read data size
        – ~3 K
      • Table block size
        – 65 K
      • Proposed change
        – Reduce block size
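The mismatch between the ~3 K read size and the table block size on slide 12 can be illustrated with a little arithmetic. The sketch below is not from the talk: it assumes each "get" touches a single block, takes the two block sizes as 64 KiB and 16 KiB (the sizes compared on slide 13), and uses a hypothetical 2 GiB cache purely for illustration.

```python
# Rough illustration: how much of each cached block a ~3 KB "get" actually uses.
# Assumptions (not from the slides): one block per get, a hypothetical 2 GiB cache.

READ_BYTES = 3072               # "get" size from slide 6
CACHE_BYTES = 2 * 1024 ** 3     # hypothetical cache size, illustration only


def useful_fraction(block_bytes, read_bytes=READ_BYTES):
    """Fraction of a cached block that one read actually needs."""
    return min(1.0, read_bytes / block_bytes)


def blocks_in_cache(cache_bytes, block_bytes):
    """Number of whole blocks that fit in the cache."""
    return cache_bytes // block_bytes


for block_kib in (64, 16):
    block_bytes = block_kib * 1024
    print(
        f"{block_kib} KiB block: "
        f"{useful_fraction(block_bytes):.1%} of each block is the requested data, "
        f"{blocks_in_cache(CACHE_BYTES, block_bytes)} blocks fit in the cache"
    )
```

Under these assumptions a smaller block lets the same cache hold more distinct blocks, which is consistent with the proposed change; the note on slide 13 about extra index blocks is the cost on the other side.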
  • 13. Impact of Table Blk Size Change
      Avg 3.002 5.362 5.361 5.357 6.419 6.369 6.405 6.383 6.188 6.196 6.182 6.174 6.246 6.264 6.268 6.253 5.194 5.207 5.219 3.031
      Median 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
      95% 10 15 15 15 18 18 18 18 18 18 17 17 18 18 18 18 15 15 15 10
      99% 15 26 26 26 30 30 30 30 28 28 28 28 29 29 29 29 25 24 25 15
      99.90% 26 41 41 41 45 45 45 45 43 43 43 43 44 44 44 44 41 41 41 26
      Max 2261 127 185 102 90 106 92 102 93 106 119 114 89 140 132 82 81 150 93 1910
      BAvg 16.731 16.728 16.761 16.763 16.418 16.371 16.37 16.431 16.152 16.14 16.169 16.158 16.308 16.29 16.325 16.307 16.34 16.381 16.391 16.352
      BMedian 14 14 14 14 13 13 13 13 15 15 15 15 13 13 13 13 13 13 13 13
      B95% 41 41 41 41 41 41 41 41 43 43 43 43 40 40 40 40 41 41 41 41
      B99% 55 55 55 55 54 54 54 54 55 55 55 55 54 54 54 54 54 54 55 54
      B99.9% 71 71 71 71 70 70 70 70 67 67 67 67 71 70 70 71 71 71 71 70
      BMax 545 1062 559 567 1075 1027 561 567 564 541 558 1062 1062 561 1075 1072 1067 563 1035 1032
      Table titles: Get Performance (ms) – 64 K Blk; Get Performance (ms) – 16 K Blk
      Note: Smaller block size increases the overhead of increased index blocks
  • 14. Memory Utilization/Latency Spikes
      • JVM GC contributed to latency spikes
      • Increase in heap size increased GC time
        – Prevented using all the available memory
      • Proposed change: Use off-heap caching
        – Minimize spikes in response time due to GC
        – Increased utilization of node memory
  • 15. HBase Off-Heap Caching
      HBase Memory (RS): Mem Store, Block cache (L1) for Idx & BF data
      Off-heap cache (L2) for Tbl data (Bucket Cache)
      HBase Storage: WAL, HFiles
  • 16. HBase Read Path (Simplified)
      HBase Memory (RS): Mem Store, Block cache, L2 Cache
      HBase Storage: WAL, HFile
      (Diagram with steps numbered 1 to 4)
  • 17. Bucket Cache Configuration
      • hbase-env.sh HBASE_REGIONSERVER_OPTS parameters
        – Xmx
        – XX:MaxDirectMemorySize
      • hbase-site.xml properties
        – hbase.regionserver.global.memstore.upperLimit
        – hfile.block.cache.size
        – hbase.bucketcache.size
        – hbase.bucketcache.ioengine
        – hbase.bucketcache.percentage.in.combinedcache
  • 18. Bucket Cache Configuration
      Item | id | Values
      Total RS memory | Tot |
      Memstore size | MSz |
      L1 (LRU) Cache | L1Sz |
      Heap for JVM | JHSz |
      XX:MaxDirectMemorySize | DMem | Tot-MSz-L1Sz-JHSz
      Xmx | Xmx | MSz+L1Sz+JHSz
      hbase.regionserver.global.memstore.upperLimit | ULim | MSz/Xmx
      hfile.block.cache.size | blksz | 0.8-ULim
      hbase.bucketcache.size | bucsz | DMem+(blksz*Xmx)
      hbase.bucketcache.percentage.in.combinedcache | ccsz | 1-((blksz*Xmx)/bucsz)
      hbase.bucketcache.ioengine | | offheap / "file:/localfile"
  • 19. Bucket Cache Configuration
      Item | id | Values
      Total RS memory | Tot | 96000
      Memstore size | MSz | 2000
      L1 (LRU) Cache | L1Sz | 2000
      Heap for JVM | JHSz | 1000
      XX:MaxDirectMemorySize | DMem | 91000
      Xmx | Xmx | 5000
      hbase.regionserver.global.memstore.upperLimit | ULim | 0.4
      hfile.block.cache.size | blksz | 0.4
      hbase.bucketcache.size | bucsz | 93000
      hbase.bucketcache.percentage.in.combinedcache | ccsz | 0.97849
      hbase.bucketcache.ioengine | | "file:/localfile"
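The formulas on slide 18 and the worked values on slide 19 can be checked with a short script. This is a minimal sketch, not code from the talk: the function name and dictionary keys are made up for illustration, and the slides do not state the unit of the memory figures, so the inputs are treated as plain numbers in whatever unit the four sizes share.

```python
def bucket_cache_settings(total, memstore, l1_cache, jvm_heap):
    """Derive the slide 18 settings from the four chosen sizes.

    All four arguments must be in the same unit (the slides do not state it).
    """
    xmx = memstore + l1_cache + jvm_heap                 # Xmx = MSz + L1Sz + JHSz
    direct_mem = total - memstore - l1_cache - jvm_heap  # DMem = Tot - MSz - L1Sz - JHSz
    upper_limit = memstore / xmx                         # ULim = MSz / Xmx
    block_cache = 0.8 - upper_limit                      # blksz = 0.8 - ULim
    bucket_cache = direct_mem + block_cache * xmx        # bucsz = DMem + (blksz * Xmx)
    combined_pct = 1 - (block_cache * xmx) / bucket_cache  # ccsz
    return {
        "Xmx": xmx,
        "XX:MaxDirectMemorySize": direct_mem,
        "hbase.regionserver.global.memstore.upperLimit": upper_limit,
        "hfile.block.cache.size": block_cache,
        "hbase.bucketcache.size": bucket_cache,
        "hbase.bucketcache.percentage.in.combinedcache": combined_pct,
    }


# Inputs taken from slide 19: Tot=96000, MSz=2000, L1Sz=2000, JHSz=1000.
settings = bucket_cache_settings(96000, 2000, 2000, 1000)
for name, value in settings.items():
    print(f"{name} = {round(value, 5)}")
```

With the slide 19 inputs this reproduces the values in the table (for example a bucket cache size of 93000 and a combined-cache percentage of about 0.97849), which is a quick way to sanity-check a different memory split before editing the configuration files.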
  • 20. Impact of Using Off-Heap Cache
      Table titles: Get Performance with L1 cache; Get Performance with L1 & L2 cache
      Note: L1 cache test used 38 GB of data, L1+L2 test used 3 TB of data
      Avg 3.872 3.995 3.936 4.007 4.052
      Median 1 1 1 1 1
      95% 14 14 14 15 15
      99% 20 20 20 20 20
      99.90% 27 27 27 28 28
      99.99% 36 36 36 37 37
      99.999% 208 310 332 207 232
      Max 1360 1906 1736 1359 1363
      807Mil797107Mil7Requests
      BAvg 3.429 2.552 3.447 3.502 3.554
      BMedian 2 2 2 2 2
      B95% 10 8 10 10 10
      B99% 18 14 18 18 18
      B99.9% 30 23 30 30 31
      BMax 78 1135 58 77 67
      18Mil8Rows8>818Mil8Requests
  • 21. Maximize CPU & Memory Utilization
      • Run additional RS per node
      • Throughput increased 50% when RS increased to 2
        – Throughput reduced on AWS cluster
        – There was no degradation in response time
        – Throughput increase tapered after 3 RS per node
      • Note: Maintenance overhead when using multi-RS
  • 22. Known Issues
      • Using "offheap" option of BucketCache prevents RS start
        – [HBASE-10643]
        – Can be mitigated using tmpfs
      • LoadIncrementalHFiles doesn't work with BucketCache
        – [HBASE-10500]
      • BucketCache for different block sizes is not configurable
        – [HBASE-10641] Fixed
  • 23. Key Takeaways
      • Store what is really required
        – Understand the query pattern
        – Leverage column family (CF) to group data
      • Choose appropriate block size for table/CF
      • Use off-heap cache to minimize latency spikes
      • Test all assumptions
  • 24. Further Reading
      • http://blog.asquareb.com/blog/2014/11/21/leverage-hbase-cache-and-improve-read-performance
      • http://blog.asquareb.com/blog/2014/11/24/how-to-leverage-large-physical-memory-to-improve-hbase-read-performance
      • https://issues.apache.org/jira/browse/HBASE-7404
      • http://www.n10k.com/blog/blockcache-101/
      • http://www.n10k.com/blog/blockcache-showdown/