HBase The Definitive Guide
Chapter 11: Performance Tuning




HBase The Definitive Guide
            http://shop.oreilly.com/product/0636920014348.do




Chapter 11: Performance Tuning

• Garbage Collection Tuning
• Memstore-Local Allocation Buffer
• Compression
• Optimizing Splits and Compactions
• Load Balancing
• Merging Regions
• Client API: Best Practices
• Configuration
• Load Tests




Garbage Collection Tuning

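• Young generation
   Keep it between 128MB and 512MB
   ex. -XX:MaxNewSize=128m -XX:NewSize=128m (or the shorthand -Xmn128m)

• Use the Concurrent Mark-Sweep GC for the old generation and the parallel new GC for the young generation
   ex. -XX:+UseParNewGC and -XX:+UseConcMarkSweepGC

• Concurrent Mark-Sweep Collector (CMS) start threshold
   Start the Concurrent Mark-Sweep GC once old-generation occupancy reaches 70%
   ex. -XX:CMSInitiatingOccupancyFraction=70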
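
After changing the flags above, it can help to confirm which collectors the JVM actually selected. The following is a minimal sketch using only the standard `java.lang.management` API; the class name `GcCheck` is hypothetical. On JVMs that still ship CMS (it was removed in JDK 14), running it with `-XX:+UseParNewGC -XX:+UseConcMarkSweepGC` should report collectors named `ParNew` and `ConcurrentMarkSweep`; other JVMs report whatever collector they default to.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: reports the active collectors and pool limits.
class GcCheck {
    /** Names of the garbage collectors the running JVM selected. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + collectorNames());
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage != null) {
                // getMax() returns -1 when the pool has no defined maximum
                System.out.println(pool.getName() + " max bytes: " + usage.getMax());
            }
        }
    }
}
```

The young-generation pool's maximum printed here should reflect the `-Xmn` value when that flag is set.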




Memstore-Local Allocation Buffer

• HBase Region servers: Heap fragmentation

• 0.90.0: MSLAB (MemStoreLAB)
   Addresses fragmentation caused when a Memstore is flushed to HDFS
   Settings in hbase-site.xml:
        hbase.hregion.memstore.mslab.enabled
        hbase.hregion.memstore.mslab.max.allocation
      0.90.2

• References
      http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
      http://www.slideshare.net/cloudera/hbase-hug-presentation




Compression

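• HBase supports GZIP, LZO, and Snappy compression

• After the compression setting is changed, existing data is rewritten when major_compact runs

• HBase with Snappy
    http://dayafterneet.blogspot.com/2011/09/hbasesnappy.html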




Optimizing Splits and Compactions

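• Split
   Splits and major compactions can be run manually (split, major_compact)
     hbase.hregion.max.filesize: maximum region size before an automatic split;
     set it very large to manage splits yourself
                                           ex. 100GB

• Region Hotspotting
   Load concentrated on a single Region is called Region Hotspotting
   It depends on row Key design
     Sequential IDs: NG
     Random Keys: load is spread across Regions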
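
One common way to turn sequential IDs into well-distributed keys is to prefix each ID with part of its hash. The sketch below is only an illustration under that assumption: the class name `RowKeys`, the 8-character MD5 prefix, and the `-` separator are all hypothetical choices, not something prescribed by the slides or by HBase.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a hash-prefixed row key from a sequential ID.
class RowKeys {
    /** Returns "<first 8 hex chars of MD5(id)>-<id>". */
    static String hashedKey(long sequentialId) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(
                Long.toString(sequentialId).getBytes(StandardCharsets.UTF_8));
            StringBuilder prefix = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                // mask to 0..255 so negative bytes format as two hex digits
                prefix.append(String.format("%02x", digest[i] & 0xff));
            }
            return prefix + "-" + sequentialId;
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (long id = 1; id <= 3; id++) {
            System.out.println(hashedKey(id));
        }
    }
}
```

The trade-off is that consecutive IDs no longer sit next to each other, so range scans in ID order are lost; a key can still be recomputed from its ID for point lookups.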




Presplitting Regions

• Split the table into Regions in advance, when it is created
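
Presplitting needs a list of split keys chosen before any data arrives. The following sketch computes evenly spaced boundaries, assuming (hypothetically) that row keys begin with an 8-character lowercase hex prefix that is uniformly distributed, such as a hash. The class name `SplitPoints` is made up for illustration; the resulting strings would still have to be converted to bytes and passed to the table-creation call of whatever HBase client version is in use.

```java
import java.util.Arrays;

// Hypothetical helper: evenly spaced split keys over a 32-bit hex prefix space.
class SplitPoints {
    /** Returns regions-1 boundaries that divide 00000000..ffffffff into equal ranges. */
    static String[] hexSplits(int regions) {
        if (regions < 2) {
            return new String[0]; // a single region needs no split key
        }
        String[] splits = new String[regions - 1];
        long space = 1L << 32; // number of distinct 8-hex-digit prefixes
        for (int i = 1; i < regions; i++) {
            splits[i - 1] = String.format("%08x", space * i / regions);
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(hexSplits(4)));
    }
}
```

With four regions the boundaries fall at one quarter, one half, and three quarters of the prefix space, so each Region starts out owning an equal share of the keys.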




Load Balancing

• Region balancing

     hbase.balancer.period: interval at which the balancer runs
     hbase.balancer.max.balancing: maximum time one balancer run may take
       (defaults to half of hbase.balancer.period)




Merging Regions

• Region: Regions can be combined with Merge




Client API: Best Practices

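• Java Client API
   Disable auto-flush
     buffer Puts on the client and send them with flushCommits()
   Use scanner-caching
     a scan that fetches one row per round trip is NG
   Limit scan scope
     select only the family you need
   Close ResultScanners
   Block cache usage
   Optimal loading of row keys
     use a filter when only row keys are needed
   Turn off WAL on Puts
                                   ...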
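
To see why disabling auto-flush matters, the toy model below counts simulated round trips with and without client-side buffering. It does not use the HBase API: `ToyWriteBuffer`, its row-count limit, and `rpcsFor` are hypothetical names for illustration, and the real client buffers by byte size rather than by number of rows.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: mimics the effect of auto-flush on the number of round trips.
class ToyWriteBuffer {
    private final boolean autoFlush;
    private final int bufferLimit; // rows per batch (the real client limits by bytes)
    private final List<String> pending = new ArrayList<>();
    private int rpcCount = 0;

    ToyWriteBuffer(boolean autoFlush, int bufferLimit) {
        this.autoFlush = autoFlush;
        this.bufferLimit = bufferLimit;
    }

    /** Queues one row; flushes immediately if auto-flush is on or the buffer is full. */
    void put(String row) {
        pending.add(row);
        if (autoFlush || pending.size() >= bufferLimit) {
            flushCommits();
        }
    }

    /** Sends everything queued so far as one simulated round trip. */
    void flushCommits() {
        if (pending.isEmpty()) {
            return;
        }
        rpcCount++;
        pending.clear();
    }

    int rpcCount() {
        return rpcCount;
    }

    /** Writes the given number of rows and returns how many round trips were needed. */
    static int rpcsFor(int puts, boolean autoFlush, int bufferLimit) {
        ToyWriteBuffer writer = new ToyWriteBuffer(autoFlush, bufferLimit);
        for (int i = 0; i < puts; i++) {
            writer.put("row-" + i);
        }
        writer.flushCommits(); // final explicit flush, as a client would do before closing
        return writer.rpcCount();
    }

    public static void main(String[] args) {
        System.out.println("auto-flush on:  " + rpcsFor(1000, true, 100));
        System.out.println("auto-flush off: " + rpcsFor(1000, false, 100));
    }
}
```

The final explicit flush is the important detail: with auto-flush disabled, rows still sitting in the buffer are not sent until the buffer fills or the client flushes them.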




Configuration

• HBase settings
    Decrease ZooKeeper timeout        zookeeper.session.timeout   ↓
    Increase handlers      hbase.regionserver.handler.count   ↑
    Increase heap settings ↑
    Enable data compression
    Increase region size     hbase.hregion.max.filesize   ↑
    Adjust block cache size     hfile.block.cache.size
    Adjust memstore limits
      hbase.regionserver.global.memstore.upperLimit
      hbase.regionserver.global.memstore.lowerLimit
    Increase blocking store files     hbase.hstore.blockingStoreFiles   ↑
    Increase block multiplier    hbase.hregion.memstore.block.multiplier   ↑
    Decrease maximum logfiles        hbase.regionserver.maxlogs    ↓




Load Tests

• HBase PerformanceEvaluation Tool

• YCSB Load Test
    http://dayafterneet.blogspot.com/2011/08/ycsbhbasemongodb.html

Hadoop/HBase

    File             Property                                      Value       Default     Notes
    hbase-env.sh     HBASE_HEAPSIZE                                16384       2000        HRegionServer Java Heap
    core-site.xml    fs.inmemory.size.mb                           200         75          fs in-memory (MB)
                     io.file.buffer.size                           131072                  SequenceFile
    hdfs-site.xml    dfs.namenode.handler.count                    50          10          NameNode
                     dfs.datanode.max.xcievers                     8192        256         DataNode
    hbase-site.xml   hbase.regionserver.handler.count              50          10          RegionServer
                     hbase.hregion.max.filesize                    536870912   268435456   HFile
                     hfile.block.cache.size                        0.3         0.2         HFile BlockCache (0.2 → 20%)
                     hbase.hstore.blockingStoreFiles               10          7           HStore BlockingStoreFile
                     hbase.hregion.memstore.mslab.enabled          TRUE        FALSE       mslab
                     hbase.hregion.memstore.mslab.chunksize        2097152     2097152     mslab chunksize
                     hbase.hregion.memstore.mslab.max.allocation   1024768     262144      mslab




※ HBase Chapter.11
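
Because hfile.block.cache.size is a fraction of the region server heap, the absolute cache size follows from HBASE_HEAPSIZE. The short calculation below uses the two heap fractions and the 16384 MB heap from the table; the class and method names are hypothetical.

```java
// Hypothetical helper: converts a heap fraction into megabytes.
class HeapBudget {
    /** Portion of the heap, in MB, that a fractional setting corresponds to (rounded). */
    static long fractionOfHeapMb(long heapMb, double fraction) {
        return Math.round(heapMb * fraction);
    }

    public static void main(String[] args) {
        long heapMb = 16384; // HBASE_HEAPSIZE from the table, in MB
        System.out.println("block cache at 0.3: " + fractionOfHeapMb(heapMb, 0.3) + " MB");
        System.out.println("block cache at 0.2: " + fractionOfHeapMb(heapMb, 0.2) + " MB");
    }
}
```

On this heap, raising the fraction from 0.2 to 0.3 gives the block cache roughly 1.6 GB more memory, which comes out of the space available to memstores and everything else on the heap.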
table


    Attribute      Value        Default   Notes
    BLOOMFILTER    ROW          NONE      BloomFilter for lookup (NONE/ROW/ROWCOL)
    COMPRESSION    SNAPPY       NONE      (NONE/GZ/LZO/SNAPPY)
    IN_MEMORY      true/false             HDFS

※ HBase Chapter.11
- 100 threads
※1,500,000 records
※qps (queries per second)

    Operation   HBase(100 threads - 15 nodes)   HBase(100 threads / in_memory - 15 nodes)
    insert      12468 qps                       24943 qps
    select      28125 qps                       86914 qps
    delete      12542 qps                       43630 qps




- 100 threads
※1,500,000 records
※latency μs

    Operation   HBase(100 threads - 15 nodes)   HBase(100 threads / in_memory - 15 nodes)
    insert      11943 μs                        16786 μs
    select      6059 μs                         1156 μs
    delete      11841 μs                        2651 μs




thread (in_memory)

※1,500,000 records
※qps (queries per second)
[Chart: qps for insert, select, and delete at 100, 200, 300, 400, 500, 600, and 700 threads; y-axis 0 to 350000 qps]




thread (not in_memory)

※1,500,000 records
※qps (queries per second)
[Chart: qps for insert, select, and delete at 100, 200, 300, 400, 500, 600, and 700 threads; y-axis 0 to 350000 qps]





HBase Book Reading Group Materials (Chapter 11)
