
Supporting Apache HBase : Troubleshooting and Supportability Improvements


  1. Supporting Apache HBase: Troubleshooting and Supportability Improvements
  2. © Cloudera, Inc. All rights reserved. Who we are • Daisuke Kobayashi (d1ce_) • Customer support at Cloudera since 2012, focusing specifically on HDFS and HBase • Apache HBase contributor • Toshihiro Suzuki (brfrn169) • Apache HBase committer since 2018 • Sr. Software Engineer, Breakfix (HBase/Phoenix, HDFS) at Cloudera • Wrote and published a book about HBase for beginners in Japanese
  3. Supporting HBase • Typical troubleshooting scenarios with HBase • Fix performance degradation (slowness) • Identify why a process crashed • Fix inconsistencies
  4. Agenda • General approach to HBase performance issues with existing tools • htop - Real-time monitoring tool for HBase
  5. General approach to HBase performance issues with existing tools (logs and metrics are strictly aligned to HBase 2.1 (CDH 6.2))
  6. Troubleshooting Performance Issues • Performance issues are tough! • Typical reasons • "Hot spot" region • Region with non-local data • Excessive I/O wait due to swapping or an over-worked or failing hard disk • Stop-the-world long GC pauses in RegionServers • Slowness due to high processor usage • Network saturation, etc. • Source of truth • Logs (a lot!) • Metrics (a lot!)
  7. Approach to Performance Troubleshooting • Understanding the issue • Top-down • USE method (specifically, focusing on U and S in this talk) Source - https://www.slideshare.net/brendangregg/velocity-2015-linux-perf-tools
  8. Resources and Observability in RegionServer: RPC System (Handlers / Queues), MemStore, BlockCache, HDFS Client
  9. Resources and Observability in RegionServer: RPC System (Handlers / Queues), MemStore, BlockCache, HDFS Client
  10. Resources and Observability in RegionServer • BlockCache: cache size, cache eviction ratio • MemStore: memstore size, flush size, frequency of flush, frequency of blocking updates • RPC System: frequency of requests, RPC processed time, queue length & time • HDFS Client: flush queue
  11. RPC System (Handlers / Queues), MemStore, BlockCache, HDFS Client
  12. RPC System Utilization & Saturation • Number of RPC requests • Incremented by one by the following actions at the RPC server level • doReplayBatchOp, closeRegion, compactRegion, flushRegion, getOnlineRegion, getRegionInfo, getServerInfo, openRegion, rollWALWriter, bulkLoadHFile, prepareBulkLoad, get, multi, mutate, scan "name" : "Hadoop:service=HBase,name=RegionServer,sub=Server", "totalRequestCount" : 167130, HBASE-21207 made the columns sortable! Master webui Raw metrics
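To make the "raw metrics" part concrete, here is a minimal Python sketch of pulling totalRequestCount out of a RegionServer's /jmx servlet output. The bean and metric names come from the slide; the helper function name and the parsing approach are our own illustration, and a real script would fetch the JSON over HTTP first.

```python
import json

# The MBean name shown in the slide's raw-metrics snippet.
SERVER_BEAN = "Hadoop:service=HBase,name=RegionServer,sub=Server"

def extract_metric(jmx_payload, bean_name, metric):
    """Return one metric from a /jmx JSON dump, or None if absent.

    The /jmx servlet returns {"beans": [{"name": ..., <metrics>...}, ...]},
    so we scan the bean list for the requested bean and key.
    """
    for bean in json.loads(jmx_payload).get("beans", []):
        if bean.get("name") == bean_name and metric in bean:
            return bean[metric]
    return None

if __name__ == "__main__":
    sample = json.dumps({"beans": [
        {"name": SERVER_BEAN, "totalRequestCount": 167130},
    ]})
    print(extract_metric(sample, SERVER_BEAN, "totalRequestCount"))  # 167130
```

Running this per RegionServer and comparing the values is the same check the Master webui table makes easy once the columns are sortable.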
  13. RPC System Utilization & Saturation • RPC queue length & request size "name" : "Hadoop:service=HBase,name=RegionServer,sub=IPC", "queueSize" : 619211, "numCallsInGeneralQueue" : 5, "numCallsInPriorityQueue" : 0, Queue for high-priority handlers dealing with admin requests and system table operation requests. The number of handlers is controlled by hbase.regionserver.metahandler.count Queue for normal handlers. The number of handlers is controlled by hbase.regionserver.handler.count Running count of the size in bytes of all outstanding calls, whether currently executing or queued waiting to be run. RegionServer webui
  14. RPC System Utilization & Saturation "name" : "Hadoop:service=HBase,name=RegionServer,sub=IPC", "ProcessCallTime_num_ops" : 10961, "QueueCallTime_num_ops" : 10961, Cloudera Manager chart: select ipc_process_rate, ipc_queue_rate where roleType = REGIONSERVER Raw metrics • Number of processed/queued requests • If queued > processed, it is time to check a thread dump
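The "queued vs. processed" check above is rate arithmetic over two samples of the cumulative counters, which is what the Cloudera Manager chart computes. A sketch, with hypothetical function names:

```python
def call_rates(prev, curr, interval_s):
    """Per-second process/queue rates from two snapshots of the
    cumulative IPC counters (ProcessCallTime_num_ops etc.)."""
    processed = (curr["ProcessCallTime_num_ops"]
                 - prev["ProcessCallTime_num_ops"]) / interval_s
    queued = (curr["QueueCallTime_num_ops"]
              - prev["QueueCallTime_num_ops"]) / interval_s
    return processed, queued

def handlers_look_slow(processed, queued):
    # Per the slide: if the queued rate exceeds the processed rate,
    # it is time to grab a thread dump.
    return queued > processed
```

In a healthy cluster the two rates track each other, so `handlers_look_slow` staying False is the expected state.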
  15. RPC System Utilization & Saturation • Observability Improvements • In case of slowness on a scan.next() call, the target region name was unknown in the past. • HBASE-16972 improved the logging by adding 'scandetails'. Before: 2019-03-20 19:33:11,982 WARN org.apache.hadoop.hbase.ipc.RpcServer: (responseTooSlow): {"call":"Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)","starttimems":1553110361981,"responsesize":63,"method":"Scan","param":"scanner_id: 2068237026033076679 number_of_rows: 100 close_scanner: false next_call_seq: 2 client_handles_partials: true client_handles_heartbeats: tru<TRUNCATED>","processingtimems":30000,"client":"10.1.1.6:34690","queuetimems":0,"class":"HRegionServer"} After: 2019-03-20 19:33:11,982 WARN org.apache.hadoop.hbase.ipc.RpcServer: (responseTooSlow): {"call":"Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)","starttimems":1553110361981,"responsesize":63,"method":"Scan","param":"scanner_id: 2068237026033076679 number_of_rows: 100 close_scanner: false next_call_seq: 2 client_handles_partials: true client_handles_heartbeats: tru<TRUNCATED>","processingtimems":30000,"client":"10.1.1.6:34690","queuetimems":0,"class":"HRegionServer","scandetails":"table: cluster_test region: cluster_test,19999998,1557654024101.db9b3c6211849f53e8857e55279b8d12."}
  16. RPC System (Handlers / Queues), MemStore, BlockCache, HDFS Client
  17. Memstore Utilization & Saturation • Memstore size "name" : "Hadoop:service=HBase,name=RegionServer,sub=Server", "memStoreSize" : 5372418924, "name" : "Hadoop:service=HBase,name=RegionServer,sub=Regions", "Namespace_default_table_cluster_test_region_7cdc92fd59a4f1a96b431552d952560c_metric_memStoreSize" : 18295903, "Namespace_default_table_dice2_region_155bf45f338288ae19cc0e3841a5d013_metric_memStoreSize" : 0, "Namespace_default_table_cluster_test_region_d5349e089ff8129faa1e35dee2957e27_metric_memStoreSize" : 4642160, Raw metrics RegionServer webui
  18. Memstore Utilization & Saturation Cloudera Manager chart: select total_memstore_size_across_hregions where roleType = REGIONSERVER Compare the total memstore size across RegionServers Cloudera Manager chart: select memstore_size where category = HREGION Compare regions by size within a RegionServer
  19. Memstore Utilization & Saturation • Log snippet where a flush finishes • Frequency of flush (per hour) 2019-04-13 01:28:56,376 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished flush of dataSize ~105.70 MB/110836931, heapSize ~105.85 MB/110989816, currentSize=2.94 MB/3084019 for 3db6134cedc326474801068c3cb4f2a9 in 1625ms, sequenceid=4255, compaction requested=true • dataSize: the cell data alone, key bytes and value bytes, that is going to be flushed; this can be allocated off-heap too • heapSize: the cell data on-heap along with its metadata and index (overhead of Java objects) • currentSize: the cell data alone on-heap after the flush • 3db6134cedc326474801068c3cb4f2a9: encoded region name • 1625ms: how long the flush took to complete # grep "Finished flush of" <rs_log> | grep -o "^2019-..-.. .." | uniq -c 81 2019-05-13 17 6 2019-05-13 18 113 2019-05-15 02 18 2019-05-15 04 27 2019-05-15 12 133 2019-05-15 19 5 2019-05-15 20 198 2019-05-15 22 91 2019-05-15 23
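The grep | uniq pipeline above can also be mirrored in Python if you prefer to post-process RegionServer logs programmatically; only the log format is taken from the slide, the function name is ours:

```python
import re
from collections import Counter

def flushes_per_hour(log_lines):
    """Count 'Finished flush of' lines per hour, equivalent to:
    grep "Finished flush of" <rs_log> | grep -o "^....-..-.. .." | uniq -c
    """
    hour = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2})")
    counts = Counter()
    for line in log_lines:
        if "Finished flush of" in line:
            m = hour.match(line)
            if m:
                counts[m.group(1)] += 1
    return dict(counts)
```

A sudden spike in one hour bucket points at a period worth correlating with write load or HDFS slowness.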
  20. Memstore Utilization & Saturation • Indication of blocked updates due to high memstore utilization • Global memstore > hbase.regionserver.global.memstore.size • A memstore > hbase.hregion.memstore.block.multiplier * hbase.hregion.memstore.flush.size 2019-05-13 17:12:08,001 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Blocking updates: global memstore heapsize 403.0 M is >= blocking 403.0 M (why updates were blocked) 2019-05-13 17:12:10,809 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Memstore is above high water mark and block 2808ms (how long it was blocked) 2019-05-13 17:12:10,809 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Unblocking updates for server host-10-17-101-197.coe.cloudera.com,22101,1557773899580 (blocking updates finished) 19/05/20 07:39:22 INFO client.RpcRetryingCallerImpl: Call exception, tries=7, retries=11, started=8164 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.RegionTooBusyException: Over memstore limit=128.0M, regionName=d5860b5e1a35025b6aab68dff4d944aa, server=host-10-17-101-198.coe.cloudera.com,22101,1558363100074
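The per-region blocking condition above is simple arithmetic over two configuration values. The sketch below plugs in a 128 MB flush size and a multiplier of 4 purely for illustration; your cluster's configured values may differ, and the function names are our own:

```python
def region_blocking_limit(flush_size_bytes, block_multiplier):
    """hbase.hregion.memstore.block.multiplier *
    hbase.hregion.memstore.flush.size"""
    return flush_size_bytes * block_multiplier

def updates_blocked(memstore_bytes,
                    flush_size_bytes=128 * 1024 * 1024,  # illustrative value
                    block_multiplier=4):                 # illustrative value
    # A single region's updates are blocked once its memstore reaches
    # the multiplier * flush-size limit.
    return memstore_bytes >= region_blocking_limit(flush_size_bytes,
                                                   block_multiplier)
```

With these illustrative values the limit works out to 512 MB per region, which is the kind of number to compare against the RegionTooBusyException message.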
  21. RPC System (Handlers / Queues), MemStore, BlockCache, HDFS Client
  22. Blockcache Utilization & Saturation • Current block cache usage "name" : "Hadoop:service=HBase,name=RegionServer,sub=Server", "blockCacheSize" : 406847872, "blockCacheFreeSize" : 6291459, • Cache eviction "name" : "Hadoop:service=HBase,name=RegionServer,sub=Server", "blockCacheEvictionCount" : 38257, Raw metrics RegionServer webui
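Utilization and eviction rate can be derived from the three metrics above. Treating blockCacheSize + blockCacheFreeSize as the total capacity is our simplifying assumption here, and the eviction rate follows the same two-sample pattern used for the IPC counters:

```python
def cache_utilization(used_bytes, free_bytes):
    """Fraction of the block cache in use, assuming
    blockCacheSize + blockCacheFreeSize approximates total capacity."""
    total = used_bytes + free_bytes
    return used_bytes / total if total else 0.0

def eviction_rate(prev_count, curr_count, interval_s):
    """Evictions per second from two samples of the cumulative
    blockCacheEvictionCount counter."""
    return (curr_count - prev_count) / interval_s
```

High utilization together with a high eviction rate is the "cache too small for the workload" signal the talk describes.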
  23. Blockcache Utilization & Saturation Cloudera Manager chart: select block_cache_free_size where roleType = REGIONSERVER Compare the free size across RegionServers Cloudera Manager chart: select block_cache_evicted_rate where roleType = REGIONSERVER Compare the evicted blocks ratio across RegionServers
  24. RPC System (Handlers / Queues), MemStore, BlockCache, HDFS Client
  25. HDFS Client Utilization & Saturation • Flush queue size "name" : "Hadoop:service=HBase,name=RegionServer,sub=Server", "flushQueueLength" : 0, RegionServer webui Raw metrics Cloudera Manager chart: select flush_queue_size where roleType = REGIONSERVER
  26. htop – Real-Time Monitoring Tool for HBase
  27. htop overview • HBASE-11062 htop • Work in progress! • Unix top-like tool • Real-time monitoring of HBase metrics
  28. htop motivation • HBase UIs • The metrics of the moment • Can't see the metrics in time series • Ganglia/OpenTSDB/Cloudera Manager/Ambari Metrics (via Grafana) • The metrics in time series • Collecting the latest metrics takes a little time • htop • Real-time monitoring • A lot of features for real-time monitoring
  29. htop motivation (HBase UI / Ganglia, OpenTSDB, Cloudera Manager, Ambari Metrics / htop) • Metrics of the moment: ○ / △ / ○ • Metrics in time series: ☓ / ○ / ☓ • Real-time monitoring: △ / △ / ○
  30. htop features htop screen • Command to start htop: • $ hbase top • Similar to the Unix top command • The metrics are refreshed at a certain interval – 3 seconds by default • Vertical and horizontal scrolling
  31. htop features htop screen • Demo (https://asciinema.org/a/247434)
  32. htop features Change refresh delay • Press the d key and enter a new refresh delay • We can also change the default refresh delay with a command line argument: • ex) $ hbase top -delay 2 # sets the refresh delay to 2 seconds
  33. htop features Change refresh delay • Demo (https://asciinema.org/a/247447)
  34. htop features Metrics per Namespace/Table/RegionServer/Region • Press the m key and choose a mode • Namespace mode • metrics per namespace • Table mode • metrics per table • RegionServer mode • metrics per RegionServer • Region mode (default) • metrics per region • We can also change the default mode with a command line argument: • ex) $ hbase top -mode n # starts in Namespace mode
  35. htop features Metrics per Namespace/Table/RegionServer/Region • Demo (https://asciinema.org/a/247177)
  36. htop features Choose displayed fields and change the order of fields • Press the f key and choose displayed fields (by pressing the space key) • We can also change the order of the fields in the same screen • The Right key selects a field to move, then the Left key or Enter key commits the move
  37. htop features Choose displayed fields and change the order of fields • Demo (https://asciinema.org/a/247306)
  38. htop features Sort the metrics by the field values • Press the f key and choose a sort field (by pressing the s key) • Switch between descending/ascending order by pressing the R key • Demo (https://asciinema.org/a/247180)
  39. htop features Filter with the field values • ex) NAMESPACE==default, REQ/S>1000 • Operators: = (partial match), == (exact match), >, >=, <, <=, ! • o key: Add a case-insensitive filter • O key: Add a case-sensitive filter • ctrl + o key: Show current filters • = key: Clear current filters
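To make the operator semantics concrete, here is a rough model of the matching rules listed above. This is our illustrative re-implementation, not htop's actual parser, and the ! negation operator is left out:

```python
def matches(value, op, operand, ignore_case=True):
    """Evaluate one htop-style filter term against a field value.
    '=' only needs a partial match; '==' needs an exact match."""
    if isinstance(value, str) and ignore_case:
        value, operand = value.lower(), str(operand).lower()
    if op == "=":        # partial match
        return str(operand) in str(value)
    if op == "==":       # exact match
        return str(value) == str(operand)
    if op == ">":
        return value > operand
    if op == ">=":
        return value >= operand
    if op == "<":
        return value < operand
    if op == "<=":
        return value <= operand
    raise ValueError("unknown operator: %s" % op)
```

With this model, NAMESPACE==default is `matches("default", "==", "default")` and REQ/S>1000 is `matches(req_per_s, ">", 1000)`; the ignore_case flag mirrors the o versus O key behavior.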
  40. htop features Filter with the field values • Demo (https://asciinema.org/a/247181)
  41. htop features Drill down • Namespace -> Tables • Table -> Regions • RegionServer -> Regions • Select a record (Namespace, Table or RegionServer) you want to drill down into and press the i key
  42. htop features Drill down • Demo (https://asciinema.org/a/247182)
  43. htop internals • htop gets its metrics from ClusterMetrics via Admin.getClusterMetrics() • It only needs to access the HBase Master • If we add more metrics, we first need to add them to ClusterMetrics • The JMX endpoints would expose more metrics, but they require accessing all RegionServers, which might cause scalability issues
  44. Current status of htop • Not committed yet; a work in progress • Building htop for HBase 2.x • The basic features have been implemented • The remaining tasks for htop • Some code refactoring • Adding some tests • Documentation
  45. htop in the future • Support branch-1 • Add more metrics so that we can see more information from htop • Response time metrics ASAP • Metrics per column family/user/operation (GET, PUT, SCAN, etc.) • System information like CPU usage and memory usage might be useful • Useful features from the Unix top command • Color mapping • Batch mode, etc.
  46. THANK YOU
  47. Q & A

Editor's Notes

  • First of all, let us introduce ourselves.

    My name is Daisuke Kobayashi. My teammates just call me Dice, or DiceK as a nickname. I have been working at Cloudera, based in Japan, since 2012. I'm currently working as backline support, helping customers and internal support folks resolve complicated issues. I'm also an HBase contributor.

    Hello, my name is Toshihiro Suzuki.
    I've been an HBase committer since last year.
    And I'm a Sr. Software Engineer, Breakfix in the Support team at Cloudera. I mainly handle HBase/Phoenix and HDFS cases.
    I have written and published a book about HBase for beginners in Japanese.
  • So what does supporting HBase mean at Cloudera? At Cloudera, we have a big HBase user base and the number of nodes is quite widespread, from 10 nodes to 100, even 1000 nodes. They report various types of issues to our support team every single day and our job is simple: just fix the issue and answer their questions. If I could summarize the problems reported by customers, these are the typical scenarios we usually see: fixing performance degradation, identifying why a process crashed, and also fixing inconsistencies, which is a well-known issue in both HBase 1 and 2. But in this talk, we will specifically focus on the first one.

  • From my side, I'm gonna introduce the general approach to performance issues and show the existing tools we usually use in the context of HBase troubleshooting. Later on, my colleague Toshi will be talking about a new tool he's now developing. It's more intuitive and efficient for troubleshooting in real time.
  • So, fixing performance issues is tough. This is because the number of nodes differs across customers; they run different versions with different configurations, different types of datasets and different use cases. They are all different.

    Various types of factors can lead to performance issues: misconfigurations in HBase, or unbalanced load on RegionServers, also known as a hot spot, caused by bad schema design. Also, all RegionServers should be collocated with DataNodes, and if a particular region's block doesn't exist on the local DataNode, it has to read the data remotely from other DataNodes. Apart from that, there might be bad OS configuration, GC issues, hardware failures or network-related issues.

    Another thing that makes these issues difficult to troubleshoot is that a lot of information about how the HBase cluster performs is exposed through logs and metrics. Whenever we analyze problems, we have to pick the right log snippets and metrics to correlate with the root cause. In order to take advantage of the logs and metrics, we obviously need to understand what they actually mean, why they are logged, and when a particular metric is incremented. It's also important to understand what they are not.

    For core HBase developers, these questions may be easy to answer, but HBase is widespread and used by many users across various industries. Over the last couple of years, I have been asked about the meaning of given metrics and log snippets over and over. So the aim of my talk is to share this basic information to help others narrow down problems and dig in further.
  • So, to start performance troubleshooting, I think this is the typical and important approach. First off, we need to listen to customers in order to understand what they are complaining about, what they are hitting, and what they want to resolve. This is the very first and important step to be on the same page with them. In order to narrow down performance issues, in general we should look at the system with a top-down approach. Specifically in HBase, we first look at the cluster itself and see how resource usage is distributed across nodes. If something looks wrong on a particular node, we dig into that node. Throughout the troubleshooting, I like using the USE method, originally defined by Brendan Gregg at Netflix, an ex-Sun guy.

    The USE method is designed like an emergency checklist in a flight manual. So it's intended to be simple, straightforward, complete, and fast. USE stands for Utilization, Saturation, and Errors. Utilization carries the question: how busy is the particular resource? Saturation can be measured as the length of a wait queue, or time spent waiting in the queue. Errors are explicit indications of something going wrong. Obviously the USE method is not perfect, but it can be used as the very first checklist to identify the bottleneck as quickly as possible.

    So, the next question is: what are the resources in HBase? You know the RegionServer is the worker role, responsible for processing read and write requests.
  • These are the typical resources in a single RegionServer.
  • All user requests come into the RPC system first, where they are queued and processed by handlers concurrently. For caching, writes go to the memstore and reads go to the block cache. The data is persisted to HDFS under certain conditions. As you know, the requests always flow in the direction of the orange arrow, which means we should follow the same path when checking resources.
  • So what types of information are exposed by each resource? For example, the RPC system exposes the number of requests and how many requests are getting queued and processed. The memstore exposes the memstore size, the size of flushed memstores, and the frequency of flushes. Using these observability items, we can check how each resource is utilized and saturated. In the next slides, let's walk through each resource one by one.
  • First, the RPC system
  • From this slide on, I'm gonna show you the metrics, webui, and logs used for troubleshooting. Please note that all of these are aligned to the HBase 2.1 code base, more specifically CDH 6.2. As I mentioned, the RPC system is the place where all client requests arrive. So, we should be able to check how many requests are received by every single RegionServer. Here in the gray area, I'm showing the raw metric that is exposed via the JMX endpoint on a particular RegionServer. The total request count is also exposed through the Master and RegionServer webui. We can simply compare the requests across RegionServers. If there's an outstanding value, it's a chance to narrow down to the particular RegionServer.

    If you have been managing HBase and are familiar with these webuis, you may be aware that the columns in the table are sortable. This is a simple but powerful change. We often have a screen sharing session with a customer to see the issue in a real-time fashion. Every time we looked at these webuis, it was difficult to figure out the highest or lowest servers without doing something tricky. So this sorting functionality should make our life easier.

    This number is incremented by the various types of request calls at the RPC server level, as described in the slide.
  • Next, to understand the saturation, the number of requests queued at a particular point in time is exposed. That is what I'm showing in the gray area as raw metrics, with the corresponding values in the webui below. As the meta table is usually accessed more frequently than others, it's isolated from the queue for normal regions. If the queue size is constantly growing, it may indicate something going wrong in processing the requests.
  • We can check how many requests have been processed and queued so far by the RPC system. I'm showing the raw metric value in the gray area. Since it's just an incremental value, Cloudera Manager converts it into a rate, which makes it easy to understand how things are going over time. Ideally, both processed and queued should be the same. The processed is the blue graph and the queued is the green one in this example. We can see both match exactly since things are going well. If the queued becomes bigger than the processed, it's a sign of the RPC handlers getting slow for some reason. We should check the thread dump to dig in further.
  • If the RPC system takes longer than 10 seconds to respond to a given request, it logs the table and the region name in the process logs. However, in the case where a scan next call is slow, neither the target region name nor the row key was logged, so we were really frustrated while troubleshooting. Fortunately, recent versions improved this by logging the scan details, as I'm showing with the green marker in the second example. With this hint, we should be able to narrow down to the particular region to see why it's slow.
  • Alright, next let’s take a look at memstore.
  • Memstore utilization is exposed at several levels: server, table, and region. Here I'm showing the server- and region-level raw metrics along with the corresponding webui. I think it's fairly easy to understand the memstore utilization.
  • When using Cloudera Manager, we typically use this sort of query to compare the total memstore utilization across RegionServers, as the upper graph indicates. We can also check whether there's any outstanding region which utilizes more memstore than other regions in a single RegionServer, which is in the lower graph.
  • Flush persists data in the memstore to the underlying HDFS, which means the memstore is fully utilized, or most likely saturated. This is an example of a log snippet where a flush finishes. In HBase 2, data can be allocated off-heap for both read and write. Given this, the log reports the pure key-value data size and the on-heap occupation separately. It's also showing how long the flush took. These numbers should be informative for seeing how a particular flush goes. If it takes longer, it may be time to look at the HDFS performance too.

    Using this granular logging of flush, we can see the frequency of flush activity on a regionserver. In this example, I'm grouping the output on an hourly basis.
  • If the total memstore size across regions in a single RegionServer goes beyond the global memstore size limit, all updates are blocked by the RegionServer until utilization drops below the threshold. This is a typical log message in HBase 2.1.

    There are three correlated lines. The first line indicates that blocking updates started because the global memstore size became greater than the blocking threshold. The second line shows how long it took, and the third line indicates that blocking completed.

    In the second example, the client gets a RegionTooBusyException for a particular region. This is because this region has a memstore that is too big and not yet flushed. This is also a typical indication of saturation of a specific memstore.
  • In the context of the block cache, utilization is simply cache usage, which is available via raw metrics and also via the webui. If a cache block is evicted, in general, it means the cache is saturated. I'm showing the raw metrics on the left-hand side and the corresponding webui information on the right-hand side. From the top, it indicates how much of the block cache is used, what the remaining memory for cache is, and the number of evicted blocks.
  • Using Cloudera Manager, we can check the eviction rate, which is converted from the raw metric value. I'm showing an example in the graph below. If utilization is high enough but the eviction rate is also high, it's a sign that the block cache size is too small to handle the current workload appropriately. So it's time to think about increasing the cache size.
  • Alright, I’m gonna quickly cover the last resource in the picture. The HDFS resource utilization and saturation are basically tracked at the HDFS level metrics and logs. So I can't talk much in this session, but I am gonna show one related metric exposed at the HBase level.
  • That's the flush queue size. When flushing a memstore, it's queued first and persisted to HDFS later. The queue is maintained at the RegionServer level and exposed as a metric through the webui. It's visible through a Cloudera Manager chart as well. Typically, the queue shouldn't grow, so if it's constantly growing, it denotes that flushes are failing or getting slow for some reason. So it's time to look at the HDFS side.

    That's pretty much all I have prepared for this presentation. Alright, I have been talking about how to look at the resources in HBase and their utilization and saturation, mainly from metrics and sometimes from logs. I'm pretty sure that I couldn't cover everything. We have to look further using a different approach if we can't find anything bad with this one, but I hope you could find an idea in my talk. Next, Toshi is gonna give a presentation about a new tool which should make our life better.
  • From my side, I’m going to talk about htop that’s a Real-Time Monitoring Tool for HBase.
  • So, an overview of htop.
    htop is the tool I'm developing now, tracked in the JIRA ticket HBASE-11062.
    This is a Unix top-like tool, and we can do real-time monitoring of the HBase metrics with it.
  • And, the motivation of htop.
    As Dice mentioned, a first approach when we are facing performance issues is to check the current status of the cluster.
    At this time, we can see HBase UIs to check the metrics. And it shows the metrics of the moment, but we can't see them in time series from it.
    If you want to see the metrics in time series, we have Ganglia, OpenTSDB, Cloudera Manager and Ambari Metrics. In Ambari Metrics, we can see the metrics via Grafana. They are useful when we want to see the metrics in time series, but if you're going to do real-time monitoring, they are not very useful because collecting the latest metrics takes a little time in those tools.
    For real-time monitoring, I have started to develop htop.
    I’ll explain the features of htop later in this talk.
  • To clarify the position of htop, I made this matrix of the features of those tools.
    If you just want to see the metrics of the moment, you can use any tool of them.
    However, in Ganglia, OpenTSDB, Cloudera Manager and Ambari Metrics, collecting the latest metrics takes a little time.
    If you want to see the metrics in time series, you need to use Ganglia, OpenTSDB, Cloudera Manager or Ambari Metrics.
    And If you want to do real-time monitoring, htop is the most useful of them as it has a lot of features to do that.
  • From here, I will talk about the features of htop with demonstrations.

    Firstly, about htop screen.
    We can start htop by running hbase top command.
    The UI is similar to Unix top command.
    The metrics are refreshed in a certain period – 3 seconds by default
    And you can do vertical and horizontal scrolling.
  • I’ll show you demo of htop screen.
    Actually, this is not a live demo, but a terminal recording.
    And we can see this demo anytime in this URL.

    To start htop, run hbase top command.
    This is the screen of htop.
    The metrics in this screen are refreshed per 3 seconds.
    It consists of 2 parts, Summary part and Metrics part.
    In Summary part, you can see the HBase version, cluster ID, the number of region servers, the region count, Average Cluster Load and aggregated Request count per second.
    In Metrics part, you can see the metrics. In this case, you can see the metrics per region and it shows the namespace name, table name, encoded region name, RegionServer name, request count per second, read request count per second and so on.
    You can scroll down to see all metrics like this. you can also do horizontal scrolling like this.
  • As mentioned, the refresh delay is 3 seconds by default.
    But you can change it by pressing ‘d’ key and put the new refresh delay.
    And we can also change the default refresh delay by specifying a command line argument “-delay”
  • I’ll show you the demo of it.

    If you press ‘d’ key in htop screen, you can put a new refresh delay.
    In this demo, we try to change it to 1 second.
    Yeah, it has been changed.
  • And next.
    Currently, htop can show the metrics per Namespace, Table, RegionServer and Region.
    And they are called respectively Namespace mode, Table mode, RegionServer mode and Region mode.
    The default is region mode.
    We can change this mode by pressing ‘m’ key in htop screen.
    And we can also change the default mode by specifying a command line argument “-mode”
  • So, I’ll show you demo of it.

    Now you see the metrics per region, and you can switch to Namespace, Table, or RegionServer mode by pressing the ‘m’ key.
    For example, you can see the metrics per Namespace like this, or the metrics per Table like this.
  • In addition, we can choose which fields are displayed on the screen.
    By pressing the ‘f’ key, you can choose the displayed fields.
    You can also change the order of the fields on the same screen.
  • I’ll show you a demo of it.

    By pressing the ‘f’ key, you move to this screen, where you can choose the displayed fields.
    For now, in Region mode, the fields shown here can be displayed.
    For example, if you don’t need the Namespace and Table fields but you do need the Region name field, you can remove and add those fields like this.
    As you can see, the fields are removed and added.

    You can also change the order of the fields on the same screen.
    Go back to the screen by pressing the ’f’ key,
    select the field you want to move, and press the Right key.
    Then move the field to wherever you want it and press the Left key.
    You can see that the order of the fields has changed.
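    The field add/remove/reorder behaviour amounts to simple list operations. Here is a minimal Python sketch under that assumption; the field names (`NAMESPACE`, `#REQ/S`, etc.) are illustrative, not htop’s exact column labels.

```python
def toggle_field(fields, name):
    """Hide the field if shown, otherwise show it (appended at the end)."""
    if name in fields:
        fields.remove(name)
    else:
        fields.append(name)
    return fields

def move_field(fields, name, new_index):
    """Pick up a field (Right key) and drop it at a new position (Left key)."""
    fields.remove(name)
    fields.insert(new_index, name)
    return fields

shown = ["NAMESPACE", "TABLE", "REGION", "#REQ/S"]
toggle_field(shown, "NAMESPACE")    # hide the Namespace field
toggle_field(shown, "REGION_NAME")  # show the Region name field
move_field(shown, "#REQ/S", 0)      # move request count to the front
print(shown)  # → ['#REQ/S', 'TABLE', 'REGION', 'REGION_NAME']
```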
  • It’s also possible to sort the metrics by field values.
    You can switch between descending and ascending order by pressing the ‘R’ key.
    I’ll show you a demo of it.

    Press the ‘f’ key to move to the previous screen; you can also choose a sort field on the same screen.
    If you want to sort the metrics by “Request count per second,”
    choose that field and press the ‘s’ key.
    The current sort field changes to “Request count per second,”
    and you can see the metrics sorted by that field.
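    The sort behaviour above can be sketched as follows. This is a hedged illustration: descending-by-default (as in Unix top) is an assumption here, and the record layout is hypothetical.

```python
def sort_metrics(records, field, ascending=False):
    # descending by default, as with Unix top; 'R' would flip `ascending`
    return sorted(records, key=lambda r: r[field], reverse=not ascending)

regions = [
    {"REGION": "r1", "#REQ/S": 120},
    {"REGION": "r2", "#REQ/S": 1500},
    {"REGION": "r3", "#REQ/S": 40},
]
# sort field chosen with 's' = "#REQ/S"
print([r["REGION"] for r in sort_metrics(regions, "#REQ/S")])
# → ['r2', 'r1', 'r3']
print([r["REGION"] for r in sort_metrics(regions, "#REQ/S", ascending=True)])
# → ['r3', 'r1', 'r2']
```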
  • Next is the Filter feature, which is very important.
    For example, if you want to see the metrics of the “default” Namespace only, you can specify the filter NAMESPACE==default.
    Or, if you want to see only the metrics with more than 1000 requests per second, you can specify a filter like REQ/S>1000.
    The Filter feature supports the general comparison operators, like these:

    When you press the ‘o’ key on the htop screen, you can add a case-insensitive filter.
    When you press the ‘O’ key, you can add a case-sensitive filter.
    When you press Ctrl+‘o’, you can see the current filters.
    And when you press the ‘=’ key, you can clear the current filters.
  • Let me show you a demo of it.

    If you want to see the metrics in the “default” Namespace only, press the ’o’ key and specify a filter like this.
    As you can see, only the metrics in the “default” Namespace are shown now.
    If you also want to see only the metrics of the ”test” table, press the ’o’ key again and add a filter like this.
    Now only the metrics in the “default” Namespace and the “test” table are shown.
    Furthermore, if you want only the metrics with more than 1000 requests per second, you can add a filter like this.
    Now we see only the metrics with more than 1000 requests per second.

    You can see the specified filters by pressing Ctrl+‘o‘ like this.
    These are the current filters.

    You can clear the current filters by pressing the ‘=’ key like this.
    The filters are cleared.
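    Stacked filters like the ones in this demo boil down to parsing “field, operator, value” expressions and intersecting the matches. The sketch below illustrates that idea; the exact operator set and filter syntax of htop may differ, and the records are made up.

```python
import operator
import re

# illustrative operator table; htop's real set may differ
OPS = {"==": operator.eq, "!=": operator.ne,
       ">": operator.gt, ">=": operator.ge,
       "<": operator.lt, "<=": operator.le}

def parse_filter(expr):
    """Split e.g. 'REQ/S>1000' into (field, operator, value)."""
    field, op, value = re.match(r"(.+?)(==|!=|>=|<=|>|<)(.+)", expr).groups()
    return field, OPS[op], value

def apply_filters(records, exprs):
    """Keep only the records matching every filter expression."""
    for expr in exprs:
        field, op, value = parse_filter(expr)
        # coerce the filter value to the field's type before comparing
        records = [r for r in records
                   if op(r[field], type(r[field])(value))]
    return records

regions = [
    {"NAMESPACE": "default", "TABLE": "test", "REQ/S": 1500},
    {"NAMESPACE": "default", "TABLE": "foo",  "REQ/S": 200},
    {"NAMESPACE": "hbase",   "TABLE": "meta", "REQ/S": 5000},
]
hot = apply_filters(regions, ["NAMESPACE==default", "REQ/S>1000"])
print([r["TABLE"] for r in hot])  # → ['test']
```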
  • The last feature I’d like to introduce here is the drill-down feature.
    You can drill down from a Namespace to its Tables, from a Table to its Regions, or from a RegionServer to its Regions.
    With this feature, you can find a “Hot Spot” region easily.
    You drill down by selecting the record you are interested in and pressing the ‘i’ key.
  • I’ll show you a demo of it.

    If you want to drill down from the “default” Namespace to its tables,
    move to Namespace mode,
    select the “default” Namespace, and press the ‘i’ key.
    You can then see the metrics for the tables in the “default” Namespace.

    Furthermore, if you want to drill down from the “test” table to its regions,
    select the “test” table and press the ‘i’ key,
    and you can see the metrics for the regions of the “test” table.

    Similarly, you can drill down from a RegionServer to its regions.
    Move to RegionServer mode, select one of the RegionServers, and press the ‘i’ key.
    You can then see the metrics for the regions on the selected RegionServer.
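    Conceptually, drilling down means switching to the next-finer mode with an implicit filter on the selected record. The following sketch shows that idea only; the mode names, transitions, and record fields here are illustrative, not htop’s internals.

```python
# assumed parent-mode → child-mode transitions, per the talk:
# Namespace → Table, Table → Region, RegionServer → Region
DRILL = {"NAMESPACE": "TABLE", "TABLE": "REGION", "REGIONSERVER": "REGION"}

def drill_down(records, mode, key_field, selected):
    """Pressing 'i' on a record: switch mode, keep only its children."""
    child_mode = DRILL[mode]
    return child_mode, [r for r in records if r[key_field] == selected]

regions = [
    {"NAMESPACE": "default", "TABLE": "test", "REGION": "r1", "REQ/S": 1500},
    {"NAMESPACE": "default", "TABLE": "test", "REGION": "r2", "REQ/S": 40},
    {"NAMESPACE": "hbase",   "TABLE": "meta", "REGION": "r3", "REQ/S": 90},
]
mode, rows = drill_down(regions, "NAMESPACE", "NAMESPACE", "default")
print(mode, [r["REGION"] for r in rows])  # → TABLE ['r1', 'r2']
```

    Combining this with descending sort on the request count is what makes a “Hot Spot” region stand out quickly.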

    That’s it for the demonstrations of the features of htop.
  • Next, let me talk about the internals of htop.
    Currently, htop gets its metrics from the ClusterMetrics class, via the Admin.getClusterMetrics method, because that way it only needs to access the HBase Master.
    So if we want to add more metrics to htop, we first need to add them to the ClusterMetrics class.
    The JMX endpoints would actually give us more metrics, but they require accessing all the RegionServers, which might cause scalability issues.
    So I decided not to use the JMX endpoints for htop.
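    One consequence of the design above is that the coarser modes can be computed client-side by rolling up the per-region data the Master already returns, without contacting any RegionServer. Here is a hedged sketch of that roll-up; the field names are illustrative and do not mirror the real ClusterMetrics API.

```python
from collections import defaultdict

def aggregate(region_metrics, group_by):
    """Roll per-region request counts up to Table or Namespace mode."""
    totals = defaultdict(int)
    for r in region_metrics:
        totals[r[group_by]] += r["REQ/S"]
    return dict(totals)

# per-region data, as obtainable from the Master alone
regions = [
    {"NAMESPACE": "default", "TABLE": "test", "REQ/S": 1500},
    {"NAMESPACE": "default", "TABLE": "foo",  "REQ/S": 200},
    {"NAMESPACE": "hbase",   "TABLE": "meta", "REQ/S": 90},
]
print(aggregate(regions, "TABLE"))      # → {'test': 1500, 'foo': 200, 'meta': 90}
print(aggregate(regions, "NAMESPACE"))  # → {'default': 1700, 'hbase': 90}
```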
  • On this slide, I’ll talk about the current status of htop.
    As mentioned, htop hasn’t been committed yet; it’s still a work in progress.
    However, the basic features have been implemented, as I showed you in the demonstrations.
    The remaining tasks are some code refactoring and adding tests. I also need to write documentation for it.
    It will probably be ready for review next month, and once the review passes, it will be committed.
  • And htop in the future.
    Currently, I’m developing this tool against the master branch and branch-2, so as a next step, we need to support branch-1.
    We should also add more metrics so that we can see more information in htop.
    In particular, adding response-time metrics is required, because they are very important for performance troubleshooting.
    We could also add metrics per Column Family, User, and Operation, such as GET, PUT, and SCAN.
    I’m also thinking about adding system information like CPU and memory usage, which might be useful.
    In addition, we could add useful features from the Unix top command, like color mappings or a batch mode.
  • That’s all from my side. We hope this presentation was informative for you. Thank you very much.
  • We have a few minutes for Q & A. Any Questions?
