SlideShare a Scribd company logo
HBase Metrics: What they mean to you

             David S. Wang
           dsw@cloudera.com
               5/22/12

                     HBaseCon 2012. 5/22/12
          Copyright 2012 Cloudera Inc. All rights reserved
Agenda

•   Motivation
•   Where to get metrics
•   Collection methods
•   Metrics:
    •   HBase Operations
    •   Memory-related
    •   Latency-related
    •   Host-level
• Takeaways

                                 HBaseCon 2012. 5/22/12                  2
                      Copyright 2012 Cloudera Inc. All rights reserved
Motivation

• To understand the steady-state
  behavior of your HBase cluster
• To debug when something bad
  happens
• To help evaluate changes
• To figure out when you need
  to buy more hardware




                             HBaseCon 2012. 5/22/12                  3
                  Copyright 2012 Cloudera Inc. All rights reserved
Where to get metrics
       JVM                                                JVM
            Region server
                                                                  Master
               Region
                Region

                                                         JVM

                                                                 HDFS
    Metrics can be obtained                                    DataNode
    from all of these areas

     Host
                CPUs          Memory                   Disks                Network



                                    HBaseCon 2012. 5/22/12                            4
                         Copyright 2012 Cloudera Inc. All rights reserved
Collection methods

                                          •     Dump to file
                                          •     web UI /jmx
                                          •     jconsole
                                          •     Ganglia
                                          •     Cloudera Manager
                                          •     Many other tools out
                                                there – suggestions
                                                welcome


                         HBaseCon 2012. 5/22/12                        5
              Copyright 2012 Cloudera Inc. All rights reserved
HBase Operations: compactions




  • hbase.regionserver.compactionQueueSize – size of queue of compactions
    waiting to be processed
  • h.r.compactionSize_avg_time – average bytes processed per compaction
  • h.r.compactionTime_avg_time – average time per compaction in
    milliseconds
  • h.r.compactionSize_num_ops, h.r.compactionTime_num_ops – total
    number of compactions so far
  • TIP: Spikes in h.r.compactionQueueSize could mean all of your regions
    are growing at the same rate and need to split/compact at around the
    same time – time to presplit your regions or turn off auto-compactions
                                 HBaseCon 2012. 5/22/12                  6
                      Copyright 2012 Cloudera Inc. All rights reserved
HBase Operations: flushes




  • h.r.flushQueueSize – size of queue of flushes waiting to be
    processed
  • h.r.flushSize_avg_time – average bytes processed per flush
  • h.r.flushTime_avg_time – average time per flush in milliseconds
  • h.r.flushSize_num_ops, h.r.flushTime_num_ops – total number of
    flushes so far
  • TIP: Small h.r.flushSize_avg_time and large flushQueueSize mean
    premature flushes: may indicate that you need more RAM, or you
    are ingesting and flushing faster than your disks can handle
                               HBaseCon 2012. 5/22/12                  7
                    Copyright 2012 Cloudera Inc. All rights reserved
HBase Operations: region splits




  • h.r.regionSplitFailureCount – number of unsuccessful splits
  • h.r.regionSplitSuccessCount – number of successful splits
  • TIP: Spikes in h.r.regionSplitSuccessCount may mean that you are
    spending too much time splitting – think about pre-splitting your
    regions
  • TIP: Sustained high rates of h.r.regionSplitSuccessCount may mean
    that you are ingesting fast enough to exceed your region size
    configuration – change the configuration and/or presplit/disable
    automatic splits
                                HBaseCon 2012. 5/22/12                  8
                     Copyright 2012 Cloudera Inc. All rights reserved
HBase Operations: stores and storefiles




  • h.r.stores – total number of stores
  • h.r.storefiles – total number of storefiles
  • TIP: Large ratio between number of h.r.stores and
    h.r.storefiles may indicate it’s time to compact more
    frequently – otherwise read performance can suffer


                                HBaseCon 2012. 5/22/12                  9
                     Copyright 2012 Cloudera Inc. All rights reserved
Memory: memstores




  • h.r.memstoreSizeMB – sum of all the memstore sizes in megabytes
  • h.r.numPutsWithoutWAL – number of puts with
    setWriteToWAL(false)
  • h.r.mbInMemoryWithoutWAL – amount of data from puts with
    setWriteToWAL(false) in megabytes
  • TIP: Anytime you have non-zero values in h.r.numPutsWithoutWAL
    and h.r.mbInMemoryWithoutWAL, you risk data loss if the RS
    crashes, until the next flush when they go back down to zero.
    Useful to detect applications that setWriteToWAL(false).

                               HBaseCon 2012. 5/22/12                  10
                    Copyright 2012 Cloudera Inc. All rights reserved
Memory: block cache




  • h.r.blockCacheCount - number of blocks in block cache
  • h.r.blockCacheEvictedCount – number of blocks evicted in the last period
  • h.r.blockCacheFree, h.r.blockCacheSize – free/occupied space in block
    cache in bytes
  • TIP: High sustained levels of h.r.blockCacheEvictedCount mean your block
    cache is turning over due to heap size constraints – possibly more GCs
      • Get more memory
      • Repartition heap amongst the various processes on the host – more for RS


                                    HBaseCon 2012. 5/22/12                         11
                         Copyright 2012 Cloudera Inc. All rights reserved
Memory: block cache




  • h.r.blockCacheHitCount, h.r.blockCacheMissCount – number of cache
    hits/misses
  • h.r.blockCacheHitRatio/h.r.blockCacheHitRatioPastNPeriods – hit ratio (cache
    hits/total requests) for past period, past N periods
  • h.r.blockCacheHitCachingRatio/h.r.blockCacheHitCachingRatioPastNPeriods
    – hit caching ratio (cache hits from requests set to use block cache/total
    requests set to use block cache) for past period, past N periods
  • TIP: With sustained traffic, in general you want your block cache to be fully
    utilized for maximum performance – high h.r.blockCacheHitRatio and
    h.r.blockCacheHitCachingRatio

                                   HBaseCon 2012. 5/22/12                      12
                        Copyright 2012 Cloudera Inc. All rights reserved
Memory: JVM-specific metrics




  • GC
     • TIP: Correlate with “concurrent mode failure” or “promotion failed” in the
        GC logs for GC pauses
  • Heap usage
     • TIP: Do not want this to max out – add more memory or assign more heap
        to your RS
  • Number of logs categorized by level
     • TIP: Be careful of any WARN or ERROR logs – helpful first indicator to look
        more deeply in logs
  • Threads blocked/running

                                   HBaseCon 2012. 5/22/12                        13
                        Copyright 2012 Cloudera Inc. All rights reserved
Latency: FS read




  • h.r.fsReadLatency_avg_time – average time per sequential read in ms
  • h.r.fsReadLatency_num_ops – number of sequential reads
  • h.r.fsPreadLatency_avg_time – average time per positional read in ms
  • h.r.fsPreadLatency_num_ops – number of positional reads
  • Various histograms based on this – 75th, 95th percentile, mean, standard
    deviation, etc., in nanoseconds
  • TIP: Spikes in h.r.fsReadLatency_avg_time/h.r.fsPreadLatency_avg_time
    can indicate HDFS/disk/network problems
                                  HBaseCon 2012. 5/22/12                       14
                       Copyright 2012 Cloudera Inc. All rights reserved
Latency: FS write/sync
  • h.r.fsWriteLatency_avg_time – average time per HLog edit write in ms
  • h.r.fsWriteSize_avg_time – average size per HLog edit write in bytes
  • h.r.fsWriteLatency_num_ops, h.r.fsWriteSize_num_ops – number of HLog
    edit writes
  • h.r.fsSyncLatency_avg_time – average time to sync HLogs to the filesystem
    in ms
  • h.r.fsSyncLatency_num_ops – number of HLog syncs
  • h.r.slowHLogAppendCount – number of HLog edits that took longer than 1
    second
      TIP: Search for a WARN-level log message with the text “appending an edit to hlog”;
      use that time to spelunk through logs to see what else is going on
  • Various histograms based on this – 75th, 95th percentile, mean, standard
    deviation, etc., in nanoseconds
  • TIP: Spikes in write or sync latencies can also be due to HDFS/bad
    disks/bad network

                                     HBaseCon 2012. 5/22/12                             15
                          Copyright 2012 Cloudera Inc. All rights reserved
Latency, region: per-operation-type, per-region
  • From regionserver metrics: h.r.*RequestLatency - latency histograms for
    various client operations in nanoseconds
  • Also general RPC metrics for all requests from clients
      • rpc.metrics.RPCQueueTime is time it takes from receipt of the RPC until it starts
        being processed
      • rpc.metrics.RPCProcessingTime is time it takes from start to end of processing
      • TIP: These are some of the first places to look if your client seems to be running
        slow
  • h.r.readRequestsCount – requests for gets, scanners. Not # of rows.
  • h.r.writeRequestsCount – requests for puts, deletes, etc. Not # of rows.
  • hbase.RegionServerDynamicStatistics.* contains some metrics on a per-
    region and per-column-family basis
      • Subset of block cache, compaction, flush, store/storefile sizes, etc.
  • TIP: These metrics can be used to answer the questions:
      • Is the problem only in one region or in all regions?
      • Is the problem only with one operation type or all operations?


                                      HBaseCon 2012. 5/22/12                                 16
                           Copyright 2012 Cloudera Inc. All rights reserved
Host-level metrics
  • CPU:
     •   Load averages
     •   % idle/system/user/wio (waiting for block I/O)
     •   Running/total processes
     •   Correlate with “top”, “sar”, “mpstat” to figure out which process is
         doing what.
           • TIP: If a process is eating a lot of CPU, that is a clue to look at its logs
  • Memory:
     • swap_free/swap_total
     • cached/free/shared
     • TIP: Compare with logs, other metrics to help determine causes of
         OOMs


                                      HBaseCon 2012. 5/22/12                                17
                           Copyright 2012 Cloudera Inc. All rights reserved
Host-level metrics
  • Network:
     • Bytes/packets sent/received
     • TIP: Correlate with ZK timeouts from logs if RSes are going down
     • TIP: Also look at Ethernet frame errors from ifconfig if you are
       experiencing unexpected dips in network traffic
  • Disk I/O:
     • Read/write latencies
         • Slow disks make HBase performance very bad, especially on .META.
         • TIP: Check dmesg for SCSI errors
     • disk space available
     • swap usage
         • TIP: Should be 0 – otherwise you will have timeouts and RSes going
            down. Add RAM and/or buy more boxes.
                                  HBaseCon 2012. 5/22/12                        18
                       Copyright 2012 Cloudera Inc. All rights reserved
Takeaways

• Know what you are looking for
   • Know what metrics are available, what they mean
   • Collect a lot of different metrics, but don’t drown in them
• Metrics are not all you need
   • Correlate metrics with logs, workload
   • Metrics are another tool in your toolbox. You still have to do
     the work to monitor/debug/tune.




                                HBaseCon 2012. 5/22/12                  19
                     Copyright 2012 Cloudera Inc. All rights reserved
Takeaways
                                        • Take a baseline of your
                                          system in steady-state
                                                • For later comparison if things
                                                  go bad
                                                • Try to make this as apples-to-
                                                  apples as possible, e.g. same
                                                  hardware/workload/config
                                        • Spikes or dips from
                                          baseline can indicate
                                          problems
                                                • But also depends on which
                                                  metric and if you can explain
                                                  it (e.g. increased workload)

                       HBaseCon 2012. 5/22/12                                     20
            Copyright 2012 Cloudera Inc. All rights reserved
Thank you
 (Especially to Jon Hsieh and Todd Lipcon
for their helpful reviews and suggestions)




         Cloudera is hiring
     E-mail me: dsw@cloudera.com


                   HBaseCon 2012. 5/22/12                  21
        Copyright 2012 Cloudera Inc. All rights reserved

More Related Content

What's hot

Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
DataWorks Summit
 
Manage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariManage Add-On Services with Apache Ambari
Manage Add-On Services with Apache Ambari
DataWorks Summit
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
Jurriaan Persyn
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase
强 王
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in Java
Ruben Badaró
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introduction
chrislusf
 
Handle Large Messages In Apache Kafka
Handle Large Messages In Apache KafkaHandle Large Messages In Apache Kafka
Handle Large Messages In Apache Kafka
Jiangjie Qin
 
kafka
kafkakafka
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance Tuning
Lars Hofhansl
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
Ververica
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practice
larsgeorge
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
DataWorks Summit
 
CockroachDB
CockroachDBCockroachDB
CockroachDB
andrei moga
 
5 things you didn't know nginx could do
5 things you didn't know nginx could do5 things you didn't know nginx could do
5 things you didn't know nginx could do
sarahnovotny
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark Applications
Spark Summit
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
Brian Brazil
 
Prometheus design and philosophy
Prometheus design and philosophy   Prometheus design and philosophy
Prometheus design and philosophy
Docker, Inc.
 
Singer, Pinterest's Logging Infrastructure
Singer, Pinterest's Logging InfrastructureSinger, Pinterest's Logging Infrastructure
Singer, Pinterest's Logging Infrastructure
Discover Pinterest
 
Best Practices for Streaming IoT Data with MQTT and Apache Kafka®
Best Practices for Streaming IoT Data with MQTT and Apache Kafka®Best Practices for Streaming IoT Data with MQTT and Apache Kafka®
Best Practices for Streaming IoT Data with MQTT and Apache Kafka®
confluent
 

What's hot (20)

Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
 
Manage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariManage Add-On Services with Apache Ambari
Manage Add-On Services with Apache Ambari
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in Java
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introduction
 
Handle Large Messages In Apache Kafka
Handle Large Messages In Apache KafkaHandle Large Messages In Apache Kafka
Handle Large Messages In Apache Kafka
 
kafka
kafkakafka
kafka
 
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance Tuning
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practice
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
CockroachDB
CockroachDBCockroachDB
CockroachDB
 
5 things you didn't know nginx could do
5 things you didn't know nginx could do5 things you didn't know nginx could do
5 things you didn't know nginx could do
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark Applications
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
 
Prometheus design and philosophy
Prometheus design and philosophy   Prometheus design and philosophy
Prometheus design and philosophy
 
Singer, Pinterest's Logging Infrastructure
Singer, Pinterest's Logging InfrastructureSinger, Pinterest's Logging Infrastructure
Singer, Pinterest's Logging Infrastructure
 
Best Practices for Streaming IoT Data with MQTT and Apache Kafka®
Best Practices for Streaming IoT Data with MQTT and Apache Kafka®Best Practices for Streaming IoT Data with MQTT and Apache Kafka®
Best Practices for Streaming IoT Data with MQTT and Apache Kafka®
 

Viewers also liked

HBaseCon 2015: Running ML Infrastructure on HBase
HBaseCon 2015: Running ML Infrastructure on HBaseHBaseCon 2015: Running ML Infrastructure on HBase
HBaseCon 2015: Running ML Infrastructure on HBase
HBaseCon
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
enissoz
 
HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...
HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...
HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...
Cloudera, Inc.
 
Tales from the Cloudera Field
Tales from the Cloudera FieldTales from the Cloudera Field
Tales from the Cloudera Field
HBaseCon
 
HBaseCon 2015: DeathStar - Easy, Dynamic, Multi-tenant HBase via YARN
HBaseCon 2015: DeathStar - Easy, Dynamic,  Multi-tenant HBase via YARNHBaseCon 2015: DeathStar - Easy, Dynamic,  Multi-tenant HBase via YARN
HBaseCon 2015: DeathStar - Easy, Dynamic, Multi-tenant HBase via YARN
HBaseCon
 
HBaseCon 2013: Being Smarter Than the Smart Meter
HBaseCon 2013: Being Smarter Than the Smart MeterHBaseCon 2013: Being Smarter Than the Smart Meter
HBaseCon 2013: Being Smarter Than the Smart Meter
Cloudera, Inc.
 
HBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region ReplicasHBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region Replicas
HBaseCon
 
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
Cloudera, Inc.
 
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBaseHBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon
 
HBaseCon 2013: 1500 JIRAs in 20 Minutes
HBaseCon 2013: 1500 JIRAs in 20 MinutesHBaseCon 2013: 1500 JIRAs in 20 Minutes
HBaseCon 2013: 1500 JIRAs in 20 Minutes
Cloudera, Inc.
 
Cross-Site BigTable using HBase
Cross-Site BigTable using HBaseCross-Site BigTable using HBase
Cross-Site BigTable using HBase
HBaseCon
 
HBaseCon 2013: Rebuilding for Scale on Apache HBase
HBaseCon 2013: Rebuilding for Scale on Apache HBaseHBaseCon 2013: Rebuilding for Scale on Apache HBase
HBaseCon 2013: Rebuilding for Scale on Apache HBase
Cloudera, Inc.
 
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
Cloudera, Inc.
 
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
Cloudera, Inc.
 
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
Cloudera, Inc.
 
HBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBaseHBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBase
Cloudera, Inc.
 
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
Cloudera, Inc.
 
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLCHBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
Cloudera, Inc.
 
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
Cloudera, Inc.
 
HBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three ActsHBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three Acts
Cloudera, Inc.
 

Viewers also liked (20)

HBaseCon 2015: Running ML Infrastructure on HBase
HBaseCon 2015: Running ML Infrastructure on HBaseHBaseCon 2015: Running ML Infrastructure on HBase
HBaseCon 2015: Running ML Infrastructure on HBase
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...
HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...
HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...
 
Tales from the Cloudera Field
Tales from the Cloudera FieldTales from the Cloudera Field
Tales from the Cloudera Field
 
HBaseCon 2015: DeathStar - Easy, Dynamic, Multi-tenant HBase via YARN
HBaseCon 2015: DeathStar - Easy, Dynamic,  Multi-tenant HBase via YARNHBaseCon 2015: DeathStar - Easy, Dynamic,  Multi-tenant HBase via YARN
HBaseCon 2015: DeathStar - Easy, Dynamic, Multi-tenant HBase via YARN
 
HBaseCon 2013: Being Smarter Than the Smart Meter
HBaseCon 2013: Being Smarter Than the Smart MeterHBaseCon 2013: Being Smarter Than the Smart Meter
HBaseCon 2013: Being Smarter Than the Smart Meter
 
HBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region ReplicasHBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region Replicas
 
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
 
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBaseHBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
 
HBaseCon 2013: 1500 JIRAs in 20 Minutes
HBaseCon 2013: 1500 JIRAs in 20 MinutesHBaseCon 2013: 1500 JIRAs in 20 Minutes
HBaseCon 2013: 1500 JIRAs in 20 Minutes
 
Cross-Site BigTable using HBase
Cross-Site BigTable using HBaseCross-Site BigTable using HBase
Cross-Site BigTable using HBase
 
HBaseCon 2013: Rebuilding for Scale on Apache HBase
HBaseCon 2013: Rebuilding for Scale on Apache HBaseHBaseCon 2013: Rebuilding for Scale on Apache HBase
HBaseCon 2013: Rebuilding for Scale on Apache HBase
 
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
 
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
 
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
 
HBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBaseHBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBase
 
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
 
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLCHBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
 
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
 
HBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three ActsHBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three Acts
 

Similar to HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera

Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
HBase tales from the trenches
HBase tales from the trenchesHBase tales from the trenches
HBase tales from the trenches
wchevreuil
 
Apache hbase for the enterprise (Strata+Hadoop World 2012)
Apache hbase for the enterprise (Strata+Hadoop World 2012)Apache hbase for the enterprise (Strata+Hadoop World 2012)
Apache hbase for the enterprise (Strata+Hadoop World 2012)
jmhsieh
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Strata + Hadoop World 2012: Apache HBase Features for the Enterprise
Strata + Hadoop World 2012: Apache HBase Features for the EnterpriseStrata + Hadoop World 2012: Apache HBase Features for the Enterprise
Strata + Hadoop World 2012: Apache HBase Features for the Enterprise
Cloudera, Inc.
 
HBase with MapR
HBase with MapRHBase with MapR
HBase with MapR
Tomer Shiran
 
Hadoop Operations
Hadoop OperationsHadoop Operations
Hadoop Operations
Cloudera, Inc.
 
TriHUG - Beyond Batch
TriHUG - Beyond BatchTriHUG - Beyond Batch
TriHUG - Beyond Batch
boorad
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
Cloudera, Inc.
 
HBase Low Latency, StrataNYC 2014
HBase Low Latency, StrataNYC 2014HBase Low Latency, StrataNYC 2014
HBase Low Latency, StrataNYC 2014
Nick Dimiduk
 
Strata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptxStrata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptx
Manish Maheshwari
 
Strata London 2019 Scaling Impala
Strata London 2019 Scaling ImpalaStrata London 2019 Scaling Impala
Strata London 2019 Scaling Impala
Manish Maheshwari
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7
Ted Dunning
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7
MapR Technologies
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Esther Kundin
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Esther Kundin
 
HDFS- What is New and Future
HDFS- What is New and FutureHDFS- What is New and Future
HDFS- What is New and Future
DataWorks Summit
 
HBase: Where Online Meets Low Latency
HBase: Where Online Meets Low LatencyHBase: Where Online Meets Low Latency
HBase: Where Online Meets Low Latency
HBaseCon
 
Flume and HBase
Flume and HBase Flume and HBase
Flume and HBase
Alexander Alten
 
Facing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoopFacing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoop
fann wu
 

Similar to HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera (20)

Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
HBase tales from the trenches
HBase tales from the trenchesHBase tales from the trenches
HBase tales from the trenches
 
Apache hbase for the enterprise (Strata+Hadoop World 2012)
Apache hbase for the enterprise (Strata+Hadoop World 2012)Apache hbase for the enterprise (Strata+Hadoop World 2012)
Apache hbase for the enterprise (Strata+Hadoop World 2012)
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Strata + Hadoop World 2012: Apache HBase Features for the Enterprise
Strata + Hadoop World 2012: Apache HBase Features for the EnterpriseStrata + Hadoop World 2012: Apache HBase Features for the Enterprise
Strata + Hadoop World 2012: Apache HBase Features for the Enterprise
 
HBase with MapR
HBase with MapRHBase with MapR
HBase with MapR
 
Hadoop Operations
Hadoop OperationsHadoop Operations
Hadoop Operations
 
TriHUG - Beyond Batch
TriHUG - Beyond BatchTriHUG - Beyond Batch
TriHUG - Beyond Batch
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
HBase Low Latency, StrataNYC 2014
HBase Low Latency, StrataNYC 2014HBase Low Latency, StrataNYC 2014
HBase Low Latency, StrataNYC 2014
 
Strata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptxStrata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptx
 
Strata London 2019 Scaling Impala
Strata London 2019 Scaling ImpalaStrata London 2019 Scaling Impala
Strata London 2019 Scaling Impala
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
 
HDFS- What is New and Future
HDFS- What is New and FutureHDFS- What is New and Future
HDFS- What is New and Future
 
HBase: Where Online Meets Low Latency
HBase: Where Online Meets Low LatencyHBase: Where Online Meets Low Latency
HBase: Where Online Meets Low Latency
 
Flume and HBase
Flume and HBase Flume and HBase
Flume and HBase
 
Facing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoopFacing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoop
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
Cloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
Cloudera, Inc.
 

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Recently uploaded

(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
Priyanka Aash
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
Priyanka Aash
 
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
alexjohnson7307
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
shyamraj55
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Muhammad Ali
 
Opencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of MünsterOpencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of Münster
Matthias Neugebauer
 
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
Priyanka Aash
 
July Patch Tuesday
July Patch TuesdayJuly Patch Tuesday
July Patch Tuesday
Ivanti
 
WhatsApp Spy Online Trackers and Monitoring Apps
WhatsApp Spy Online Trackers and Monitoring AppsWhatsApp Spy Online Trackers and Monitoring Apps
WhatsApp Spy Online Trackers and Monitoring Apps
HackersList
 
Patch Tuesday de julio
Patch Tuesday de julioPatch Tuesday de julio
Patch Tuesday de julio
Ivanti
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
bhumivarma35300
 
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Torry Harris
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
BrainSell Technologies
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
Steven Carlson
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
sunilverma7884
 
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes..."Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
Anant Gupta
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
SubhamMandal40
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
ssuser1915fe1
 
The importance of Quality Assurance for ICT Standardization
The importance of Quality Assurance for ICT StandardizationThe importance of Quality Assurance for ICT Standardization
The importance of Quality Assurance for ICT Standardization
Axel Rennoch
 

Recently uploaded (20)

(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
 
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
 
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
 
Opencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of MünsterOpencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of Münster
 
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
 
July Patch Tuesday
July Patch TuesdayJuly Patch Tuesday
July Patch Tuesday
 
WhatsApp Spy Online Trackers and Monitoring Apps
WhatsApp Spy Online Trackers and Monitoring AppsWhatsApp Spy Online Trackers and Monitoring Apps
WhatsApp Spy Online Trackers and Monitoring Apps
 
Patch Tuesday de julio
Patch Tuesday de julioPatch Tuesday de julio
Patch Tuesday de julio
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
 
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
 
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes..."Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
"Mastering Graphic Design: Essential Tips and Tricks for Beginners and Profes...
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
 
The importance of Quality Assurance for ICT Standardization
The importance of Quality Assurance for ICT StandardizationThe importance of Quality Assurance for ICT Standardization
The importance of Quality Assurance for ICT Standardization
 

HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera

  • 1. HBase Metrics: What they mean to you David S. Wang dsw@cloudera.com 5/22/12 HBaseCon 2012. 5/22/12 Copyright 2012 Cloudera Inc. All rights reserved
  • 2. Agenda • Motivation • Where to get metrics • Collection methods • Metrics: • HBase Operations • Memory-related • Latency-related • Host-level • Takeaways HBaseCon 2012. 5/22/12 2 Copyright 2012 Cloudera Inc. All rights reserved
  • 3. Motivation • To understand the steady-state behavior of your HBase cluster • To debug when something bad happens • To help evaluate changes • To figure out when you need to buy more hardware HBaseCon 2012. 5/22/12 3 Copyright 2012 Cloudera Inc. All rights reserved
  • 4. Where to get metrics JVM JVM Region server Master Region Region JVM HDFS Metrics can be obtained DataNode from all of these areas Host CPUs Memory Disks Network HBaseCon 2012. 5/22/12 4 Copyright 2012 Cloudera Inc. All rights reserved
  • 5. Collection methods • Dump to file • web UI /jmx • jconsole • Ganglia • Cloudera Manager • Many other tools out there – suggestions welcome HBaseCon 2012. 5/22/12 5 Copyright 2012 Cloudera Inc. All rights reserved
  • 6. HBase Operations: compactions • hbase.regionserver.compactionQueueSize – size of queue of compactions waiting to be processed • h.r.compactionSize_avg_time – average bytes processed per compaction • h.r.compactionTime_avg_time – average time per compaction in milliseconds • h.r.compactionSize_num_ops, h.r.compactionTime_num_ops – total number of compactions so far • TIP: Spikes in h.r.compactionQueueSize could mean all of your regions are growing at the same rate and need to split/compact at around the same time – time to presplit your regions or turn off auto-compactions HBaseCon 2012. 5/22/12 6 Copyright 2012 Cloudera Inc. All rights reserved
  • 7. HBase Operations: flushes • h.r.flushQueueSize – size of queue of flushes waiting to be processed • h.r.flushSize_avg_time – average bytes processed per flush • h.r.flushTime_avg_time – average time per flush in milliseconds • h.r.flushSize_num_ops, h.r.flushTime_num_ops – total number of flushes so far • TIP: Small h.r.flushSize_avg_time and large flushQueueSize mean premature flushes: may indicate that you need more RAM, or you are ingesting and flushing faster than your disks can handle HBaseCon 2012. 5/22/12 7 Copyright 2012 Cloudera Inc. All rights reserved
  • 8. HBase Operations: region splits • h.r.regionSplitFailureCount – number of unsuccessful splits • h.r.regionSplitSuccessCount – number of successful splits • TIP: Spikes in h.r.regionSplitSuccessCount may mean that you are spending too much time splitting – think about pre-splitting your regions • TIP: Sustained high rates of h.r.regionSplitSuccessCount may mean that you are ingesting fast enough to exceed your region size configuration – change the configuration and/or presplit/disable automatic splits HBaseCon 2012. 5/22/12 8 Copyright 2012 Cloudera Inc. All rights reserved
  • 9. HBase Operations: stores and storefiles • h.r.stores – total number of stores • h.r.storefiles – total number of storefiles • TIP: Large ratio between number of h.r.stores and h.r.storefiles may indicate it’s time to compact more frequently – otherwise read performance can suffer HBaseCon 2012. 5/22/12 9 Copyright 2012 Cloudera Inc. All rights reserved
  • 10. Memory: memstores • h.r.memstoreSizeMB – sum of all the memstore sizes in megabytes • h.r.numPutsWithoutWAL – number of puts with setWriteToWAL(false) • h.r.mbInMemoryWithoutWAL – amount of data from puts with setWriteToWAL(false) in megabytes • TIP: Anytime you have non-zero values in h.r.numPutsWithoutWAL and h.r.mbInMemoryWithoutWAL, you risk data loss if the RS crashes, until the next flush when they go back down to zero. Useful to detect applications that setWriteToWAL(false). HBaseCon 2012. 5/22/12 10 Copyright 2012 Cloudera Inc. All rights reserved
  • 11. Memory: block cache • h.r.blockCacheCount - number of blocks in block cache • h.r.blockCacheEvictedCount – number of blocks evicted in the last period • h.r.blockCacheFree, h.r.blockCacheSize – free/occupied space in block cache in bytes • TIP: High sustained levels of h.r.blockCacheEvictedCount mean your block cache is turning over due to heap size constraints – possibly more GCs • Get more memory • Repartition heap amongst the various processes on the host – more for RS HBaseCon 2012. 5/22/12 11 Copyright 2012 Cloudera Inc. All rights reserved
  • 12. Memory: block cache • h.r.blockCacheHitCount, h.r.blockCacheMissCount – number of cache hits/misses • h.r.blockCacheHitRatio/h.r.blockCacheHitRatioPastNPeriods – hit ratio (cache hits/total requests) for past period, past N periods • h.r.blockCacheHitCachingRatio/h.r.blockCacheHitCachingRatioPastNPeriods – hit caching ratio (cache hits from requests set to use block cache/total requests set to use block cache) for past period, past N periods • TIP: With sustained traffic, in general you want your block cache to be fully utilized for maximum performance – high h.r.blockCacheHitRatio and h.r.blockCacheHitCachingRatio HBaseCon 2012. 5/22/12 12 Copyright 2012 Cloudera Inc. All rights reserved
  • 13. Memory: JVM-specific metrics • GC • TIP: Correlate with “concurrent mode failure” or “promotion failed” in the GC logs for GC pauses • Heap usage • TIP: Do not want this to max out – add more memory or assign more heap to your RS • Number of logs categorized by level • TIP: Be careful of any WARN or ERROR logs – helpful first indicator to look more deeply in logs • Threads blocked/running HBaseCon 2012. 5/22/12 13 Copyright 2012 Cloudera Inc. All rights reserved
  • 14. Latency: FS read • h.r.fsReadLatency_avg_time – average time per sequential read in ms • h.r.fsReadLatency_num_ops – number of sequential reads • h.r.fsPreadLatency_avg_time – average time per positional read in ms • h.r.fsPreadLatency_num_ops – number of positional reads • Various histograms based on this – 75th, 95th percentile, mean, standard deviation, etc., in nanoseconds • TIP: Spikes in h.r.fsReadLatency_avg_time/h.r.fsPreadLatency_avg_time can indicate HDFS/disk/network problems HBaseCon 2012. 5/22/12 14 Copyright 2012 Cloudera Inc. All rights reserved
  • 15. Latency: FS write/sync • h.r.fsWriteLatency_avg_time – average time per HLog edit write in ms • h.r.fsWriteSize_avg_time – average size per HLog edit write in bytes • h.r.fsWriteLatency_num_ops, h.r.fsWriteSize_num_ops – number of HLog edit writes • h.r.fsSyncLatency_avg_time – average time to sync HLogs to the filesystem in ms • h.r.fsSyncLatency_num_ops – number of HLog syncs • h.r.slowHLogAppendCount – number of HLog edits that took longer than 1 second TIP: Search for a WARN-level log message with the text “appending an edit to hlog”; use that time to spelunk through logs to see what else is going on • Various histograms based on this – 75th, 95th percentile, mean, standard deviation, etc., in nanoseconds • TIP: Spikes in write or sync latencies can also be due to HDFS/bad disks/bad network HBaseCon 2012. 5/22/12 15 Copyright 2012 Cloudera Inc. All rights reserved
  • 16. Latency, region: per-operation-type, per-region • From regionserver metrics: h.r.*RequestLatency - latency histograms for various client operations in nanoseconds • Also general RPC metrics for all requests from clients • rpc.metrics.RPCQueueTime is time it takes from receipt of the RPC until it starts being processed • rpc.metrics.RPCProcessingTime is time it takes from start to end of processing • TIP: These are some of the first places to look if your client seems to be running slow • h.r.readRequestsCount – requests for gets, scanners. Not # of rows. • h.r.writeRequestsCount – requests for puts, deletes, etc. Not # of rows. • hbase.RegionServerDynamicStatistics.* contains some metrics on a per- region and per-column-family basis • Subset of block cache, compaction, flush, store/storefile sizes, etc. • TIP: These metrics can be used to answer the questions: • Is the problem only in one region or in all regions? • Is the problem only with one operation type or all operations? HBaseCon 2012. 5/22/12 16 Copyright 2012 Cloudera Inc. All rights reserved
  • 17. Host-level metrics • CPU: • Load averages • % idle/system/user/wio (waiting for block I/O) • Running/total processes • Correlate with “top”, “sar”, “mpstat” to figure out which process is doing what. • TIP: If a process is eating a lot of CPU, that is a clue to look at its logs • Memory: • swap_free/swap_total • cached/free/shared • TIP: Compare with logs, other metrics to help determine causes of OOMs HBaseCon 2012. 5/22/12 17 Copyright 2012 Cloudera Inc. All rights reserved
  • 18. Host-level metrics • Network: • Bytes/packets sent/received • TIP: Correlate with ZK timeouts from logs if RSes are going down • TIP: Also look at Ethernet frame errors from ifconfig if you are experiencing unexpected dips in network traffic • Disk I/O: • Read/write latencies • Slow disks make HBase performance very bad, especially on .META. • TIP: Check dmesg for SCSI errors • disk space available • swap usage • TIP: Should be 0 – otherwise you will have timeouts and RSes going down. Add RAM and/or buy more boxes. HBaseCon 2012. 5/22/12 18 Copyright 2012 Cloudera Inc. All rights reserved
  • 19. Takeaways • Know what you are looking for • Know what metrics are available, what they mean • Collect a lot of different metrics, but don’t drown in them • Metrics are not all you need • Correlate metrics with logs, workload • Metrics are another tool in your toolbox. You still have to do the work to monitor/debug/tune. HBaseCon 2012. 5/22/12 19 Copyright 2012 Cloudera Inc. All rights reserved
  • 20. Takeaways • Take a baseline of your system in steady-state • For later comparison if things go bad • Try to make this as apples-to- apples as possible, e.g. same hardware/workload/config • Spikes or dips from baseline can indicate problems • But also depends on which metric and if you can explain it (e.g. increased workload) HBaseCon 2012. 5/22/12 20 Copyright 2012 Cloudera Inc. All rights reserved
  • 21. Thank you (Especially to Jon Hsieh and Todd Lipcon for their helpful reviews and suggestions) Cloudera is hiring E-mail me: dsw@cloudera.com HBaseCon 2012. 5/22/12 21 Copyright 2012 Cloudera Inc. All rights reserved

Editor's Notes

  1. To understand the steady-state behavior of your HBase cluster:Establish a baselineTo debug when something bad happens:Compare with baselineTo help evaluate changes:ConfigurationWorkloadHardwareHelps point out bottlenecks
  2. Most metrics reside in the region serverMetrics for various categories (e.g. Stores, compactions, flushes)Metrics per operation type (e.g. get, put, delete)Remember HDFS,JVM, and host metrics as well
  3. Use FileContext in hadoop-metrics.properties to have metrics dumped to a file periodicallyweb UIs’ /jmx web pages dump metrics in JSONjconsole for interactive browsingGanglia: has built-in support in Hadoop/HBase, use GangliaContext or GangliaContext31
  4. avg_time metrics name is confusing when you are measuring something that isn’t time. Multiple num_ops metrics that mean the same thing are also confusing. Both are legacy of how code is writtenor use time-based keys, salted keys, reduce data (less columns/smaller column names, compression)
  5. again can change data schema (salted keys, row key composite of time and sha1/md5)
  6. Unsuccessful splits are rolled back. If the rollback fails, the RS aborts.
  7. Stores represent a CF, and can contain one or more StoreFiles. Reads have to go through all of the StoreFiles to get an overall view of the dataset for reads.Bloom filters and time range predicates may preclude storefiles from being culled.
  8. Remember that there is one memstore per store/CF
  9. http://hbase.apache.org/book/regionserver.arch.html contains a thorough explanation of how the block cache is used and what you can do to utilize it better.Turn off block caching for tables that have full table scans
  10. http://hbase.apache.org/book/trouble.log.html#trouble.log.gc explains how to configure GC logging and what they meanhttp://hbase.apache.org/book/jvm.html for what to do about GCs
  11. Network problems can come into play here if HDFS needs to fetch data for the read from somewhere else.
  12. HFile latencies are handled by the compaction and flush metricsHLog edits affect client-facing system latency more visiblyHDFS will have a pipeline per write, so any hiccups along that pipeline in either network or disk will affect latency.
  13. Hotspotting can be solved by methods such as hashing your rowkeys and pre-splitting regionshdfsBlocksLocalityIndex (HBASE-4114) may also be useful as higher indices normally mean lower latencies
  14. Easier to screen out metrics if you know what each of them mean
  15. Keep a library of baselines, changing one thing at a time