HBaseCon 2015: OpenTSDB and AsyncHBase Update

5,173 views

Published on

OpenTSDB continues to scale along with HBase. A number of updates have been implemented to push writes over 2 million data points a second. Here we will discuss about HBase schema improvements, including salting, random UI assignment, and using append operations instead of puts. You'll also get AsyncHBase development updates about rate limiting, statistics, and security.

Published in: Software

HBaseCon 2015: OpenTSDB and AsyncHBase Update

  1. 1. OpenTSDB 2.x Distributed, Scalable Time Series Database Benoit Sigoure tsunanet@gmail.com Chris Larsen clarsen@yahoo-inc.com
  2. 2. Who We Are Benoit Sigoure ● Created OpenTSDB at StumbleUpon ● Software Engineer @ Arista Networks Chris Larsen ● Release manager for OpenTSDB 2.x ● Software Engineer @ Yahoo
  3. 3. What Is OpenTSDB? ● Open Source Time Series Database ● Store trillions of data points ● Sucks up all data and keeps going ● Never lose precision ● Scales using HBase
  4. 4. What good is it? ● Systems Monitoring & Measurement o Servers o Networks ● Sensor Data o The Internet of Things o SCADA ● Financial Data ● Scientific Experiment Results
  5. 5. Use Cases ● > 100 Region Servers ~ 30TB ● 60 TSDs ● 600,000 writes per second ● Primary data store for operational data ● Telemetry from StatsD, Ostrich and derived data from Storm
  6. 6. Use Cases ● Monitoring application performance and statistics ● 50 region servers, 2.4M writes/s ~ 200TB ● Multi-tenant and Kerberos secure HBase ● ~200k writes per second per TSD ● Central monitoring for all Yahoo properties ● Over a billion time series served
  7. 7. Some Other Users ● Box: 23 servers, 90K wps, System, app network, business metrics ● Limelight Networks: 8 servers, 30k wps, 24TB of data ● Ticketmaster: 13 servers, 90K wps, ~40GB a day
  8. 8. What Are Time Series? ● Time Series: data points for an identity over time ● Typical Identity: o Dotted string: web01.sys.cpu.user.0 ● OpenTSDB Identity: o Metric: sys.cpu.user o Tags (name/value pairs): host=web01 cpu=0
  9. 9. What Are Time Series? Data Point: ● Metric + Tags ● + Value: 42 ● + Timestamp: 1234567890 sys.cpu.user 1234567890 42 host=web01 cpu=0 ^ a data point ^
  10. 10. How it Works
  11. 11. Writing Data 1) Open Telnet style socket, write: put sys.cpu.user 1234567890 42 host=web01 cpu=0 2) ..or, post JSON to: http://<host>:<port>/api/put 3) .. or import big files with CLI ● No schema definition ● No RRD file creation ● Just write!
  12. 12. Querying Data ● Graph with the GUI ● CLI tools ● HTTP API ● Aggregate multiple series ● Simple query language To average all CPUs on host: start=1h-ago avg sys.cpu.user host=web01
  13. 13. HBase Data Tables ● tsdb - Data point table. Massive ● tsdb-uid - Name to UID and UID to name mappings ● tsdb-meta - Time series index and meta-data ● tsdb-tree - Config and index for hierarchical naming schema
  14. 14. Data Table Schema ● Row key is a concatenation of UIDs and time: o metric + timestamp + tagk1 + tagv1… + tagkN + tagvN ● sys.cpu.user 1234567890 42 host=web01 cpu=0 x00x00x01x49x95xFBx70x00x00x01x00x00x01x00x00x02x00x00x02 ● Timestamp normalized on 1 hour boundaries ● All data points for an hour are stored in one row ● Enables fast scans of all time series for a metric ● …or pass a row key filter for specific time series with particular tags
  15. 15. Salting 1 Metric + Tag Cardinality > 1 Million + 1 Write per Series, per Second = 1 Sad Region
  16. 16. Salting ● Hash on the metric and tags ● Modulo into one of 20 buckets ● Prepend bucket ID to key ● Writes now hit 20 regions/servers sys.cpu.user 1234567890 host=web01 x00x00x00x01x49x95xFBx70x00x00x01x00x00x01 sys.cpu.user 1234567890 host=web02 x01x00x00x01x49x95xFBx70x00x00x01x00x00x02
  17. 17. Salting ● 1M writes per second to 50K per second = But wait, there’s more! ● Query with 20 asynchronous scanners... 20 Happy Regions
  18. 18. Salting
  19. 19. Appends ● Row holds one hour of data o 60 columns @ 1 minute resolution o 3,600 columns @ 1 second resolution o 3,600,000 columns @ millisecond resolution ● 3.6M columns + row key + timestamp = network overhead
  20. 20. Appends ● Qualifiers encode ○ offset from row base time ○ data type (float or integer) ○ value length (1 to 8 bytes) ● Timestamp = 4 bytes ● Row key >= 13 bytes to 55 bytes ● 1 serialized column >= 20 to 69 bytes [ 0b00001111, 0b11000111 ] <---------------->^<-> delta (seconds) type value length
  21. 21. Appends Compact it into one column! ● Concatenate qualifiers ● Concatenate values ● 1 row key, 1 timestamp ● 1 row from 69K to 14K, 83% savings Qualifier: [ 0b00000000, 0b00000000, ... 0b00000000, 0b00010000 ] Value: [ 0b00000001, ... 0b00000002 ] Key: [ x00x00x00x01x49x95xFBx70x00x00x01x00x00x01 ] Timestamp: [ x49x95xFBx70 ]
  22. 22. Appends After each hour TSDs iterate over rows written: ● Read row from HBase (via a Get) ● Compact in memory ● Write new column to HBase (via a Put) ● Delete all old columns (via a MultiAction) Drawbacks: ● Network traversal ● HBase RPC queue ● Cache busting
  23. 23. Appends Try HBase Appends ● Qualifier special prefix ● Concatenate offsets and values in value array ● Still one key, one timestamp, one column Qualifier: [ 0b00000004 ] Value: [ 0b00000000, 0b00000000, 0b00000001, <-Offset/type/length-> <-value -> ...0b00000000, 0b00010000, 0b00000002 ]
  24. 24. Appends Reduces RPC count and network traffic but increases region server CPU usage <-------- Appends -------><-------------- Puts ------------->
  25. 25. Storage Exception Plugin What to do during... ● Region Splits ● Region Moves ● Server Crashes ● AsyncHBase queues retries NSREs ● Requeue into message buss: Kafka ● Spool on disk: RocksDB
  26. 26. Downsampling ● Previous timestamps based on first value ● Now snap to proper modulo buckets ● Fill missing values with NaN, Null or zeros during emission
  27. 27. New for OpenTSDB 2.1 ● Improved compaction code path ● Last value query ● Meta table based lookup filtering ● Read/write TSD modes ● Preload UIDs ● fsck utility update ● CORS support
  28. 28. New for OpenTSDB 2.2 ● Salting ● Appends ● Random UID Assignment ● NaN, Null or Zero fill policy during downsampling ● Fully async query path ● Query tracking and statistics ● Additional thread, JVM and AsyncHBase stats
  29. 29. OpenTSDB Community Bosun - Monitoring system based on OpenTSDB from the folks at Stack Exchange
  30. 30. OpenTSDB Community Graphana - Kibana and elasticsearch based front-end for OpenTSDB, Graphite and InfluxDB
  31. 31. AsyncHBase 1.7 ● AsyncHBase is a fully asynchronous, multi- threaded HBase client ● Supports HBase 0.90 to 1.0 ● Remains 2x faster than HTable in PerformanceEvaluation ● Support for scanner filters, META prefetch, “fail-fast” RPCs
  32. 32. New for 1.7 ● Secure RPC support ● Per region client statistics ● RPC timeouts ● Atomic AppendRequest ● Unit tests!
  33. 33. GoHBase
  34. 34. GoHBase ● Prototype for a new pure-Go HBase client ● Apache 2.0 License ● https://github.com/tsuna/gohbase ● Still early on but feel free to help
  35. 35. The Future of OpenTSDB
  36. 36. The Future ● New query language with support for o Complex filters o Cross metric expressions o Data manipulation / Analysis ● Distributed queries ● Quality/Confidence measures
  37. 37. More Information Thank you to everyone who has helped test, debug and add to OpenTSDB 2.1 and 2.2 including, but not limited to: John Tamblin, Jesse Chang, Rajesh Gopalakrishna Pillai, Siddartha Guthikonda, Sean Miller, Aveek Misra, Ashwin Ramachandrain, Francis Liu, Slawek Ligus, Gabriel Avellaneda, Nick Whitehead ● Contribute at github.com/OpenTSDB/opentsdb ● Website: opentsdb.net ● Documentation: opentsdb.net/docs/build/html ● Mailing List: groups.google.com/group/opentsdb Images ● http://photos.jdhancock.com/photo/2013-06-04-212438-the-lonely-vacuum-of-space.html ● http://en.wikipedia.org/wiki/File:Semi-automated-external-monitor-defibrillator.jpg ● http://upload.wikimedia.org/wikipedia/commons/1/17/Dining_table_for_two.jpg ● http://upload.wikimedia.org/wikipedia/commons/9/92/Easy_button.JPG ● http://pixabay.com/en/compression-archiver-compress-149782/ ● https://openclipart.org/detail/201862/kerberos-icon ● http://lego.cuusoo.com/ideas/view/96

×