Update on OpenTSDB and AsyncHBase

HBaseCon
HBaseCon Update
Distributed, Scalable Time Series Database
Chris Larsen clarsen@yahoo-inc.com
Who Am I? (no really, who am I?)
Chris Larsen
Maintainer for OpenTSDB
Software Engineer @ Yahoo!
Monitoring Team
What Is OpenTSDB?
Open Source Time Series Database
Store trillions of data points
Sucks up all data and keeps going
Never lose precision
Scales using HBase, Cassandra
Or Bigtable
What good is it?
Systems Monitoring & Measurement
Servers
Networks
Sensor Data
The Internet of Things
SCADA
Financial Data
Scientific Experiment Results
Use Cases
Backing store for Argus:
Open source monitoring
and alerting system
15 HBase Servers
6 month retention
10M writes per minute
95th percentile latency for queries spanning under 30 days: 200ms
Moving to a 200 node cluster writing at 100M writes per minute
Use Cases
● Monitoring system, network and application performance and statistics
110 region servers, 10M writes/s ~ 2PB
Multi-tenant and Kerberos secure HBase
~200k writes per second per TSD
Central monitoring for all Yahoo properties
Over 2 billion time series served
Some Other Users
What Are Time Series?
Time Series: data points for an identity
over time
Typical Identity:
Dotted string: web01.sys.cpu.user.0
OpenTSDB Identity:
Metric: sys.cpu.user
Tags (name/value pairs):
host=web01 cpu=0
What Are Time Series?
Data Point:
Metric + Tags
+ Value: 42
+ Timestamp: 1234567890
sys.cpu.user 1234567890 42 host=web01 cpu=0
^ a data point ^
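The data point line above can be split back into its parts. A minimal Python sketch, assuming whitespace-separated fields in the order shown on the slide; the names DataPoint and parse_data_point are made up for illustration and are not part of OpenTSDB:

```python
from dataclasses import dataclass


@dataclass
class DataPoint:
    """One value of one time series at one point in time."""
    metric: str
    timestamp: int
    value: float
    tags: dict


def parse_data_point(line):
    """Parse 'metric timestamp value tagk=tagv ...' into a DataPoint."""
    metric, timestamp, value, *tag_parts = line.split()
    # Each remaining field is a name=value tag pair.
    tags = dict(part.split("=", 1) for part in tag_parts)
    return DataPoint(metric, int(timestamp), float(value), tags)


point = parse_data_point("sys.cpu.user 1234567890 42 host=web01 cpu=0")
print(point.metric, point.tags)
```

The metric plus the tag set is the identity; the timestamp and value are the measurement.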
How it Works
Writing Data
1) Open Telnet style socket, write:
put sys.cpu.user 1234567890 42 host=web01 cpu=0
2) ..or, post JSON to:
http://<host>:<port>/api/put
3) .. or import big files with CLI
No schema definition
No RRD file creation
Just write!
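A rough Python sketch of the first two write paths, building the payloads only (no socket or HTTP call is made). The JSON field names follow the /api/put endpoint; the helper names telnet_put_line and http_put_body are hypothetical:

```python
import json


def telnet_put_line(metric, timestamp, value, tags):
    """Build one line for the Telnet-style 'put' command."""
    tag_text = " ".join(f"{name}={val}" for name, val in tags.items())
    return f"put {metric} {timestamp} {value} {tag_text}\n"


def http_put_body(metric, timestamp, value, tags):
    """Build the JSON body for a POST to /api/put."""
    return json.dumps({
        "metric": metric,
        "timestamp": timestamp,
        "value": value,
        "tags": tags,
    })


tags = {"host": "web01", "cpu": "0"}
print(telnet_put_line("sys.cpu.user", 1234567890, 42, tags), end="")
print(http_put_body("sys.cpu.user", 1234567890, 42, tags))
```

Either payload carries the same four pieces: metric, timestamp, value and tags.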
Querying Data
Graph with the GUI
CLI tools
HTTP API
Aggregate multiple series
Simple query language
To average all CPUs on host:
start=1h-ago
avg sys.cpu.user
host=web01
HBase Data Tables
tsdb - Data point table. Massive
tsdb-uid - Name to UID and UID to
name mappings
tsdb-meta - Time series index and
meta-data
tsdb-tree - Config and index for
hierarchical naming schema
Data Table Schema
Row key is a concatenation of UIDs and time:
metric + timestamp + tagk1 + tagv1… + tagkN + tagvN
sys.cpu.user 1234567890 42 host=web01 cpu=0
\x00\x00\x01\x49\x95\xFB\x70\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02
Timestamp normalized on 1 hour boundaries
All data points for an hour are stored in one row
Enables fast scans of all time series for a metric
…or pass a row key filter for specific time series with
particular tags
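The row key layout can be reproduced in a few lines of Python. This is a sketch under assumptions: 3-byte UIDs and a 4-byte timestamp (as in the example key above), and the UID assignments sys.cpu.user=1, host=1, web01=1, cpu=2, 0=2, which are inferred from that example rather than stated on the slide:

```python
import struct

UID_WIDTH = 3        # bytes per UID, as in the example key
ROW_SPAN = 3600      # seconds of data stored per row (1 hour)


def uid(number):
    """Encode a UID as a fixed-width big-endian byte string."""
    return number.to_bytes(UID_WIDTH, "big")


def row_key(metric_uid, timestamp, tag_uids):
    """Concatenate metric UID, hour-aligned base time and tag UID pairs."""
    base_time = timestamp - (timestamp % ROW_SPAN)
    key = uid(metric_uid) + struct.pack(">I", base_time)
    for tagk, tagv in sorted(tag_uids):
        key += uid(tagk) + uid(tagv)
    return key


key = row_key(1, 1234567890, [(1, 1), (2, 2)])
print(key.hex())  # 0000014995fb70000001000001000002000002
```

Because the metric UID and base time lead the key, all series of one metric for one hour sit next to each other, which is what makes the per-metric scan fast.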
New for OpenTSDB 2.2
● Append writes (no more need for TSD
Compactions)
● Row salting and random metric IDs
● Downsampling Fill Policies
● Query filters (wildcard, regex, group by or not)
● Storage Exception plugin for retrying writes
● Released February 2016
New for OpenTSDB 2.3
● Graphite style expressions
● Cross-metric expressions
● Calendar based downsampling
● New data stores
● UID assignment plugin interface
● Datapoint write filter plugin interface
● RC1 released May 2016
● New Committer, Jonathan Creasy
Fuzzy Row Filter
How do you find a single time
series out of 1 million?
For a day?
For a month?
Fuzzy Row Filter
Instead of running a regex
string comparator over each
byte array formatted key…
(?s)^.{9}(?:.{8})*\Q\x00\x00\x00\x02\E(?:\Q\x00\x0F‡\x42\x2B\E)(?:.{8})*$
TSDB query takes 1.6 seconds
for 89,726 rows
KEY
Match -> m t1 tagk1 tagv1
No Match -> m t1 tagk1 tagv2
No Match -> m t1 tagk1 tagv1 tagk2 tagv3
No Match -> m t1 tagk1 tagv2 tagk2 tagv4
No Match -> m t1 tagk3 tagv5
No Match -> m t1 tagk3 tagv6
Match -> m t2 tagk tagv1
No Match -> m t2 tagk tagv2
Fuzzy Row Filter
Use a byte mask!
● Use the mask to skip-scan
to the next candidate row.
● Combine with regex (after fuzzy
filter) to filter further.
FuzzyFilter{[FuzzyFilterPair{row_key=[18, 68,
-3, -82, 120, 87, 56, -15, 96, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0],
mask=[0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0,
1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]}]}
Now it takes 0.239 seconds
KEY
Match -> m t1 tagk1 tagv1
No Match -> m t1 tagk1 tagv2
Skip -> m t1 tagk1 tagv1 tagk2 tagv3
m t1 tagk1 tagv2 tagk2 tagv4
m t1 tagk3 tagv5
m t1 tagk3 tagv6
Match -> m t2 tagk tagv1
No Match -> m t2 tagk tagv2
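The matching rule behind the byte mask can be sketched in Python. This is a simplified illustration, not HBase's implementation: it only tests one key against the pattern and leaves out the skip-ahead seeking, and it requires equal lengths as a simplification. A mask byte of 0 means the position must equal the pattern, 1 means any byte is accepted. The pattern and mask are copied from the filter printed above; the candidate keys are made up:

```python
def from_java_bytes(values):
    """Convert Java signed byte values (-128..127) to Python bytes."""
    return bytes(v & 0xFF for v in values)


def fuzzy_match(row_key, pattern, mask):
    """True if row_key equals pattern at every position where mask is 0."""
    if not (len(row_key) == len(pattern) == len(mask)):
        return False
    return all(m == 1 or k == p for k, p, m in zip(row_key, pattern, mask))


PATTERN = from_java_bytes([18, 68, -3, -82, 120, 87, 56, -15, 96,
                           0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0])
MASK = bytes([0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0,
              1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1])

# Same fixed bytes, different bytes in the masked positions: matches.
candidate = bytearray(PATTERN)
candidate[5:9] = b"\x57\x38\xff\x70"
candidate[13:17] = b"\x00\x00\x00\x07"
print(fuzzy_match(bytes(candidate), PATTERN, MASK))  # True

# A different byte in a fixed position: no match.
other = bytearray(PATTERN)
other[12] = 9
print(fuzzy_match(bytes(other), PATTERN, MASK))  # False
```

The comparison is a plain per-byte check, which is far cheaper than running a regex over every key.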
Fuzzy Row Filter
Pros:
● Can improve scan latency by orders of magnitude
● Combines nicely with other filters
Cons:
● All row keys for the match have to be the same, fixed
length
● Doesn’t help much when matching the majority of a set
OR if a set has uniform key lengths
● Doesn’t support bitmasks, only byte masks
AsyncHBase
AsyncHBase is a fully asynchronous, multi-
threaded HBase client
Supports HBase 0.90 to 1.x
Faster and less resource intensive than the
native HBase client
Support for scanner filters, META prefetch,
“fail-fast” RPCs
AsyncHBase in YCSB
● New Yahoo! Cloud Serving Benchmark (YCSB)
module for testing AsyncHBase
● Test Params:
○ 1 YCSB worker thread with workload A for run and load
○ Ran consecutive Async -> HBase -> Async -> HBase… (new YCSB JVM
each run) 50 times
○ HBase 1.0.0 stock Apache with default configs
○ Local host, Macbook Pro
○ 10K rows written/read
○ Async writes for both
AsyncHBase in YCSB
HBase client threads: 238
AsyncHBase client threads: 22
Upcoming in 1.8
● Reverse Scanning
● Multi-Get requests
● Netty 4
● Lots of bug fixes
○ Stuck NSRE bugs
○ Region client resource leaks
OpenTSDB on Bigtable
● Bigtable
○ Hosted Google Service
○ Client uses HTTP/2 and gRPC for communication
● OpenTSDB heads home
○ Based on a time series store on Bigtable at Google
○ Identical schema to HBase
○ Same filter support (fuzzy filters are coming)
OpenTSDB on Bigtable
● AsyncBigtable
○ Implementation of AsyncHBase’s API for drop-in use
○ https://github.com/OpenTSDB/asyncbigtable
○ Uses HTable API
○ Moving to native Bigtable API
● Thanks to Christos of Pythian, Solomon, Carter, Misha,
and the rest of the Google Bigtable team
● https://www.pythian.com/blog/run-opentsdb-google-bigtable/#
OpenTSDB on Cassandra
● AsyncCassandra - Implementation of AsyncHBase’s
API for drop-in use
● Wraps Netflix’s Astyanax for asynchronous calls
● Requires the ByteOrderedPartitioner and legacy
API
● Same schema as HBase/Bigtable
● Scan filtering performed client side
● May not work with future Cassandra versions
if they drop the API
Community
Salesforce Argus
● Time series monitoring and alerting
● Multi-series annotations
● Dashboards
Thanks to Tom Valine and the Salesforce engineers
https://medium.com/salesforce-open-source/argus-time-series-monitoring-and-alerting-d2941f67864#.ez7mbo3ek
https://github.com/SalesforceEng/Argus
Community
Turn Splicer
● API to shard TSDB queries
● Locality advantage hosting TSDs on region servers
● Query caching
Thanks to Jonathan Creasy and the Turn engineers
https://github.com/turn/splicer
The Future of OpenTSDB
The Future
Reworked query pipeline for selective ordering
of operations
Histogram support
Flexible query caching framework
Distributed queries
Greater data store abstraction
More Information
Thank you to everyone who has helped test, debug and add to OpenTSDB
2.2 and 2.3 including, but not limited to:
Kyle, Ivan, Davide, Liu, Utkarsh, Andy, Anna, Camden, Can, Carlos, Hugo, Isaih, Kevin, Ping, Jonathan
Contribute at github.com/OpenTSDB/opentsdb
Website: opentsdb.net
Documentation: opentsdb.net/docs/build/html
Mailing List: groups.google.com/group/opentsdb
Images
http://photos.jdhancock.com/photo/2013-06-04-212438-the-lonely-vacuum-of-space.html
http://en.wikipedia.org/wiki/File:Semi-automated-external-monitor-defibrillator.jpg
http://upload.wikimedia.org/wikipedia/commons/1/17/Dining_table_for_two.jpg
http://upload.wikimedia.org/wikipedia/commons/9/92/Easy_button.JPG
https://www.flickr.com/photos/verbeeldingskr8/15563333617
http://www.flickr.com/photos/ladydragonflyherworld/4845314274/
http://lego.cuusoo.com/ideas/view/96


Update on OpenTSDB and AsyncHBase

  • 1. HBaseCon Update Distributed, Scalable Time Series Database Chris Larsen clarsen@yahoo-inc.com
  • 2. Who Am I? (no really, who am I?) Chris Larsen Maintainer for OpenTSDB Software Engineer @ Yahoo! Monitoring Team
  • 3. What Is OpenTSDB? Open Source Time Series Database Store trillions of data points Sucks up all data and keeps going Never lose precision Scales using HBase, Cassandra Or Bigtable
  • 4. What good is it? Systems Monitoring & Measurement Servers Networks Sensor Data The Internet of Things SCADA Financial Data Scientific Experiment Results
  • 5. Use Cases Backing store for Argus: Open source monitoring and alerting system 15 HBase Servers 6 month retention 10M writes per minute 95p query latency < 30 days = 200ms Moving to 200 node cluster writing at 100M/m
  • 6. Use Cases ●Monitoring system, network and application performance and statistics 110 region servers, 10M writes/s ~ 2PB Multi-tenant and Kerberos secure HBase ~200k writes per second per TSD Central monitoring for all Yahoo properties Over 2 billion time series served
  • 8. What Are Time Series? Time Series: data points for an identity over time Typical Identity: Dotted string: web01.sys.cpu.user.0 OpenTSDB Identity: Metric: sys.cpu.user Tags (name/value pairs): host=web01 cpu=0
  • 9. What Are Time Series? Data Point: Metric + Tags + Value: 42 + Timestamp: 1234567890 sys.cpu.user 1234567890 42 host=web01 cpu=0 ^ a data point ^
  • 11. Writing Data 1) Open Telnet style socket, write: put sys.cpu.user 1234567890 42 host=web01 cpu=0 2) ..or, post JSON to: http://<host>:<port>/api/put 3) .. or import big files with CLI No schema definition No RRD file creation Just write!
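As a minimal sketch of the two write formats on slide 11, the Python below builds a Telnet-style `put` line and an equivalent JSON body for `/api/put`. The helper names `format_put_line` and `format_put_json` are hypothetical, and the JSON field names (`metric`, `timestamp`, `value`, `tags`) are assumed from the OpenTSDB 2.x HTTP API rather than taken from the slides. Nothing is sent over the network.

```python
import json


def format_put_line(metric, timestamp, value, tags):
    """Build one Telnet-style 'put' command (hypothetical helper)."""
    tag_str = " ".join(f"{k}={v}" for k, v in tags.items())
    return f"put {metric} {timestamp} {value} {tag_str}"


def format_put_json(metric, timestamp, value, tags):
    """Build a JSON body for POST /api/put.

    Field names are an assumption based on the 2.x HTTP API,
    not something stated on the slide.
    """
    return json.dumps(
        {"metric": metric, "timestamp": timestamp, "value": value, "tags": tags},
        sort_keys=True,
    )


# The data point used throughout the slides.
tags = {"host": "web01", "cpu": "0"}
line = format_put_line("sys.cpu.user", 1234567890, 42, tags)
body = format_put_json("sys.cpu.user", 1234567890, 42, tags)
print(line)
print(body)
```

Either string could then be written to the TSD socket or posted to the HTTP endpoint; no schema has to be declared first.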
  • 12. Querying Data Graph with the GUI CLI tools HTTP API Aggregate multiple series Simple query language To average all CPUs on host: start=1h-ago avg sys.cpu.user host=web01
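The query on slide 12 can be sketched as an HTTP API URL. This is an illustration under assumptions: the `/api/query` path with `start` and `m=<aggregator>:<metric>{<tags>}` parameters is assumed from the OpenTSDB 2.x HTTP API, `build_query_url` is a hypothetical helper, and `localhost:4242` is a placeholder host and port.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(host, port, start, aggregator, metric, tags):
    """Build a GET URL for an aggregated query (hypothetical helper)."""
    tag_filter = ",".join(f"{k}={v}" for k, v in tags.items())
    # Sub-query of the form aggregator:metric{tagk=tagv,...}
    m = f"{aggregator}:{metric}{{{tag_filter}}}"
    return f"http://{host}:{port}/api/query?" + urlencode({"start": start, "m": m})


# "Average all CPUs on host web01 over the last hour."
url = build_query_url("localhost", 4242, "1h-ago", "avg", "sys.cpu.user", {"host": "web01"})
params = parse_qs(urlsplit(url).query)
print(url)
print(params["m"][0])
```

Because only `host` is given, every `cpu` value for that host falls into the same average.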
  • 13. HBase Data Tables tsdb - Data point table. Massive tsdb-uid - Name to UID and UID to name mappings tsdb-meta - Time series index and meta-data tsdb-tree - Config and index for hierarchical naming schema
  • 14. Data Table Schema Row key is a concatenation of UIDs and time: metric + timestamp + tagk1 + tagv1… + tagkN + tagvN sys.cpu.user 1234567890 42 host=web01 cpu=0 \x00\x00\x01\x49\x95\xFB\x70\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02 Timestamp normalized on 1 hour boundaries All data points for an hour are stored in one row Enables fast scans of all time series for a metric …or pass a row key filter for specific time series with particular tags
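The row key layout on slide 14 can be reproduced with a short sketch. It assumes 3-byte UIDs and a 4-byte big-endian base timestamp, which is what the example key on the slide implies. The UID numbers and the ordering of tag pairs by tag-name UID are assumptions made here to match that key, and `row_key` is a hypothetical helper, not OpenTSDB code.

```python
import struct

UID_WIDTH = 3    # bytes per UID, as implied by the slide's example key
ROW_SPAN = 3600  # seconds covered by one row (1 hour)


def uid(n):
    """Encode a numeric UID as a fixed-width big-endian byte string."""
    return n.to_bytes(UID_WIDTH, "big")


def row_key(metric_uid, timestamp, tag_uids):
    """Concatenate metric UID, hour-aligned timestamp, and tagk/tagv UID pairs."""
    base_time = timestamp - (timestamp % ROW_SPAN)  # normalize to the hour boundary
    key = uid(metric_uid) + struct.pack(">I", base_time)
    for tagk, tagv in sorted(tag_uids):  # assumed ordering: by tag-name UID
        key += uid(tagk) + uid(tagv)
    return key


# Assumed UID assignments: sys.cpu.user -> 1, host -> 1, web01 -> 1, cpu -> 2, "0" -> 2
key = row_key(1, 1234567890, [(2, 2), (1, 1)])
print(key.hex())
```

Any timestamp within the same hour yields the same key, which is why all data points for that hour land in one row.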
  • 15. New for OpenTSDB 2.2 ● Append writes (no more need for TSD Compactions) ● Row salting and random metric IDs ● Downsampling Fill Policies ● Query filters (wildcard, regex, group by or not) ● Storage Exception plugin for retrying writes ● Released February 2016
  • 16. New for OpenTSDB 2.3 ● Graphite style expressions ● Cross-metric expressions ● Calendar based downsampling ● New data stores ● UID assignment plugin interface ● Datapoint write filter plugin interface ● RC1 released May 2016 ● New Committer, Jonathan Creasy
  • 17. Fuzzy Row Filter How do you find a single time series out of 1 million? For a day? For a month?
  • 18. Fuzzy Row Filter Instead of running a regex string comparator over each byte array formatted key… (?s)^.{9}(?:.{8})*\Q\x00\x00\x00\x02\E(?:\Q\x00\x0F‡\x42\x2B\E)(?:.{8})*$ TSDB query takes 1.6 seconds for 89,726 rows KEY: Match -> m t1 tagk1 tagv1; No Match -> m t1 tagk1 tagv2; No Match -> m t1 tagk1 tagv1 tagk2 tagv3; No Match -> m t1 tagk1 tagv2 tagk2 tagv4; No Match -> m t1 tagk3 tagv5; No Match -> m t1 tagk3 tagv6; Match -> m t2 tagk tagv1; No Match -> m t2 tagk tagv2
  • 19. Fuzzy Row Filter Use a byte mask! ● Use the bloom filter to skip-scan to the next candidate row. ● Combine with regex (after fuzzy filter) to filter further. FuzzyFilter{[FuzzyFilterPair{row_key=[18, 68, -3, -82, 120, 87, 56, -15, 96, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0], mask=[0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]}]} Now it takes 0.239 seconds KEY: Match -> m t1 tagk1 tagv1; No Match -> m t1 tagk1 tagv2; Skip -> m t1 tagk1 tagv1 tagk2 tagv3, m t1 tagk1 tagv2 tagk2 tagv4, m t1 tagk3 tagv5, m t1 tagk3 tagv6; Match -> m t2 tagk tagv1; No Match -> m t2 tagk tagv2
  • 20. Fuzzy Row Filter Pros: ● Can improve scan latency by orders of magnitude ● Combines nicely with other filters Cons: ● All row keys for the match have to be the same, fixed length ● Doesn’t help much when matching the majority of a set OR if a set has uniform key lengths ● Doesn’t support bitmasks, only byte masks
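The byte-mask matching described on slides 19 and 20 can be sketched as follows, using the pattern and mask printed on slide 19. Mask byte 0 means the position must equal the pattern and 1 means any byte is accepted. Reading the 25-byte key as one salt byte, a 4-byte metric UID, a 4-byte timestamp, and two 4-byte tagk/tagv pairs is an interpretation, not something the slide states. `fuzzy_match` is a hypothetical helper that models only the per-row comparison, not the server-side skip to the next candidate row.

```python
def fuzzy_match(row, pattern, mask):
    """Return True if row equals pattern at every position where mask is 0."""
    if len(row) != len(pattern):  # keys of a different length can never match
        return False
    return all(m == 1 or r == p for r, p, m in zip(row, pattern, mask))


# Pattern and mask copied from slide 19 (Java signed bytes converted to 0..255).
signed = [18, 68, -3, -82, 120, 87, 56, -15, 96, 0, 0, 0, 1,
          0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0]
pattern = bytes(b & 0xFF for b in signed)
mask = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0,
        1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]

prefix = pattern[:5]  # fixed bytes (assumed: salt + metric UID)
hit = prefix + bytes([1, 2, 3, 4]) + bytes([0, 0, 0, 1, 0, 0, 0, 9, 0, 0, 0, 2, 0, 0, 0, 7])
wrong_tagk = prefix + bytes([1, 2, 3, 4]) + bytes([0, 0, 0, 1, 0, 0, 0, 9, 0, 0, 0, 3, 0, 0, 0, 7])
extra_tag = hit + bytes(8)  # same series plus one more tag pair

for name, row in [("hit", hit), ("wrong_tagk", wrong_tagk), ("extra_tag", extra_tag)]:
    print(name, fuzzy_match(row, pattern, mask))
```

The `extra_tag` case illustrates the first con on slide 20: a key of a different length never matches, so one mask covers only keys of a single fixed length.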
  • 21. AsyncHBase AsyncHBase is a fully asynchronous, multi-threaded HBase client Supports HBase 0.90 to 1.x Faster and less resource intensive than the native HBase client Support for scanner filters, META prefetch, “fail-fast” RPCs
  • 22. AsyncHBase in YCSB ● New Yahoo! Cloud Serving Benchmark (YCSB) module for testing AsyncHBase ● Test Params: ○ 1 YCSB worker thread with workload A for run and load ○ Ran consecutive Async -> HBase -> Async -> HBase… (new YCSB JVM each run) 50 times ○ HBase 1.0.0 stock Apache with default configs ○ Local host, Macbook Pro ○ 10K rows written/read ○ Async writes for both
  • 23. AsyncHBase in YCSB HBase Client Threads: 238 AsyncHBase Client Threads: 22
  • 26. Upcoming in 1.8 ●Reverse Scanning ●Multi-Get requests ●Netty 4 ●Lots of bug fixes ○Stuck NSRE bugs ○Region client resource leaks
  • 27. OpenTSDB on Bigtable ● Bigtable ○Hosted Google Service ○Client uses HTTP2 and GRPC for communication ● OpenTSDB heads home ○Based on a time series store on Bigtable at Google ○Identical schema as HBase ○Same filter support (fuzzy filters are coming)
  • 28. OpenTSDB on Bigtable ● AsyncBigtable ○Implementation of AsyncHBase’s API for drop-in use ○https://github.com/OpenTSDB/asyncbigtable ○Uses HTable API ○Moving to native Bigtable API ● Thanks to Christos of Pythian, Solomon, Carter, Misha, and the rest of the Google Bigtable team ● https://www.pythian.com/blog/run-opentsdb-google-bigtable/#
  • 29. OpenTSDB on Cassandra ● AsyncCassandra - Implementation of AsyncHBase’s API for drop-in use ● Wraps Netflix’s Astyanax for asynchronous calls ● Requires the ByteOrderedPartitioner and legacy API ● Same schema as HBase/Bigtable ● Scan filtering performed client side ● May not work with future Cassandra versions if they drop the API
  • 30. Community Salesforce Argus ●Time series monitoring and alerting ●Multi-series annotations ●Dashboards Thanks to Tom Valine and the Salesforce engineers https://medium.com/salesforce-open-source/argus-time-series-monitoring-and-alerting-d2941f67864#.ez7mbo3ek https://github.com/SalesforceEng/Argus
  • 31. Community Turn Splicer ●API to shard TSDB queries ●Locality advantage hosting TSDs on region servers ●Query caching Thanks to Jonathan Creasy and the Turn engineers https://github.com/turn/splicer
  • 32. The Future of OpenTSDB
  • 33. The Future Reworked query pipeline for selective ordering of operations Histogram support Flexible query caching framework Distributed queries Greater data store abstraction
  • 34. More Information Thank you to everyone who has helped test, debug and add to OpenTSDB 2.2 and 2.3 including, but not limited to: Kyle, Ivan, Davide, Liu, Utkarsh, Andy, Anna, Camden, Can, Carlos, Hugo, Isaih, Kevin, Ping, Jonathan Contribute at github.com/OpenTSDB/opentsdb Website: opentsdb.net Documentation: opentsdb.net/docs/build/html Mailing List: groups.google.com/group/opentsdb Images http://photos.jdhancock.com/photo/2013-06-04-212438-the-lonely-vacuum-of-space.html http://en.wikipedia.org/wiki/File:Semi-automated-external-monitor-defibrillator.jpg http://upload.wikimedia.org/wikipedia/commons/1/17/Dining_table_for_two.jpg http://upload.wikimedia.org/wikipedia/commons/9/92/Easy_button.JPG https://www.flickr.com/photos/verbeeldingskr8/15563333617 http://www.flickr.com/photos/ladydragonflyherworld/4845314274/ http://lego.cuusoo.com/ideas/view/96