Xephon-K
A lightweight TSDB with multiple backends
Pinglei Guo https://github.com/xephonhq/xephon-k
Agenda
● Overview
● Time Series Data Revisited
● Time Series Database state of the art
● Xephon-K Design
● Xephon-K Implementation
● Evaluation
● Lessons learned
● Related & Future work
● Conclusion
Overview
● Written in Golang (1,700 loc including bench and test)
● Use Cassandra as main backend
● Simple data model
● It is working
Time Series Data Revisited
NOT just data with timestamp
‘What happened, happened
and couldn’t have happened
another way’
- The Matrix
Time Series Data Revisited
Name    Saving  Update time
Rabbit  $100    2017/03/20:12:59:33
Tiger   $250    2017/03/20:12:59:33
Single record, update in place, tell current state

Name    Daily Transaction  Date
Rabbit  +$100,000          2017/03/19
Rabbit  -$99,900           2017/03/20
Tiger   +$125              2017/03/19
Tiger   +$125              2017/03/20
A series of events, immutable, tell the history
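The contrast between the two tables can be sketched in Go: the transaction log is append-only, and the "current state" table is just a fold over it. The `Transaction` type and `Balances` function are names made up for this illustration; they are not part of Xephon-K.

```go
package main

import "fmt"

// Transaction is one immutable event (hypothetical type for illustration).
type Transaction struct {
	Name   string
	Amount int // dollars; positive is a deposit, negative a withdrawal
	Date   string
}

// Balances derives the current state by summing the event log.
// The log itself is only read, never updated in place.
func Balances(log []Transaction) map[string]int {
	state := make(map[string]int)
	for _, tx := range log {
		state[tx.Name] += tx.Amount
	}
	return state
}

func main() {
	log := []Transaction{
		{"Rabbit", 100000, "2017/03/19"},
		{"Rabbit", -99900, "2017/03/20"},
		{"Tiger", 125, "2017/03/19"},
		{"Tiger", 125, "2017/03/20"},
	}
	b := Balances(log)
	fmt.Println(b["Rabbit"], b["Tiger"]) // prints "100 250"
}
```

The savings table can always be rebuilt from the transactions, but not the other way around: that Rabbit briefly held $100,000 is visible only in the history.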
Time Series Database state of the art
Xephon-K Cassandra Yes Golang at15 N/A 1
Full list on: https://github.com/xephonhq/awesome-time-series-database
Xephon-K Design
Xephon-K Implementation
● Naive schema and Cassandra data model
● Internal representation
● In Memory storage
● API
Xephon-K Implementation - Naive schema
metric_name metric_timestamp value
cpu 2017/03/17:13:24:00:20 10.2
cpu 2017/03/17:13:25:00:00 3.3
cpu 2017/03/17:13:26:00:00 5.6
mem 2017/03/17:13:24:00:20 80.3
mem 2017/03/17:13:25:00:00 60.2
mem 2017/03/17:13:26:00:00 90.3
cqlsh> SELECT * FROM metrics
Xephon-K Implementation - Naive schema
name metric_timestamp val
cpu 2017/03/17:13:24:00:20 10.2
cpu 2017/03/17:13:25:00:00 3.3
cpu 2017/03/17:13:26:00:00 5.6
mem 2017/03/17:13:24:00:20 80.3
mem 2017/03/17:13:25:00:00 60.2
mem 2017/03/17:13:26:00:00 90.3
The table is an abstraction of underlying map
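One way to picture "the table is an abstraction of underlying map" is a nested Go map. This is a sketch only: it assumes the metric name is the partition key and the timestamp is the clustering key, which the slide does not state, and the `Table`, `Partition` and `Rows` names are invented here.

```go
package main

import (
	"fmt"
	"sort"
)

// Partition maps a clustering key (the timestamp) to a value.
type Partition map[string]float64

// Table maps a partition key (the metric name) to its partition.
type Table map[string]Partition

// Rows flattens one partition back into the row view that cqlsh shows,
// ordered by timestamp the way a clustering key would order it.
func (t Table) Rows(name string) []string {
	p := t[name]
	keys := make([]string, 0, len(p))
	for ts := range p {
		keys = append(keys, ts)
	}
	sort.Strings(keys)
	rows := make([]string, 0, len(keys))
	for _, ts := range keys {
		rows = append(rows, fmt.Sprintf("%s %s %.1f", name, ts, p[ts]))
	}
	return rows
}

func main() {
	metrics := Table{
		"cpu": {
			"2017/03/17:13:24:00:20": 10.2,
			"2017/03/17:13:25:00:00": 3.3,
			"2017/03/17:13:26:00:00": 5.6,
		},
		"mem": {
			"2017/03/17:13:24:00:20": 80.3,
			"2017/03/17:13:25:00:00": 60.2,
			"2017/03/17:13:26:00:00": 90.3,
		},
	}
	for _, row := range metrics.Rows("cpu") {
		fmt.Println(row)
	}
}
```

Reading all points of one metric is then a single outer-map lookup followed by an ordered scan, which is the access pattern a time series query needs.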
Xephon-K Implementation - Internal representation
type IntPoint struct {
	T int64
	V int
}
type DoublePoint struct {
	T int64
	V float64 // Go has no double type; float64 is the 64-bit float
}
type IntSeries struct {
	Name   string
	Tags   map[string]string
	Points []IntPoint
}
type DoubleSeries struct {
	Name   string
	Tags   map[string]string
	Points []DoublePoint
}
Xephon-K Implementation - In Memory storage
type Data map[SeriesID]*IntSeriesStore
type IntSeriesStore struct {
	mu     sync.RWMutex
	series common.IntSeries
	length int
}
type Index []IndexRow
type IndexRow struct {
	key      string
	value    string
	seriesID SeriesID
}
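A minimal sketch of how these types might be used: writes take the per-series lock, and a tag lookup scans the index. The `WritePoints`, `Len` and `Lookup` methods are assumptions made for illustration, `SeriesID` is assumed to be a string, and the `common` types are inlined so the block compiles on its own; the real Xephon-K code may differ.

```go
package main

import (
	"fmt"
	"sync"
)

type SeriesID string // assumption: the slide does not show this type

type IntPoint struct {
	T int64
	V int
}

type IntSeries struct {
	Name   string
	Tags   map[string]string
	Points []IntPoint
}

type Data map[SeriesID]*IntSeriesStore

type IntSeriesStore struct {
	mu     sync.RWMutex
	series IntSeries
	length int
}

// WritePoints appends a batch while holding the exclusive lock.
func (s *IntSeriesStore) WritePoints(pts []IntPoint) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.series.Points = append(s.series.Points, pts...)
	s.length = len(s.series.Points)
}

// Len reads the point count under the shared lock.
func (s *IntSeriesStore) Len() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.length
}

type Index []IndexRow

type IndexRow struct {
	key      string
	value    string
	seriesID SeriesID
}

// Lookup returns every series tagged key=value using a linear scan.
func (idx Index) Lookup(key, value string) []SeriesID {
	var ids []SeriesID
	for _, row := range idx {
		if row.key == key && row.value == value {
			ids = append(ids, row.seriesID)
		}
	}
	return ids
}

func main() {
	id := SeriesID("archive_file_tracked{host=server1}")
	data := Data{id: &IntSeriesStore{}}
	index := Index{{key: "host", value: "server1", seriesID: id}}
	data[id].WritePoints([]IntPoint{{1359788400000, 123}, {1359788300000, 13}})
	for _, found := range index.Lookup("host", "server1") {
		fmt.Println(found, data[found].Len()) // prints "archive_file_tracked{host=server1} 2"
	}
}
```

The per-series lock only protects points within one series; creating new series concurrently would also need a lock around the `Data` map itself, which this sketch leaves out.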
Xephon-K Implementation - API Write
[
  {
    "name": "archive_file_tracked",
    "tags": {
      "host": "server1",
      "data_center": "DC1"
    },
    "points": [
      [1359788400000, 123],
      [1359788300000, 13],
      [1359788410000, 23]
    ]
  }
]
http://localhost:2333/write
Array form (used):
"points": [
  [1359788400000, 123],
  [1359788300000, 13]
]
Object form (not used):
"points": [
  {"t": 1359788400000, "v": 123},
  {"t": 1359788300000, "v": 13}
]
Use arrays instead of objects for points; all numeric values are JSON numbers
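Decoding the array form in Go needs a custom unmarshaller, because `encoding/json` maps structs to JSON objects by default. The sketch below is one way to do it and is not taken from the Xephon-K source; the `Decode` helper and the struct tags are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type IntPoint struct {
	T int64
	V int
}

// UnmarshalJSON decodes a point written as a two-element array [t, v].
func (p *IntPoint) UnmarshalJSON(b []byte) error {
	var pair []int64
	if err := json.Unmarshal(b, &pair); err != nil {
		return err
	}
	if len(pair) != 2 {
		return fmt.Errorf("point must be [t, v], got %d elements", len(pair))
	}
	p.T, p.V = pair[0], int(pair[1])
	return nil
}

type IntSeries struct {
	Name   string            `json:"name"`
	Tags   map[string]string `json:"tags"`
	Points []IntPoint        `json:"points"`
}

// Decode parses a write request body (hypothetical helper).
func Decode(body []byte) ([]IntSeries, error) {
	var series []IntSeries
	err := json.Unmarshal(body, &series)
	return series, err
}

func main() {
	body := []byte(`[{
		"name": "archive_file_tracked",
		"tags": {"host": "server1", "data_center": "DC1"},
		"points": [[1359788400000, 123], [1359788300000, 13], [1359788410000, 23]]
	}]`)
	series, err := Decode(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(series[0].Name, len(series[0].Points), series[0].Points[0])
	// prints "archive_file_tracked 3 {1359788400000 123}"
}
```

The array form drops the repeated "t" and "v" keys from every point, which shrinks the payload for large batches at the cost of this small amount of decoding code.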
Evaluation Environment Setup
● i7-6700 CPU @ 3.40GHz 32 GB RAM HDD Ubuntu 16.10 ( kernel 4.8.0-39 )
● Docker 1.13 without resource limits on container
● InfluxDB 1.2
● KairosDB 1.12 + Cassandra 2.2
● Xephon-K (Go 1.7.4) + Cassandra 3.10
● Write to one series with one tag `cpi{agent:xephon-bench}` with fixed value
● Batch size 100 points, client timeout 30 seconds
● No QPS limit, No retry, No backoff
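The load pattern described above (fixed worker count, fixed batch size, no QPS limit, no retry, no backoff) can be sketched as follows. This is not the actual benchmark code: `RunBench` and `WriteFunc` are hypothetical names, and the write function stands in for an HTTP POST so the example runs without a database.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// WriteFunc sends one batch of batchSize points (stand-in for an HTTP POST).
type WriteFunc func(batchSize int) error

// RunBench starts the given number of workers. Each one calls write back to
// back until d has elapsed. A failed request is counted and never retried.
func RunBench(workers, batchSize int, d time.Duration, write WriteFunc) (ok, failed int64) {
	deadline := time.Now().Add(d)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for time.Now().Before(deadline) {
				if err := write(batchSize); err != nil {
					atomic.AddInt64(&failed, 1) // no retry, no backoff
					continue
				}
				atomic.AddInt64(&ok, 1)
			}
		}()
	}
	wg.Wait()
	return ok, failed
}

func main() {
	// Fake backend that takes about one millisecond per request.
	write := func(batchSize int) error {
		time.Sleep(time.Millisecond)
		return nil
	}
	ok, failed := RunBench(10, 100, 100*time.Millisecond, write)
	fmt.Println("ok:", ok, "failed:", failed)
}
```

Without a rate limit, each worker issues its next request as soon as the previous one returns, so the total request count mostly reflects server latency under that level of concurrency.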
Evaluation - Throughput
Database  Total Requests
XKM       12327
XKC       7931
KairosDB  15561
InfluxDB  118
5 seconds, 10 workers
● InfluxDB performance is extremely poor (my bad?)
● KairosDB outperformed Xephon-K (K is from KairosDB …)
● Prometheus can’t be benchmarked this way (it pulls metrics; no HTTP write API)
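For a rough sense of scale, the request counts can be converted to points per second. The conversion assumes every counted request carried a full 100-point batch and that the run lasted exactly 5 seconds; the slides do not confirm either, so treat the numbers as an estimate. `PointsPerSecond` is a helper made up for this example.

```go
package main

import "fmt"

// PointsPerSecond converts a request count into points written per second.
func PointsPerSecond(requests, batchSize, seconds int) int {
	return requests * batchSize / seconds
}

func main() {
	results := []struct {
		name     string
		requests int
	}{
		{"XKM", 12327},
		{"XKC", 7931},
		{"KairosDB", 15561},
		{"InfluxDB", 118},
	}
	for _, r := range results {
		fmt.Printf("%-8s %7d points/s\n", r.name, PointsPerSecond(r.requests, 100, 5))
	}
}
```

Under those assumptions the counts work out to 246,540 (XKM), 158,620 (XKC), 311,220 (KairosDB) and 2,360 (InfluxDB) points per second.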
Evaluation Analysis
Q: Why is InfluxDB so slow?
A: Good question, I am still figuring it out (see #15). You can’t blame Docker; running it locally gives the same result
Q: Why is KairosDB faster? Java > Golang?
● lock
● Buffer (batch size)
Q: That’s it?
A: Bingo! But https://github.com/xephonhq/xephon-k/tree/master/doc/bench
has a bunch of results I haven’t dealt with
Q: The chart looks good, what are you using?
A: echarts3 http://echarts.baidu.com/ (One JavaScript a day, Keep Microsoft Excel away)
Lessons learned
● Write ugly code and make things work
● Hardware improves productivity: double the monitors, double the LOC/hr
● Source code is your best friend; don’t blindly believe what people say in the doc, blog, conference, paper, twitter, stackoverflow
Related work
Xephon-B: A TSDB benchmark tool and benchmark result sharing platform
● https://github.com/xephonhq/xephon-b
● Is a never finished course project with @zchen
Reika: A DSL for TSDB
● https://github.com/xephonhq/tsdb-proxy-java/tree/master/ql
● Is also a course project (number two)
Xephon-K: I am course project three QvQ
Future work
● Refactor (every day I am blaming the code of yesterday)
● Storage without Cassandra (yeah, this is course project four)
● Dashboard
● Benchmark driven development using Xephon-B
Acknowledgement
● Zheyuan Chen and Prof. Peter Alvaro for Xephon-B
● Chujiao Hou for Reika
Conclusion
● Time series data is a series of immutable data points, it tells history
● CQL is an illusion created for RDBMS people
● Cassandra is a map of maps that contains maps
● http://echarts.baidu.com/ is a good charting library
● Ugly code works, perfect is the enemy of deadline (well, video games to be honest)
● Xephon-K is awesome
● What people say in their presentation may not be true, use the source, Luke
Thank You!
No question, please, just let me go.
