Time series storage in Cassandra

Time Series Storage
Cassandra London Meetup
April 7, 2014
Eric Evans
eevans@opennms.org
@jericevans

OpenNMS: What It Is
● Network Management System
○ Discovery and Provisioning
○ Service monitoring
○ Data collection
○ Event management and notifications
● Java, open source, GPLv3
● Since 1999

RRDTool
● Round robin database
● First released 1999
● Time-series storage
● File-based
● Constant-size
● Automatic, amortized aggregation

Consider
● 2 IOPS per update (read-update-write)
● 1 RRD per data source (storeByGroup=false)
● 100,000s of data sources, 1,000s IOPS
● 1,000,000s of data sources, 10,000s IOPS
● 15,000 RPM SAS drive, ~175-200 IOPS
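
The arithmetic behind these bullets can be sketched as below. This is a rough back-of-the-envelope model, not a benchmark: the 300-second collection interval and the exact data source counts (300,000 and 3,000,000) are assumptions chosen for illustration, the function names are hypothetical, and caching and RAID effects are ignored.

```python
import math


def required_iops(data_sources, interval_seconds, iops_per_update=2):
    """Sustained IOPS when every data source is updated once per interval."""
    return data_sources * iops_per_update / interval_seconds


def drives_needed(iops, iops_per_drive=175):
    """Whole drives needed, using the low end of the 175-200 IOPS range."""
    return math.ceil(iops / iops_per_drive)


# Assumed: one update per data source every 300 seconds.
for sources in (300_000, 3_000_000):
    iops = required_iops(sources, interval_seconds=300)
    print(f"{sources:>9,} data sources: {iops:>8,.0f} IOPS, "
          f"{drives_needed(iops)} drives")
```

Under these assumptions, hundreds of thousands of data sources already need more random I/O than a handful of spinning disks can supply.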

Also
● Not everything is a graph
● Inflexible
● Incremental backups impractical
● ...

Observation #1
We collect and write a great deal; we read
(graph) relatively little.
Yet we are optimized for reading everything,
always.

Observation #2
Samples are naturally collected and graphed
together in groups.
Grouping samples that are accessed together
is an easy optimization.

Project: Newts
Goals:
● Stand-alone time-series data store
● High-throughput
● Horizontally scalable
● Grouped metric storage/retrieval
● Late-aggregating

Cassandra
Why:
● Write-optimized
● Sorted
● Horizontally scalable (linear)

Gist
● Samples stored as-is.
● Samples can be retrieved as-is.
● Measurements are aggregations calculated
from samples (at time of query).
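
A minimal sketch of late aggregation follows, assuming samples are simple (timestamp, value) pairs and that a measurement is one aggregate per fixed-width time bucket. The `aggregate` function, its parameters, and the sample values are hypothetical and only illustrate the idea; this is not the Newts implementation.

```python
from collections import defaultdict
from statistics import mean

# Raw samples as (timestamp in seconds, value); values are made up.
samples = [
    (1396289065, 17.2),
    (1396289365, 17.6),
    (1396289665, 18.1),
    (1396289965, 18.4),
]


def aggregate(samples, start, end, resolution, func=mean):
    """Return one (bucket_start, aggregate) pair per resolution-second bucket.

    Only samples with start <= timestamp < end are considered, and nothing
    is precomputed: the aggregation happens when the query is made.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        if start <= timestamp < end:
            offset = (timestamp - start) // resolution
            buckets[start + offset * resolution].append(value)
    return [(bucket, func(values)) for bucket, values in sorted(buckets.items())]


# The same stored samples answer different questions at query time.
averages = aggregate(samples, 1396289000, 1396290200, resolution=600)
maximums = aggregate(samples, 1396289000, 1396290200, resolution=600, func=max)
```

Because the raw samples are kept as-is, the resolution and the aggregate function are chosen by the reader at query time rather than fixed at write time.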

Sample
{
  "resource" : "london",
  "timestamp" : 1396289065,
  "name" : "meanTemp",
  "type" : "GAUGE",
  "value" : 17.2,
  "attributes" : { "units": "celsius" }
}

Samples
CREATE TABLE newts.samples (
    resource     text,
    collected_at timestamp,
    metric_name  text,
    metric_type  text,
    value        blob,
    attributes   map<text, text>,
    PRIMARY KEY (resource, collected_at, metric_name)
);

Behind the scenes...
london:
  (2014-03-31 18:04:25, dewPoint): 0xc01a0000
  (2014-03-31 18:04:25, maxTemp):  0x40280000
  ...
(entries stored in ascending order)
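
The layout above can be modeled in memory as one sorted list per resource, ordered by (collected_at, metric_name). The sketch below is a simplification for illustration only: it is not Cassandra's storage engine or any driver API, the `insert` and `select` helpers are hypothetical, the second collection time is made up, and the values are treated as opaque blobs.

```python
from bisect import bisect_left
from datetime import datetime

# resource -> list of ((collected_at, metric_name), value), kept sorted,
# mirroring PRIMARY KEY(resource, collected_at, metric_name).
partitions = {}


def insert(resource, collected_at, metric_name, value):
    """Upsert one sample into its resource's partition, keeping sort order."""
    rows = partitions.setdefault(resource, [])
    key = (collected_at, metric_name)
    # (key,) sorts just before any (key, value), so values are never compared.
    index = bisect_left(rows, (key,))
    if index < len(rows) and rows[index][0] == key:
        rows[index] = (key, value)
    else:
        rows.insert(index, (key, value))


def select(resource, start, end):
    """Return rows with start <= collected_at < end as one contiguous slice."""
    rows = partitions.get(resource, [])
    low = bisect_left(rows, ((start, ""),))
    high = bisect_left(rows, ((end, ""),))
    return rows[low:high]


t1 = datetime(2014, 3, 31, 18, 4, 25)
t2 = datetime(2014, 3, 31, 18, 9, 25)  # hypothetical later collection

# Insert out of order; the partition still ends up sorted.
insert("london", t2, "maxTemp", bytes.fromhex("40280000"))
insert("london", t1, "maxTemp", bytes.fromhex("40280000"))
insert("london", t1, "dewPoint", bytes.fromhex("c01a0000"))

group = select("london", t1, t2)
```

Because all metrics collected at the same time for a resource sit next to each other, a group of samples, or a time range of them, comes back as a single contiguous read.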

http://github.com/OpenNMS/newts
