Time Series Data with InfluxDB

Working with time series
data with InfluxDB
Paul Dix
@pauldix
paul@influxdb.com

Two kinds of time series
data…

Regular time series
t0 t1 t2 t3 t4 t6 t7
Samples at regular intervals

Irregular time series
t0 t1 t2 t3 t4 t6 t7
Events whenever they come in

Inducing a regular time series
from an irregular one
query: select count(customer_id) from events
where time > now() - 1h
group by time(1m), customer_id
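
As a rough illustration of what that query does conceptually, the Python sketch below floors each irregular event to a one-minute window and counts events per customer. The event tuples and the `bucket_counts` helper are hypothetical, and this is not a description of how InfluxDB executes the query.

```python
from collections import Counter

def bucket_counts(events, interval_s=60):
    """Count events per (window start, customer_id).

    events: iterable of (unix_seconds, customer_id) pairs that arrive at
    irregular times (a made-up sample shape for this sketch).
    """
    counts = Counter()
    for ts, customer_id in events:
        window_start = ts - (ts % interval_s)  # floor to the window boundary
        counts[(window_start, customer_id)] += 1
    return dict(counts)

# Irregular arrivals become one count per minute and customer.
events = [(3, "a"), (17, "a"), (59, "b"), (61, "a"), (130, "b"), (131, "b")]
regular = bucket_counts(events)
```

Each key in `regular` is a regular one-minute slot, so the result can be treated as a regular series even though the input was not.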

Data that you ask
questions about over time

InfluxDB is an open
source distributed time
series database
* still working on the distributed part

Why would you want a
database for time series
data?

Example from DevOps
• 2,000 servers, VMs, containers, or sensor units
• 200 measurements per server/unit
• every 10 seconds
• = 3,456,000,000 distinct points per day
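
The arithmetic behind that total, as a small Python check. The function name is made up for this sketch; the inputs are the numbers from the bullets above. The same numbers imply 400,000 distinct series and 40,000 points written per second.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def points_per_day(units, measurements_per_unit, interval_s):
    """Points written per day when every series is sampled at a fixed interval."""
    samples_per_series = SECONDS_PER_DAY // interval_s  # 8,640 samples at 10 s
    return units * measurements_per_unit * samples_per_series

total = points_per_day(2000, 200, 10)
series = 2000 * 200              # distinct series
per_second = series // 10        # points arriving each second
print(f"{total:,}")
```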

Sharding Data
usually requires application level code

Data retention
application level code and sharding

Retention policies
automatically managed data retention

Continuous queries
for rollups and aggregation

HTTP API - 2 endpoints
Write: HTTP POST /write?db=mydb&rp=foo
Read: HTTP GET /query?db=mydb&rp=foo&q=
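
A minimal Python sketch of building those two URLs with proper query-string encoding. It only constructs the URLs and sends nothing. The base address is the one the CLI slide connects to; the helper names and the sample query are assumptions made for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "http://localhost:8086"  # address shown in the CLI slide

def write_url(db, rp):
    """URL to POST points to."""
    return f"{BASE}/write?" + urlencode({"db": db, "rp": rp})

def query_url(db, rp, q):
    """URL to GET query results from; q is percent-encoded by urlencode."""
    return f"{BASE}/query?" + urlencode({"db": db, "rp": rp, "q": q})

w = write_url("mydb", "foo")
r = query_url("mydb", "foo", "select * from cpu")

# Round-trip check: decoding the query string gives back the original text.
decoded = parse_qs(urlsplit(r).query)["q"][0]
```

Encoding the `q` parameter matters because queries contain spaces, quotes, and operators that are not valid in a raw URL.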

InfluxDB Schema
• Measurements (e.g. cpu, temperature, event, memory)
• Tags (e.g. region=uswest, host=serverA, sensor=23)
• Fields (e.g. value=23.2, info='this is some extra stuff', present=true)
• Timestamp (nanosecond epoch)
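
To show how the four parts fit together, here is a Python sketch of a point and a compact text rendering of it. The rendering is loosely modeled on InfluxDB's line protocol, but it omits escaping and integer handling, so treat it as an illustration, not a spec-accurate encoder. The `Point` class, the helper names, and the sample timestamp are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Point:
    measurement: str
    tags: dict = field(default_factory=dict)     # indexed key/value strings
    fields: dict = field(default_factory=dict)   # the actual values
    timestamp_ns: int = 0                        # nanoseconds since the epoch

def format_value(v):
    """Render a field value as text (no escaping; all numbers become floats)."""
    if isinstance(v, bool):   # check bool first: bool is a subclass of int
        return "true" if v else "false"
    if isinstance(v, str):
        return f'"{v}"'
    return repr(float(v))

def to_line(p):
    """measurement,tag=value,... field=value,... timestamp"""
    tags = "".join(f",{k}={v}" for k, v in sorted(p.tags.items()))
    fields = ",".join(f"{k}={format_value(v)}" for k, v in p.fields.items())
    return f"{p.measurement}{tags} {fields} {p.timestamp_ns}"

p = Point("cpu", {"region": "uswest", "host": "serverA"},
          {"value": 23.2, "present": True}, 1434055562000000000)
line = to_line(p)
```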

All data is indexed by
measurement, tagset,
and time

Influx CLI
$ ./influx
Connected to http://localhost:8086 version 0.9
InfluxDB shell 0.9
>

Create a database
CREATE DATABASE foo

Create a retention policy
CREATE RETENTION POLICY <rp-name> ON <db-name>
DURATION <duration> REPLICATION <n> [DEFAULT]

CREATE RETENTION POLICY high_precision ON mydb
DURATION 7d REPLICATION 3 DEFAULT
DEFAULT: writes go into this RP unless
otherwise specified
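
Conceptually, a retention policy's DURATION is a cutoff: data older than it becomes eligible for removal. The Python sketch below shows only that idea. The helper names and the supported unit letters are assumptions for illustration, and the slides do not say how InfluxDB enforces the cutoff internally.

```python
from datetime import datetime, timedelta, timezone

# Unit letters handled by this sketch only; not a full duration grammar.
UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}

def parse_duration(text):
    """Turn a string such as '7d' into a timedelta (single unit only)."""
    value, unit = int(text[:-1]), text[-1]
    return timedelta(**{UNITS[unit]: value})

def is_expired(point_time, now, duration):
    """True when the point is older than the retention duration."""
    return now - point_time > parse_duration(duration)

now = datetime(2015, 6, 10, tzinfo=timezone.utc)
old = is_expired(now - timedelta(days=8), now, "7d")    # past the cutoff
fresh = is_expired(now - timedelta(days=6), now, "7d")  # still retained
```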

Inverted index
of measurements and tags
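
A toy inverted index in Python, to show why discovery-style lookups by tag can be answered without scanning the data. The `SeriesIndex` class and its series-key format are hypothetical and say nothing about InfluxDB's actual storage layout.

```python
from collections import defaultdict

class SeriesIndex:
    """Maps measurements and (tag key, tag value) pairs to series keys."""

    def __init__(self):
        self.by_measurement = defaultdict(set)
        self.by_tag = defaultdict(set)

    def add(self, measurement, tags):
        parts = [measurement] + [f"{k}={v}" for k, v in sorted(tags.items())]
        key = ",".join(parts)
        self.by_measurement[measurement].add(key)
        for pair in tags.items():
            self.by_tag[pair].add(key)
        return key

    def series_where(self, tag_key, tag_value):
        """Series keys carrying the given tag pair."""
        return sorted(self.by_tag.get((tag_key, tag_value), set()))

    def measurements_where(self, tag_key, tag_value):
        """Measurements that have at least one series with the given tag pair."""
        hits = self.by_tag.get((tag_key, tag_value), set())
        return sorted(m for m, keys in self.by_measurement.items() if keys & hits)

idx = SeriesIndex()
idx.add("cpu", {"host": "serverA", "region": "uswest"})
idx.add("cpu", {"host": "serverB", "region": "useast"})
idx.add("memory", {"host": "serverA", "region": "uswest"})
on_server_a = idx.measurements_where("host", "serverA")
east_series = idx.series_where("region", "useast")
```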

Discovery
SHOW MEASUREMENTS
SHOW MEASUREMENTS where host = 'serverA'
SHOW TAG KEYS
SHOW TAG KEYS from cpu
SHOW TAG VALUES from cpu WITH KEY = 'region'
SHOW SERIES
SHOW SERIES where service = 'redis'

SQL-ish
select * from some_series
where time > now() - 1h

Aggregates
select percentile(90, value) from cpu
where time > now() - 1d
group by time(10m)

Aggregates
select percentile(90, value) from cpu
where time > now() - 1d
group by time(10m), region
Group by a tag
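
What grouping by time and a tag amounts to, sketched in Python: points are split by (10-minute window, region) and a 90th percentile is computed per group. The nearest-rank percentile used here is an assumption chosen for simplicity; the slides do not specify which percentile definition InfluxDB uses. The sample points are invented.

```python
import math
from collections import defaultdict

def percentile(values, p):
    """Nearest-rank percentile: the value at rank ceil(p% of n) in sorted order."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def grouped_percentile(points, p, interval_s):
    """points: (unix_seconds, region, value) tuples."""
    groups = defaultdict(list)
    for ts, region, value in points:
        groups[(ts - ts % interval_s, region)].append(value)
    return {key: percentile(vals, p) for key, vals in groups.items()}

# Ten uswest samples in the first window, two useast samples in different windows.
points = [(i * 60, "uswest", float(i + 1)) for i in range(10)]
points += [(30, "useast", 5.0), (700, "useast", 7.0)]
result = grouped_percentile(points, 90, 600)
```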

Where against Regex (field)
select value from some_log_series
where value =~ /.*ERROR.*/ and
time > '2014-03-01' and time < '2014-03-03'

Where against Regex (tag)
select value from some_log_series
where host =~ /.*asdf.*/ and
time > '2014-03-01' and time < '2014-03-03'
group by host
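
The same two filters expressed with Python's `re` module, as a sketch of what a regex `where` clause selects. The rows and the `where_matches` helper are made up for illustration.

```python
import re

def where_matches(rows, column, pattern):
    """Keep rows whose column value matches the pattern anywhere in the string."""
    rx = re.compile(pattern)
    return [row for row in rows if rx.search(str(row[column]))]

rows = [
    {"host": "web-asdf-1", "value": "GET /index 200"},
    {"host": "db-1", "value": "ERROR: disk full"},
    {"host": "asdf-cache", "value": "ERROR: timeout"},
]
errors = where_matches(rows, "value", r".*ERROR.*")    # filter on a field
asdf_hosts = where_matches(rows, "host", r".*asdf.*")  # filter on a tag
```

With an unanchored search like `re.search`, the leading and trailing `.*` are redundant: the pattern `ERROR` alone selects the same rows.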

Functions
min
max
percentile
first
last
stddev
mean
count
sum
median
distinct
count(distinct)
more soon: difference, histogram, moving_average

Continuous queries
CREATE CONTINUOUS QUERY "10m_event_count"
ON mydb
BEGIN
SELECT count(value)
INTO "6_months".events
FROM events
GROUP BY time(10m)
END;
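
A continuous query like this one behaves roughly like a rollup job that keeps a downsampled series up to date. The Python sketch below counts raw points per 10-minute window and leaves out the window still in progress. Skipping the open window is an assumption of this sketch; the slides do not describe how InfluxDB schedules or recomputes continuous queries. The function name and sample data are hypothetical.

```python
from collections import Counter

def rollup_counts(raw_points, now, interval_s=600):
    """Count raw points per completed window; ignore the window still open at `now`.

    raw_points: (unix_seconds, value) pairs from the raw series.
    Returns a sorted list of (window_start, count) for the downsampled series.
    """
    current_window = now - now % interval_s
    counts = Counter()
    for ts, _value in raw_points:
        window = ts - ts % interval_s
        if window < current_window:
            counts[window] += 1
    return sorted(counts.items())

raw = [(10, 1.0), (599, 2.0), (600, 3.0), (1199, 4.0), (1250, 5.0)]
downsampled = rollup_counts(raw, now=1300)
```

The rolled-up counts can then live in a longer retention policy (such as "6_months" above) while the raw points expire sooner.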

More coming
• Compression
• Clustering
• Custom functions

Thank you!
Paul Dix
@pauldix
paul@influxdb.com
