Dean Sheehan [InfluxData] | InfluxDB Time Series Engine Overview | InfluxDays 2022

Overview of the InfluxDB Time Series Engine
Dean Sheehan - EMEA Field CTO, InfluxData

Hear from InfluxData’s Field CTO as he provides an
overview of the InfluxDB Time Series Engine. Learn more
about the Time Structured Merge Tree (TSM) and how it
relates to the API, Flux, and Tasks. In the session, he will
also provide a sneak peek into the future of the InfluxDB
Time Series Engine.
Dean Sheehan
EMEA Field CTO, InfluxData
As Field CTO, Dean is responsible for ensuring the successful
communication and deployment of InfluxData’s solutions throughout the
world. Based in the UK, Dean is also leading the expansion of
InfluxData’s business throughout Europe. He has more than 25 years of
experience in the technology industry covering consulting, product
development, product management and solution deployment throughout
the retail, financial and telecom industries—with significant expertise in
distributed systems, transactional systems and data center automation.
Dean has a Bachelor’s Degree in Computer Science, and an MBA from
Cambridge University.
InfluxDB Time Series Engine
Overview

Agenda
1. Intro to the InfluxDB Time Series Engine
2. TSM and the API
3. TSM and Flux
4. TSM and Tasks
5. The future of the InfluxDB Time Series Engine

Intro to the InfluxDB Time Series Engine
• At the core of the InfluxDB time series database is the Time
Structured Merge Tree (TSM) storage engine (& format)
• Purpose-built for storing time series data
• Battle tested over many years by a large community of users in a
multitude of scenarios
• Designed to continuously ingest large volumes of new data points
whilst also running real-time queries

Performance
Cardinality   1.3.9 (inmem)   1.5.0 (inmem)   1.5.0 (tsi1)
1M            140K s/sec      140K s/sec      188K s/sec
Series Creation Performance on m4.2xlarge
Series Creation Performance on Threadripper
Cardinality 1.3.9 (inmem) 1.5.0 (inmem) 1.5.0 (tsi1)
• Seriously fast ingest
• It has improved over time
• It has improved with TSI1 over INMEM
• Not sensitive to how much data is stored (TB/PB don't really matter)
• Is sensitive to how many unique series (cardinality) are being recorded
• but wait…
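To make "cardinality" concrete, here is a minimal sketch that counts unique series in a batch of points. It assumes a series is identified by its measurement plus its tag set; the point shape and the helper names (`series_key`, `series_cardinality`) are hypothetical and only for illustration.

```python
from typing import Iterable, Mapping, Tuple

# Hypothetical point shape for this sketch: (measurement, tags).
Point = Tuple[str, Mapping[str, str]]


def series_key(measurement: str, tags: Mapping[str, str]) -> str:
    """Build a canonical key: measurement plus tags sorted by tag key."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"{measurement},{tag_part}" if tag_part else measurement


def series_cardinality(points: Iterable[Point]) -> int:
    """Count the distinct series keys seen in the given points."""
    return len({series_key(m, t) for m, t in points})


points = [
    ("cpu", {"host": "a", "region": "eu"}),
    ("cpu", {"region": "eu", "host": "a"}),  # same series, different tag order
    ("cpu", {"host": "b", "region": "eu"}),
    ("mem", {"host": "a"}),
]
print(series_cardinality(points))  # 3
```

Adding more points to an existing series leaves the count unchanged, while every new tag value combination adds a series.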

Time Structured Merge-Tree (TSM)
• Draws on Log Structured Merge-Tree structure and algorithms
• Organized around time, series and fields (columnar format)
• Data blocks go through compaction and compression stages as
they become colder (less likely to be written to)
• Different compression algorithms for different column types (chosen
dynamically)
• Columnar format is very compression friendly
• Proprietary binary format
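As an illustration of why a columnar layout compresses well, the sketch below delta-encodes a column of regularly spaced timestamps and then run-length encodes the deltas. This is a generic technique shown under simple assumptions, not the actual TSM encoder; all function names are hypothetical.

```python
from typing import List, Tuple


def delta_encode(timestamps: List[int]) -> List[int]:
    """Keep the first value, then store differences between neighbours."""
    if not timestamps:
        return []
    return [timestamps[0]] + [b - a for a, b in zip(timestamps, timestamps[1:])]


def run_length_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeats into (value, count) pairs."""
    runs: List[Tuple[int, int]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand the runs back into deltas, then cumulatively sum them."""
    out: List[int] = []
    total = 0
    for value, count in runs:
        for _ in range(count):
            total += value
            out.append(total)
    return out


timestamps = [1000, 1010, 1020, 1030, 1040]
runs = run_length_encode(delta_encode(timestamps))
print(runs)  # [(1000, 1), (10, 4)]
assert decode(runs) == timestamps
```

Five timestamps collapse into two pairs here because all values of one column sit next to each other and share the same spacing.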

The API and TSM: One API to rule them all
• InfluxDB API exposes the data in the storage layer to users
• ingest (line protocol), query (InfluxQL & Flux), process (Flux Tasks)
• The API is consistent between InfluxDB OSS, Enterprise self-managed
clusters and our Cloud Service
• Move between them as needed
• We have users on our cloud service that then need to support an air-gapped
customer
• We have customers that are building new ventures on single board computers and
have visions of aggregating in a central location
• Move or blend, even synchronise, as needed according to your changing needs.
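The ingest side of the API accepts line protocol, a text format of the shape `measurement,tag=value field=value timestamp`. A minimal sketch of a formatter follows; the helper names are hypothetical, it covers only the common escaping rules, and it does not handle special float values such as NaN.

```python
from typing import Mapping, Optional, Union

FieldValue = Union[float, int, bool, str]


def _escape(text: str, chars: str) -> str:
    """Backslash-escape each character in `chars` wherever it appears."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value: FieldValue) -> str:
    """Render a field value: booleans, integers (i suffix), floats, strings."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(
    measurement: str,
    tags: Mapping[str, str],
    fields: Mapping[str, FieldValue],
    timestamp_ns: Optional[int] = None,
) -> str:
    """Format one point; tags are sorted by key, fields keep their order."""
    if not fields:
        raise ValueError("a point needs at least one field")
    line = _escape(measurement, ", ")
    for key, value in sorted(tags.items()):
        line += "," + _escape(key, ",= ") + "=" + _escape(value, ",= ")
    line += " " + ",".join(
        _escape(key, ",= ") + "=" + _format_field(value)
        for key, value in fields.items()
    )
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


print(to_line_protocol(
    "cpu",
    {"host": "server 01", "region": "eu"},
    {"usage": 0.5, "cores": 4},
    1665000000000000000,
))
# cpu,host=server\ 01,region=eu usage=0.5,cores=4i 1665000000000000000
```

In practice the official client libraries handle this formatting; the sketch only shows what travels over the wire.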

InfluxDB API Facets
[Diagram: API facets (Data Input, Query, Automate, Platform Management), each exposed through the API; other labels: Analyze, Transform, Alert, Downsample, Trigger, InfluxQL]

Flux and the TSM engine
• Flux is a functional language (it can do queries, analytical
transformations and perform actions, e.g. http.post())
• Powerful, flexible, easy to write, easy to read…
• ‘Pushdowns’ push computational work down towards the
storage, speeding up queries
• Flux isn’t just for querying the data held in TSM. Flux allows you
to codify background processes that can do all manner of things
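The idea behind a pushdown can be shown with a toy model: applying a filter inside the storage layer means fewer rows cross the boundary to the query layer. The `ToyStorage` class below is purely hypothetical and does not reflect how Flux or the TSM engine implement pushdowns.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Row = Tuple[int, str, float]  # (timestamp, host, value)


class ToyStorage:
    """A stand-in storage layer that counts the rows it hands out."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.rows_returned = 0

    def scan(self, predicate: Optional[Callable[[Row], bool]] = None) -> Iterator[Row]:
        """Yield rows, applying the predicate inside storage if one is given."""
        for row in self.rows:
            if predicate is None or predicate(row):
                self.rows_returned += 1
                yield row


rows: List[Row] = [(t, "a" if t % 2 else "b", float(t)) for t in range(1000)]

# Without pushdown: storage returns everything, the caller filters afterwards.
storage = ToyStorage(rows)
no_pushdown = [r for r in storage.scan() if r[1] == "a"]
transferred_without = storage.rows_returned

# With pushdown: the same filter runs inside the storage layer.
storage = ToyStorage(rows)
with_pushdown = list(storage.scan(lambda r: r[1] == "a"))
transferred_with = storage.rows_returned

assert no_pushdown == with_pushdown
print(transferred_without, transferred_with)  # 1000 500
```

Both approaches give the same result; the difference is how much data has to leave storage to get there.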

Tasks and the TSM: Your Heavy Lifting Friend
• Perform transformations & operations on raw data (Downsampling &
Precomputing)
• Monitor and look for conditions to trigger actions
• Headless, automated & scheduled
[Diagram: Raw Data -> Transformed or Downsampled Data, or Actionable insights]
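Downsampling, the typical job for a task, can be sketched as grouping raw points into fixed time windows and keeping one aggregate per window. The function below is a hypothetical plain-Python illustration of that idea, not a Flux task; timestamps are assumed to be in seconds.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def downsample_mean(
    points: List[Tuple[int, float]], window_s: int
) -> List[Tuple[int, float]]:
    """Average values per window; each window is labelled by its start time."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % window_s].append(value)
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]


raw = [(0, 1.0), (20, 2.0), (40, 3.0), (60, 10.0), (90, 20.0)]
print(downsample_mean(raw, 60))  # [(0, 2.0), (60, 15.0)]
```

A scheduled task would run a transformation like this on recent raw data and write the smaller result to another bucket.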

The Next Generation InfluxDB Time Series Engine
• Openness: persist using Apache Parquet files in Object Storage
• Speed: in-memory columnar using Apache Arrow
• Access: polyglot language support
• Native SQL - enabling the installed ecosystem and tools
• Flux - flexibility & extensibility (Flux can do more than query).
• InfluxQL - allowing existing workloads to move forward and benefit
• Scale: unlock ludicrous cardinality

[Architecture diagram, "Powered by IOx"; labels: API Tier, Writes, Queries (InfluxQL, Flux, SQL), Catalog, Kafka, Ingesters (x3), Queriers (x3), Compactors (x3), Object Store (Parquet)]

Additional Resources
Free InfluxDB: OSS or Cloud - influxdata.com/cloud
Forums: community.influxdata.com
Slack: influxcommunity.slack.com
Reddit: r/InfluxDB
Influx Community (GH): github.com/InfluxCommunity
Book: awesome.influxdata.com
Docs: docs.influxdata.com
Blogs: influxdata.com/blog
InfluxDB University: influxdata.com/university
How-to guides: docs.influxdata.com/resources/how-to-guides/
