INFLUXDB UNIVERSITY
Intro to InfluxDB
Getting Started Training Series
Brought to you by InfluxDB University
InfluxDB University offers free live and self-paced training on:
• InfluxDB
• Telegraf
• Flux
• Kapacitor
• and more
influxdbu.com
Agenda
• Why do I need a Time Series Database?
• The InfluxDB Platform
• Basic Concepts
• Demo
The age of instrumentation
Instrumentation
of the virtual world
(DevOps)
Sensors
in the physical world
(IoT)
Characteristics of
the data
• Time-stamped
• Generated in regular
(metric) and
irregular (event) time
periods
• Huge volumes
• Real time and time
sensitive
Time series in every application
Infrastructure & data sources
Consumer & Industrial IoT Software Infrastructure
Renewable
& alternative
energy
systems
Manufacturing &
industrial
platforms
Fleet
management
& telematics
Real-time Applications
Developer
Tools
& APIs
Kubernetes
(K8s)
DevOps
Monitoring
Gaming
Applications
Fintech
Applications
Network
Monitoring
TIME SERIES DATA
Rise of time series as a category
RELATIONAL
• Orders
• Customers
• Records
DOCUMENT
• High throughput
• Large document
SEARCH
• Distributed search
• Logs
• Geo
TIME SERIES
• Events, metrics, time stamped
• for IoT, analytics, cloud native
Time series is the fastest-growing
data category by far
Time series
All others
source: DB Engines
Large and Growing Customer Base
1300+
Customers
IOT
OTHER
The InfluxDB Cloud Platform
InfluxDB is 3 things
1. API & Toolset
POWERFUL
for real-time apps
2. Time Series Engine
HIGH PERFORMANCE
for real-time data workloads
3. Community & Ecosystem
MASSIVE
of cloud & open source developers
Flux Language
Functional data scripting language for query, analysis, action
1. Transform data at storage level
instead of application level
2. Use one language across entire
InfluxDB platform
1
Time Series Engine
Run and grow large data workloads at high volume globally
High cardinality
High Throughput
Batch & stream inputs
High & low fidelity
storage
Global service on 3
clouds
Clustered option for
on-premises
PERFORMANCE FLEXIBILITY RELIABILITY
2
Community & Ecosystem
Meet developers where they already build and operate
3
Cloud
Language & Tool
Enables organizations to make
cost–disruptive decisions
on high volumes
of time-sensitive data
• Designed for time series
analysis
• Easy to share, easy to extend
• Multi data source
• Open Source (MIT license)
• Easy to get started, powerful
to scale
InfluxDB – Time Series Platform
Empowers developers to build IoT, analytics, & monitoring software
Core focus: Developers and Builders
• Developer happiness
• Time to awesome
• Ease of scale-out &
deployment
InfluxDB – Time Series Platform
A powerful API & toolset for building real-time apps
Collect using hundreds
of integrations & OSS
tools
Write/Query in multiple
languages built for
real-time data
Abstract using client
libraries for your
preferred language
Manage applications &
account via the
developer console
INFLUXDATA API & TOOLS
DEVELOPER APPLICATIONS
IoT Transactions Analytics
Get started quickly
with more tools
and less code.
• REST API
• OSS integrations
• Cloud delivery
Using InfluxDB
Reference Architecture
Data Sources
Application
Workflows
Infrastructure
Insights
Telegraf
Client Libraries
HTTP
Syslog
Kubernetes
Apache Kafka
Python
Arduino
Node.js
JavaScript
Go
Data Systems
Mobile apps
Web apps
Cloud Services
Devices
Sensors
Databases
Networks
Message Queues
APIs
IoT Platforms
CRMs
InfluxDB Platform
IoT
Actions
InfluxDB
Purpose-Built Time Series Database
Visualization, Query & Task Engine
Collect
Downsample
Trigger
Alert
Transform
…
200+ Plugins
20+ Languages
…
Native Ecosystems
JMeter
NiFi
AWS Kinesis
Azure Event Hubs
GCP PubSub
Java
.NET/C#
PHP
Ruby
Vector
Fluentd
Concepts: Data Model
Bucket
• All InfluxDB data is stored in a bucket. A bucket combines the concept of a database
and a retention period (the duration of time that each data point persists).
Measurement
• A name for a group of data at a high level
Tag set
• A set of key-value pairs to group data at a low level (values are strings)
Field set
• A set of key-value pairs to represent data (values are numbers, strings, or booleans)
Timestamp
• Time of the data with nanosecond precision
Series
• A unique combination of measurement and tag set
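The concepts above can be sketched in a few lines of Python. This is an illustrative model only, not InfluxDB's internal representation: the `Point` type and the helpers `make_point`, `series_key`, and `group_by_series` are hypothetical names. It shows how points that share a measurement and tag set fall into the same series, regardless of the order in which the tags were given.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """One data point (hypothetical type for illustration)."""
    measurement: str
    tags: tuple        # sorted (key, value) pairs; tag values are strings
    fields: tuple      # (key, value) pairs holding the actual data
    timestamp_ns: int  # nanoseconds since the Unix epoch


def make_point(measurement, tags, fields, timestamp_ns):
    # Sorting the tags makes the tag set independent of insertion order.
    return Point(measurement, tuple(sorted(tags.items())),
                 tuple(fields.items()), timestamp_ns)


def series_key(point):
    # A series is identified here by measurement plus tag set.
    return (point.measurement, point.tags)


def group_by_series(points):
    groups = defaultdict(list)
    for p in points:
        groups[series_key(p)].append(p)
    return dict(groups)


a = make_point("cpu_load", {"hostname": "server02", "us_west": "az"}, {"temp": 24.5}, 1)
b = make_point("cpu_load", {"us_west": "az", "hostname": "server02"}, {"temp": 25.0}, 2)
c = make_point("cpu_load", {"hostname": "server03", "us_west": "az"}, {"temp": 22.0}, 1)
groups = group_by_series([a, b, c])
print(len(groups))  # two series: server02 and server03
```

Because every distinct tag combination creates a new series, tag values with many possible values increase the series count (cardinality).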
Line Protocol: Simple but powerful
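• Points are written to InfluxDB using the Line Protocol, which
has the following format (one point per line):
<measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field-key>=<field-value>...] [unix-nano-timestamp]
• The tag set and the timestamp are optional; at least one field is required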
Reference: https://docs.influxdata.com/influxdb/cloud/reference/syntax/line-protocol/
Measurement: cpu_load
Tag Set: hostname=server02,us_west=az
Field Set: temp=24.5,volts=7
Timestamp: 1234567890000000
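To see how the four parts combine into one line, here is a minimal serializer sketch. The function name `to_line_protocol` is hypothetical, and the sketch assumes that the measurement, tag keys, tag values, and field keys contain no spaces, commas, or equals signs (real line protocol requires escaping those characters). Integers get the `i` suffix and strings are double-quoted, as line protocol expects.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Serialize one point to a line protocol string (simplified sketch)."""
    if not fields:
        raise ValueError("a point needs at least one field")

    def fmt_field(value):
        # bool must be checked before int, because bool is a subclass of int.
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, int):
            return f"{value}i"
        if isinstance(value, float):
            return repr(value)
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={fmt_field(v)}" for k, v in fields.items())
    line = f"{measurement}{tag_part} {field_part}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


line = to_line_protocol(
    "cpu_load",
    {"hostname": "server02", "us_west": "az"},
    {"temp": 24.5, "volts": 7.0},
    1234567890000000,
)
print(line)
# cpu_load,hostname=server02,us_west=az temp=24.5,volts=7.0 1234567890000000
```

Note the single spaces: one separates the tag set from the field set, and one separates the field set from the timestamp. No spaces are allowed after the commas.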
Quickly Map your data for ingestion
• Our client libraries include a Point object
• Simply build a point from your data and call write
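Under the hood, a write amounts to sending line protocol to the HTTP API. The sketch below uses only the Python standard library to build (but not send) a request against the `/api/v2/write` endpoint of InfluxDB 2.x and InfluxDB Cloud. The helper name `build_write_request` is hypothetical, and the URL, organization, bucket, and token values are placeholders.

```python
import urllib.parse
import urllib.request


def build_write_request(base_url, org, bucket, token, lines, precision="ns"):
    """Build a POST request that writes line protocol to a bucket."""
    query = urllib.parse.urlencode(
        {"org": org, "bucket": bucket, "precision": precision}
    )
    body = "\n".join(lines).encode("utf-8")  # one point per line
    return urllib.request.Request(
        url=f"{base_url}/api/v2/write?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
    )


request = build_write_request(
    "http://localhost:8086",   # placeholder URL
    "my-org",                  # placeholder organization
    "my-bucket",               # placeholder bucket
    "my-token",                # placeholder API token
    ["cpu_load,hostname=server02 temp=24.5"],
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

A client library's Point object and write call wrap these same steps, adding batching, retries, and escaping.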
Query data
query_data_frame()
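As a rough illustration of what gets passed to a query call such as `query_data_frame()`, the sketch below builds a basic Flux query string. The helper `build_flux_query` is a hypothetical name, and the bucket, measurement, and field values are placeholders. The Flux functions used (`from`, `range`, `filter`) are standard.

```python
def build_flux_query(bucket, measurement, field, start="-1h"):
    """Return a Flux query selecting one field of one measurement."""
    return (
        f'from(bucket: "{bucket}")\n'
        f'  |> range(start: {start})\n'
        f'  |> filter(fn: (r) => r._measurement == "{measurement}")\n'
        f'  |> filter(fn: (r) => r._field == "{field}")'
    )


flux = build_flux_query("my-bucket", "cpu_load", "temp")
print(flux)
# With the Python client library installed and configured, this string
# would be passed to the query API, for example:
#   df = client.query_api().query_data_frame(flux)
```

In the Python client, `query_data_frame()` returns the result as a pandas DataFrame, which is convenient for notebooks and further analysis.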
Zeppelin Notebooks + InfluxData
• A completely open web-based notebook that enables
interactive data analytics.
• Multi-purpose Notebook enables:
• Data Ingestion
• Data Discovery
• Data Analytics
• Data Viz and Collaboration
• Zeppelin InfluxDB interpreter makes querying data even easier
• Built-in Apache Spark integration
Wherever your data is, InfluxDB
Cloud has tools to help you
ingest it quickly
Analyzing your data
• As simple as comparing ingested metrics across
hosts/containers
• As complex as your application needs it to be
• Keys to successful Analytics
• Performance - Run calculations close to the data for the best
performance
• Flexibility - Do not hit the limits of your language
Introducing Flux
A functional language designed for querying, analyzing, and acting
on data.
What can you do with Flux?
• Custom aggregations
• Custom functions
• Source/destination functions
• Joins
• Math across measurements
• Pivot
• Histograms
• Covariance
• Double and Triple Exponential Smoothing
Examples of Anomaly Detection with Flux
• MAD (median absolute deviation) across multiple series to
detect a series that is “deviating from the pack”
• Writing a Naive Bayes classifier from scratch.
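The MAD idea can be illustrated outside of Flux with a short Python sketch. This is not the Flux implementation: it assumes each series has already been reduced to a single summary value (for example, a mean per host), and the function name `mad_outliers` and the threshold of 3 are illustrative choices.

```python
from statistics import median


def mad_outliers(series_values, threshold=3.0):
    """Return the names of series that deviate from the pack.

    series_values maps a series name to one summary value. A series is
    flagged when its distance from the median exceeds threshold * MAD.
    """
    center = median(series_values.values())
    mad = median(abs(v - center) for v in series_values.values())
    if mad == 0:
        # All typical values are identical: anything different is an outlier.
        return [name for name, v in series_values.items() if v != center]
    return [
        name
        for name, v in series_values.items()
        if abs(v - center) / mad > threshold
    ]


cpu_percent = {"host-a": 52, "host-b": 49, "host-c": 51, "host-d": 50, "host-e": 95}
print(mad_outliers(cpu_percent))  # ['host-e']
```

Using the median rather than the mean keeps a single extreme series from distorting the baseline it is compared against.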
Other Analytics Capabilities
• Supports existing InfluxQL users (simple SQL-like syntax)
• Background processing for custom or pre-calculated metrics
• APIs to build custom analytics
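A common form of background processing is downsampling: replacing raw points with one aggregate per time window. The sketch below shows the idea in plain Python, assuming points arrive as `(timestamp_ns, value)` pairs. The function name `downsample_mean` is hypothetical, and in InfluxDB this kind of job would typically run as a scheduled task rather than in application code.

```python
from collections import defaultdict
from statistics import mean

SECOND_NS = 1_000_000_000


def downsample_mean(points, window_ns):
    """Aggregate (timestamp_ns, value) pairs into per-window means."""
    windows = defaultdict(list)
    for ts, value in points:
        window_start = ts - ts % window_ns  # align to the window boundary
        windows[window_start].append(value)
    return [(start, mean(values)) for start, values in sorted(windows.items())]


raw = [
    (0 * SECOND_NS, 1.0),
    (10 * SECOND_NS, 2.0),
    (20 * SECOND_NS, 3.0),
    (60 * SECOND_NS, 10.0),
    (70 * SECOND_NS, 20.0),
]
downsampled = downsample_mean(raw, 60 * SECOND_NS)
print(downsampled)  # [(0, 2.0), (60000000000, 15.0)]
```

Storing the downsampled result in a bucket with a longer retention period is one way to keep high-fidelity data short term and low-fidelity data long term.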
Ingesting data is only valuable if
you can analyze that data
at scale in real-time
Acting on that Data
• Once your data is analyzed,
act on it
• Serve it to your application’s
users
• Alert on your data
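Alerting usually starts with a threshold check on an analyzed value. The sketch below is a minimal illustration: the function `check_level`, the level names, and the threshold values are all hypothetical and are not taken from the slides.

```python
def check_level(value, warn, crit):
    """Map a value to an alert level using two thresholds."""
    if value >= crit:
        return "crit"
    if value >= warn:
        return "warn"
    return "ok"


readings = {"server01": 42.0, "server02": 81.5, "server03": 97.2}
levels = {host: check_level(v, warn=80.0, crit=95.0) for host, v in readings.items()}
print(levels)  # {'server01': 'ok', 'server02': 'warn', 'server03': 'crit'}
```

A notification step would then act only on hosts whose level changed, to avoid repeating the same alert on every evaluation.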
Demo
Enable organizations to make
cost–disruptive decisions
on high volumes
of time-sensitive data
InfluxDB Cloud: the time series platform for your data applications
SUMMARY
Learn more at influxdata.com
Come hang out with us!
Slack Community
www.influxdbu.com
Run apps in
production with
absolute confidence
on the only
purpose-built time
series engine.
High Speed Ingest
via both batch &
streaming
Flexible Schema
learns & adapts
as it goes
High & Low Fidelity
retention &
storage
Managed Functions
hosted in
the cloud
A HIGH-PERFORMANCE ENGINE TO HANDLE REAL-TIME
DATA WORKLOADS
INFLUXDATA API & TOOLS
INFLUXDATA REAL-TIME ENGINE
TIME SERIES
DATABASE
INFLUXDB ENTERPRISE
Self Managed, High Availability
& Secure
priced per node
INFLUXDB OSS
Open source
time series database.
High performing, Schemaless,
Smart extraction of data (raw,
sliding, aggregates)
INFLUXDB CLOUD
Elastic Serverless
Time Series as a Service
pay per use
PRODUCT OFFERINGS
• INFLUXDB PLATFORM
