1. The document provides an overview of InfluxEnterprise, including its core open source functionality, high availability features, scalability, fine-grained authorization, support options, and on-premise or cloud deployment options.
2. It discusses signs that an organization may be ready for InfluxEnterprise, such as high CPU usage, issues with single node deployments, and needing improved data durability or throughput.
3. The document covers InfluxEnterprise cluster architecture including meta nodes, data nodes, replication patterns, ingestion and query rates for different replication configurations, and examples for mothership, durable data ingest, and integrating with ElasticSearch deployments.
2. What we will be covering
✓ Enterprise Overview
✓ Other Features
✓ Ingestion & Query Rates
✓ Deployment Examples
✓ Replication Patterns
✓ General Advice
4. InfluxEnterprise
• Open Source Core
• High Availability
• Scalability
• Fine Grained Authorization
• Support from InfluxData
• OnPremise/Cloud Deployment Options
5. Signs You’re Ready for InfluxEnterprise
1. The sales team starts calling you on weekends
2. Data durability matters
3. Horizontal scaling not providing further benefit
4. Sprawling number of single node deployments
5. Increasing throughput causing write drop errors
6. Your CPU average is >=70%
14. Security
• LDAP Support
– Enterprise customers can configure the database to use LDAP as a backing authentication source for users, roles and permissions.
– Connection between DB and LDAP server secured once connected
• Fine-grained authorization
– Used to control access at a measurement or series level (compared to limiting access at the database level)
– Enable authentication in your configuration file
– Create users through the query API
– Grant users explicit read and/or write privileges
– Set restrictions which define a combination of database, measurement, and tags which cannot be accessed without an explicit grant
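The "create users" and "grant privileges" steps above can be sketched as InfluxQL statements sent through the query API. This is a minimal sketch, not the deck's own example: the helper names, the user `alice`, and the database `telemetry` are hypothetical, and it only covers database-level grants. Measurement- and tag-level restrictions are configured separately and are not shown.

```python
# Hypothetical helpers that build InfluxQL statements for user management.
# The user name, password, and database below are illustrative assumptions.

def quote_ident(name):
    """Double-quote an InfluxQL identifier, escaping backslashes and quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def quote_string(value):
    """Single-quote an InfluxQL string literal, escaping backslashes and quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def create_user(username, password):
    """Build a CREATE USER statement."""
    return f"CREATE USER {quote_ident(username)} WITH PASSWORD {quote_string(password)}"


def grant(privilege, database, username):
    """Build a GRANT statement for READ, WRITE, or ALL on one database."""
    privilege = privilege.upper()
    if privilege not in {"READ", "WRITE", "ALL"}:
        raise ValueError(f"unsupported privilege: {privilege}")
    return f"GRANT {privilege} ON {quote_ident(database)} TO {quote_ident(username)}"


if __name__ == "__main__":
    print(create_user("alice", "s3cret"))
    print(grant("read", "telemetry", "alice"))
```

Each statement would then be submitted to the cluster's query endpoint by whatever client is already in use.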
16. Backup and Restore
• Useful for: Disaster recovery, Debugging, Restoring clusters to a consistent state
• What it does: Creates a copy of the metastore and the shard data
• Backup is compressed and is not human readable
• Export is not compressed but is human readable
• OSS and InfluxEnterprise ARE NOW compatible – aka portable
• Full or partial backup options
• Move data into a new database (with new Retention Policies, etc)
18. Cluster
[Diagram: Database X (Replication=1) across Data Nodes 1 to 4; Shards 1 to 4 hold a, b, c, d, one shard per node]
• ≈ 4x ingest rate
• ≤ 1x concurrent query rate
19. Cluster
[Diagram: Database X (Replication=4) across Data Nodes 1 to 4; every node holds a copy of each shard (a, a’, a’’, a’’’ and b, b’, b’’, b’’’)]
• ≈ 1x ingest rate
• ≈ 4x concurrent query rate
20. Cluster
[Diagram: Database X (Replication=2) across Data Nodes 1 to 4; a and a’ on one pair of nodes, b and b’ on the other pair]
• ≈ 2x ingest rate
• ≤ 2x concurrent query rate
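The three cluster slides follow a simple pattern: with four data nodes, the ingest multiplier is roughly the node count divided by the replication factor, and the concurrent query multiplier is bounded by the replication factor. A minimal sketch of that arithmetic, assuming the pattern generalizes the way the slides suggest (the function name is hypothetical, and these are rough multipliers relative to a single node, not benchmarks):

```python
def approximate_rates(data_nodes, replication_factor):
    """Return (ingest multiplier, query concurrency bound) relative to one node.

    Rule of thumb read off the slides: ingest ~ nodes / RF, queries <= RF.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if data_nodes < replication_factor:
        raise ValueError("need at least as many data nodes as the replication factor")
    ingest_multiplier = data_nodes / replication_factor
    query_multiplier = replication_factor
    return ingest_multiplier, query_multiplier


if __name__ == "__main__":
    for rf in (1, 2, 4):
        ingest, query = approximate_rates(4, rf)
        print(f"RF={rf}: ~{ingest:g}x ingest, up to ~{query}x concurrent queries")
```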
23. Example 1: Mothership
[Diagram components: Data Center 1 and Data Center 2, each with Telegraf, InfluxDB Ent, Kapacitor, and Chronograf; an Enterprise Cluster (Data Node 1 through Data Node n) behind a Firewall/LoadBalancer, with its own Telegraf, Chronograf, and Kapacitor]
24. Example 2: Durable Data Ingest
[Diagram components: Telegraf or other sources, Kafka Queue, Telegraf Cluster (four Telegraf instances), LoadBalancer, InfluxDB Cluster]
• Put each Telegraf instance in the same Kafka Consumer Group
• How Fast is Fast? (ex): Six data nodes at 2.5M values per second
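Placing every Telegraf instance in the same Kafka consumer group matters because Kafka gives each partition to exactly one member of a group, so each message is consumed once rather than once per Telegraf. The sketch below only illustrates that idea with a plain round-robin split. The function and the instance names are hypothetical, and Kafka's real assignment strategies are more involved.

```python
def assign_partitions(partitions, consumers):
    """Split partitions across the members of one consumer group, round-robin.

    Every partition ends up with exactly one consumer, which is the property
    that prevents duplicate writes downstream.
    """
    if not consumers:
        raise ValueError("a consumer group needs at least one member")
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    group = ["telegraf-1", "telegraf-2", "telegraf-3"]  # hypothetical instances
    for consumer, owned in assign_partitions(list(range(6)), group).items():
        print(consumer, owned)
```

If the instances were in different groups instead, each group would receive every message and the cluster would see duplicate writes.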
25. Example 3: Influx with ElasticSearch
[Diagram components: Telegraf, LoadBalancer, InfluxDB Cluster, ElasticSearch, Kapacitor, You; data labeled Metrics and Logs]
• Include common Session ID or other UID
• Query using the common Session ID or UID received from Alert
• Discover trends before and during the Error from metrics
• Perform Root Cause Analysis from Logs
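The correlation step can be sketched as two lookups keyed on the same ID: one InfluxQL query for the metrics and one Elasticsearch term query for the logs. This is a minimal sketch under assumptions: the tag and field name `session_id`, the measurement `requests`, and the helper names are hypothetical, and it only builds the queries without sending them.

```python
import json


def influx_query(measurement, session_id):
    """Build an InfluxQL query selecting points tagged with the session ID."""
    safe_measurement = measurement.replace('"', '\\"')
    safe_id = session_id.replace("\\", "\\\\").replace("'", "\\'")
    return f"SELECT * FROM \"{safe_measurement}\" WHERE \"session_id\" = '{safe_id}'"


def elasticsearch_query(session_id):
    """Build an Elasticsearch query body matching logs with the same session ID."""
    return {"query": {"term": {"session_id": session_id}}}


if __name__ == "__main__":
    alert_session_id = "abc123"  # hypothetical ID carried in the alert
    print(influx_query("requests", alert_session_id))
    print(json.dumps(elasticsearch_query(alert_session_id)))
```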
28. Data Replication
Generally there are two types of data that we care about replicating:
• New Data – Data which is coming from our raw sources
• Derived Data – The output of a SELECT INTO, or TICK script
29. Replication of New Data – Pattern 1
[Diagram components: Cluster 1 and Cluster 2, each with Telegraf, a Firewall/Load Balancer, and Data Node 1 through Data Node n]
30. Replication of New Data – Pattern 2
[Diagram components: Kafka Queue; Cluster 1 and Cluster 2, each with Telegraf, a Firewall/Load Balancer, and Data Node 1 through Data Node n]
31. Replication of Derived Data – Pattern 3
[Diagram components: Cluster 1 and Cluster 2, each with Telegraf, a Load Balancer, Data Nodes, and Kapacitor]
• Uses output of Kapacitor to other cluster
33. General Cluster Advice
• Batch your writes!
• The number of data nodes should be a multiple of your replication factor
• Use a single node of InfluxDB to monitor your cluster
• Put a load balancer in front of each of your data nodes
• Higher replication factors result in higher query concurrency, but higher write latency.
• Use Fine Grained Authorization instead of multiple databases
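As an illustration of the "batch your writes" advice, the sketch below groups points into newline-separated line protocol payloads so that many points go out per request instead of one. It is a minimal sketch under assumptions: the helper names are hypothetical, only float fields are handled, no escaping of special characters is attempted, and the batch size is left to the caller.

```python
def to_line(measurement, tags, fields, timestamp_ns):
    """Format one point as line protocol (float fields only, no escaping)."""
    tag_part = "".join(f",{key}={tags[key]}" for key in sorted(tags))
    field_part = ",".join(f"{key}={float(fields[key])}" for key in sorted(fields))
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


def batches(lines, batch_size):
    """Yield newline-joined payloads of at most batch_size lines each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(lines), batch_size):
        yield "\n".join(lines[start:start + batch_size])


if __name__ == "__main__":
    points = [to_line("cpu", {"host": "a"}, {"usage": 0.5}, ts) for ts in range(5)]
    for payload in batches(points, 2):
        # Each payload would be the body of a single write request.
        print(repr(payload))
```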