INTERFACE by apidays 2023
APIs for a “Smart” economy. Embedding AI to deliver Smart APIs and turn into an exponential organization
June 28 & 29, 2023
Data Collection Basics
Anais Dotis-Georgiou, Lead Developer Advocate at InfluxData
1. InfluxDB University
Data Collection Basics
Getting Started Training Series
Anais Dotis-Georgiou
Developer Advocate, InfluxData
3. Brought to you by InfluxDB University
InfluxDB University offers free live and self-paced training on:
• InfluxDB
• Telegraf
• Flux
• Kapacitor
• and more
Explore the course catalog: influxdbu.com
4. Agenda
• What is Telegraf
• Plugin Ecosystem
• Getting Started with Telegraf
• Extending the Ecosystem
6. Characteristics of the data
• Time-stamped
• Generated at regular (metric) and irregular (event) time intervals
• Huge volumes
• Real time and time sensitive
7. InfluxDB is 3 things
• Time Series Engine: HIGH PERFORMANCE for real-time data workloads
• API & Toolset: POWERFUL for real-time apps
• Community & Ecosystem: MASSIVE community of cloud & open source developers
10. Data Collection Options
Agent-based Push (aka Telegraf)
• 300+ Telegraf plugins
• Regular cadence of releases
• Why use it?
○ No code
○ Large community
○ Lightweight but powerful
○ Customizable
Client Libraries
• 12 libraries: Python, C#, Java, Go, JavaScript/Node.js, Ruby, PHP, et al.
• Handle batching, chunking, setting the right headers, etc.
• Why use them?
○ Easy way to get started
○ Needed when building custom applications
Agentless Pull (aka Scrapers)
• Prometheus scraper (OSS only)
• Flux prometheus.from
• Flux csv.from(url)
• Why use them?
○ Get data in quickly
○ Doesn’t require agent downloads on the monitored device
Native/Ecosystem
• Source system speaks line protocol
• Examples: JMeter, NiFi, Vector, Fluentd
• Influx CLI CSV import
• Why use them?
○ Know what you want to monitor, quick and easy integration
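To illustrate what a client library takes care of (batching, chunking, setting the right headers), here is a minimal Python sketch using only the standard library. It assumes the InfluxDB v2 HTTP write endpoint `/api/v2/write`; the URL, org, bucket, and token values and the helper names `chunk` and `build_write_request` are placeholders invented for this example, not part of the talk. The requests are built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def chunk(lines, size):
    """Split a list of line-protocol strings into batches of at most `size`."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def build_write_request(base_url, org, bucket, token, batch):
    """Build (but do not send) one HTTP write request for a batch of lines."""
    query = urlencode({"org": org, "bucket": bucket, "precision": "ns"})
    body = "\n".join(batch).encode("utf-8")
    return Request(
        f"{base_url}/api/v2/write?{query}",
        data=body,
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
        method="POST",
    )


# Hypothetical sample data: three points already encoded as line protocol.
lines = [
    "cpu,host=server01 usage_idle=97.5 1687939200000000000",
    "cpu,host=server01 usage_idle=96.1 1687939210000000000",
    "cpu,host=server01 usage_idle=98.2 1687939220000000000",
]

# Placeholder connection details; replace with real values before sending.
write_requests = [
    build_write_request("http://localhost:8086", "my-org", "my-bucket", "MY_TOKEN", batch)
    for batch in chunk(lines, 2)
]
```

In practice an official client library would do this work (plus retries and flushing) for you; the sketch only shows the kind of bookkeeping the slide refers to.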
11. Telegraf provides the benefits of…
• Low/No code
• Robust scheduler
• High-speed ingestion
• Full-streaming support
• Metric routing
• Flexible parsing, formatting, serializing
• Customizable and extensible (ExecD plugins, Starlark)
Instead of…
• Writing long data scraping scripts
• Worrying about unreliable data collection
• Struggling to scale your data collection
• Ending up with messy data
• Having a lot of unnecessary data in your database
12. Telegraf: Agent for Collecting Metrics & Events
Plugin-driven server agent for collecting and reporting metrics
• Written in Go
• Single binary, no external dependencies
• Minimal memory footprint
• Optimized for writing to InfluxDB
• Optimized for streaming data
[Diagram labels: Telegraf (300+ plugins); HTTP, Syslog, Kubernetes, Apache Kafka, AWS Kinesis, Azure Event Hubs, GCP PubSub; InfluxDB, purpose-built time series database; Collect, Downsample, Transform]
13. Core Telegraf functionality
• Robust scheduler
• Adjustments for clock-drift
• Adjustments for job scheduling issues that may occur
• In-memory metric buffers
• Metric tracking with flow back-pressure in plugins like Kafka
• Full-streaming support
• Metric routing: name & field pass & drop
• Flexible parsing, formatting, serializing
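The name pass and drop routing mentioned above can be pictured with a small Python sketch. The semantics here are an assumption made for illustration: a metric name must match at least one glob in `namepass` (when that list is non-empty) and is then discarded if it matches any glob in `namedrop`. The function name `should_keep` and the sample metric names are hypothetical.

```python
from fnmatch import fnmatchcase


def should_keep(name, namepass=(), namedrop=()):
    """Return True if a metric name survives the pass and drop filters."""
    # Pass filter: only applied when at least one pattern is configured.
    if namepass and not any(fnmatchcase(name, p) for p in namepass):
        return False
    # Drop filter: any match removes the metric.
    if any(fnmatchcase(name, p) for p in namedrop):
        return False
    return True


names = ["cpu", "mem", "disk", "diskio", "docker_container_cpu"]
kept = [
    n for n in names
    if should_keep(n, namepass=["cpu", "disk*"], namedrop=["diskio"])
]
print(kept)  # ['cpu', 'disk']
```

Filtering of this kind is what keeps unnecessary data out of the database without writing custom scraping code.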
23. One Telegraf, Multiple Plugins
[Diagram: data sources (CPU, Mem, Disk, Docker, Kubernetes, /metrics, Kafka, MySQL, CloudWatch) pass through Telegraf to data systems (InfluxDB, File, Kafka, CloudWatch)]
Plugin stages: Input, Process, Aggregate, Output
• Process: transform, decorate, filter
• Aggregate: mean, min, max, count, variance, stddev
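The Input, Process, Aggregate, and Output stages named on this slide can be sketched in a few lines of Python. This is a conceptual illustration only, not how Telegraf is implemented; the dictionary layout of a metric, the `env` tag, and the function names are assumptions made for the example.

```python
from statistics import mean


def process(metric):
    """Processor stage: decorate each metric with an extra tag (hypothetical)."""
    decorated = dict(metric)
    decorated["tags"] = {**metric["tags"], "env": "demo"}
    return decorated


def aggregate(metrics, field):
    """Aggregator stage: reduce one field over a window of metrics."""
    values = [m["fields"][field] for m in metrics]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


# Input stage: a hypothetical window of CPU readings.
window = [
    {"name": "cpu", "tags": {"host": "a"}, "fields": {"usage": 10.0}},
    {"name": "cpu", "tags": {"host": "a"}, "fields": {"usage": 20.0}},
    {"name": "cpu", "tags": {"host": "a"}, "fields": {"usage": 30.0}},
]

processed = [process(m) for m in window]
summary = aggregate(processed, "usage")

# Output stage: printed here; a real output plugin would write to a data system.
print(summary)
```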
24. Can’t find the plugin you need?
Telegraf is 100% open source with a strong community of contributors.
It’s easy to write your own Telegraf plugin!
1. Follow the contribution guide for Go
2. Write your plugin in any language and run it externally with ExecD
29. What’s in a configuration file?
• A Telegraf config file must be specified for the Telegraf agent to operate properly.
• It contains settings for the agent, global tags, and the enabled outputs (you choose which are enabled by commenting out or removing unnecessary lines).
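As a rough illustration of that layout, the Python sketch below embeds a small, hypothetical Telegraf configuration (TOML) and lists which plugin sections are enabled, meaning not commented out. All values in the sample (tag, URL, org, bucket, token variable) are placeholders, and `enabled_plugins` is a helper invented for this example rather than a Telegraf tool.

```python
import re

# Hypothetical minimal Telegraf configuration. Values are placeholders.
SAMPLE_CONFIG = """
[global_tags]
  site = "lab"

[agent]
  interval = "10s"
  flush_interval = "10s"

[[outputs.influxdb_v2]]
  urls = ["http://localhost:8086"]
  token = "$INFLUX_TOKEN"
  organization = "my-org"
  bucket = "my-bucket"

# [[outputs.file]]
#   files = ["stdout"]

[[inputs.cpu]]
  percpu = true
  totalcpu = true
"""

# Matches an uncommented plugin header such as [[inputs.cpu]].
PLUGIN_HEADER = re.compile(
    r"^\s*\[\[(inputs|outputs|processors|aggregators)\.([\w.]+)\]\]"
)


def enabled_plugins(config_text):
    """List (kind, name) for every plugin section that is not commented out."""
    found = []
    for line in config_text.splitlines():
        match = PLUGIN_HEADER.match(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


print(enabled_plugins(SAMPLE_CONFIG))
# [('outputs', 'influxdb_v2'), ('inputs', 'cpu')]
```

The commented-out `outputs.file` section is ignored, which mirrors how removing the `#` characters would enable it.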
32. You can extend Telegraf by:
• Working with the open-source community or submitting upgrades or enhancements for existing plugins
• Using ExecD to write a plugin in Go or a language of your choice
• Using the Starlark processor, which calls a Starlark function for each matched metric, allowing for custom programmatic metric processing:
○ Math operations
○ String operations
○ Renaming tags
○ Logic operations
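The four kinds of per-metric operations listed above can be mimicked in plain Python. Note the assumptions: this is ordinary Python, not Starlark executed by Telegraf; the `Metric` class is a stand-in invented here for the metric object a Starlark script would receive; and the field and tag names (`temp_c`, `host`, `dc`) are hypothetical. Only the general shape, a function called once per metric that returns the metric or drops it, is meant to carry over.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """Stand-in for the metric object passed to the processing function."""
    name: str
    tags: dict = field(default_factory=dict)
    fields: dict = field(default_factory=dict)


def apply(metric):
    """Per-metric hook showing math, string, rename, and logic operations."""
    # Math: derive a Fahrenheit field from a Celsius one.
    if "temp_c" in metric.fields:
        metric.fields["temp_f"] = metric.fields["temp_c"] * 9 / 5 + 32
    # String: normalize the host tag to lower case.
    if "host" in metric.tags:
        metric.tags["host"] = metric.tags["host"].lower()
    # Rename a tag: "dc" becomes "datacenter".
    if "dc" in metric.tags:
        metric.tags["datacenter"] = metric.tags.pop("dc")
    # Logic: drop the metric when the reading is physically implausible.
    if metric.fields.get("temp_c", 0) < -273.15:
        return None
    return metric


result = apply(Metric("weather", {"host": "Sensor-01", "dc": "eu"}, {"temp_c": 20.0}))
dropped = apply(Metric("weather", {}, {"temp_c": -500.0}))
print(result)
```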
33. External plugins via ExecD
• Plugin runs in its own process
• Requires line protocol
• Avoids the need for review by the Telegraf team
• Supports the same API as an internal plugin
• Can be used for non-Go plugins
• Can be used for licensed software plugins
• Can be used for any type of plugin (input, output, processor, aggregator)
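A minimal sketch of an external input plugin written in Python might look like the following. It assumes Telegraf runs the script as a long-lived process and signals each collection by writing a newline to the script's standard input, and that the script answers with one line of line protocol on standard output. The measurement, tag, and field names are hypothetical, and the escaping covers only the common cases (commas, equals signs, spaces, and quoted strings).

```python
import io
import time


def escape(text):
    """Escape commas, equals signs, and spaces in tag keys, tag values, and field keys."""
    return str(text).replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def format_field(value):
    """Render a field value: booleans, integers (with i suffix), floats, or quoted strings."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields, ts_ns):
    """Encode one point as a single line of line protocol."""
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    tag_part = "".join(f",{escape(k)}={escape(v)}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{escape(k)}={format_field(v)}" for k, v in fields.items())
    return f"{name}{tag_part} {field_part} {ts_ns}"


def run(stdin, stdout, clock=time.time_ns):
    """Emit one metric per line received on stdin (each line is a collect signal)."""
    for _ in stdin:
        line = to_line("example", {"source": "execd demo"}, {"value": 42, "ok": True}, clock())
        stdout.write(line + "\n")
        stdout.flush()


# Demonstration with in-memory streams: two signals produce two lines.
out = io.StringIO()
run(io.StringIO("\n\n"), out, clock=lambda: 1)
print(out.getvalue(), end="")
```

A real plugin would call `run(sys.stdin, sys.stdout)` after `import sys`, and the Telegraf configuration would point an ExecD input at the script with the line protocol data format selected.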
36. Customer Quotes
“Telegraf is like a Swiss Army knife for connecting various MQTT sources and OPC UA sources.”
—Fr. Ant. Niedermayr
“Our next-generation pipeline takes advantage of Kafka and the Telegraf streaming service to create a more robust data topology. Essentially this allows us to explicitly implement the four R’s: routability, retention, resilience, and redundancy.”
—Wayfair
37. Get involved with Telegraf
Telegraf GitHub: github.com/influxdata/telegraf
Community Slack: influxdata.com/slack
• #telegraf
• #telegraf-dev
Community Website: community.influxdata.com
41. Keep Learning with InfluxDB University
Gain skills and earn shareable badges from InfluxDB University:
• InfluxDB
• Telegraf
• Flux
• Kapacitor
• and more