Barbara Nelson [InfluxData] | Best Practices for Data Ingestion into InfluxDB | InfluxDays Virtual Experience London 2020

There are many ways to get your time series data into InfluxDB: Telegraf plugins, a client API, or Flux. This talk will cover the pros and cons of each approach. We will then dive into Telegraf, our most flexible approach to loading data into InfluxDB, which supports over 200 plugins today. We will show you some great examples of how our customers are using Telegraf to retrieve, process, and output time series data. If you are relatively new to Telegraf, you will come away with a better understanding of how to leverage Telegraf in your environment to seamlessly upload your data to InfluxDB. If you’ve been using Telegraf for a while, you may discover some new Telegraf capabilities that you weren’t aware of. You may even get inspired to contribute to Telegraf to make it even better.

  1. Barbara Nelson | Data Ingestion into InfluxDB
  2. Why is data ingestion so important?
  3. Users get the most value when data can be processed
      Data in its raw form has limited value. Analyzing data transforms it into knowledge. The less time a user spends getting data into InfluxDB, the more time they can spend on acquiring knowledge from the data. Cloud 2 user survey: NPS doubled for users who had completed the data ingestion step and were able to start deriving value from their data.
  4. Data Ingestion Approaches
  5. Agent-based Push (Telegraf)
      [Diagram: Data Source → Telegraf (input plug-in, output plug-in) → InfluxDB]
  6. Telegraf
      • 200+ Telegraf plugins: Input, Output, Processor, Aggregator
      • Regular cadence of releases (quarterly feature release, monthly bug fix release)
      • Highly configurable, no coding required
      • Large community of contributors
  7. Client API
      [Diagram: Data Source → Client API → InfluxDB]
  8. Client API
      • 9 libraries: Python, C#, Java, Go, JavaScript/Node.js, Ruby, PHP, Scala, Kotlin
      • Handles batching, chunking, setting headers, etc.
      • Best approach when building custom applications
  9. Agentless Pull (Scrapers)
      [Diagram: InfluxDB pulls from Data Source via Flux .from()]
  10. Agentless Pull (Scrapers)
      • Prometheus scraper (OSS only)
      • Flux *.from()
      • Doesn’t require an agent
  11. Native
      [Diagram: Data Source → line protocol → InfluxDB]
  12. Native/Ecosystem
      • Source system speaks line protocol
      • Examples: JMeter, NiFi, Vector, Fluentd
      • Influx CLI CSV import
      • Quick and easy integration
  13. Telegraf
  14. Telegraf
      • Very popular open source project for collecting metrics from a wide variety of data sources and writing them to a wide variety of data sinks
      • Databases: Connect to data sources like MongoDB, MySQL, Redis, and others to collect and send metrics.
      • Systems: Collect metrics from cloud platforms, containers, and orchestrators.
      • IoT sensors: Collect critical stateful data (pressure levels, temperature levels, etc.) from IoT sensors and devices.
      • Source at https://github.com/influxdata/telegraf
  15. Telegraf
      • Plugin-based architecture
          • Input – e.g. system, docker, kafka_consumer, Prometheus, Kubernetes, snmp, influxdb_listener (> 170 input plugins)
          • Processor – e.g. regex, dedup
          • Aggregator – e.g. minmax
          • Output – e.g. influxdb, file, kafka_producer, http (> 30 output plugins)
      • Telegraf will buffer data (up to a configurable memory limit)
      • Telegraf can batch data
  16. Adding your own plugin to Telegraf
      Telegraf is a single Go executable, so all plugins need to be closely reviewed to make sure they co-exist well within the Telegraf agent.
      1. Sign the CLA
      2. Create an issue to describe your plugin
      3. Submit a PR for your new plugin (following the guidelines)
      4. Respond to review feedback, update your PR
      5. Plugin will be added to a future Telegraf release (once review is complete)
  17. New in Telegraf 1.15 – more lightweight extensions to the Telegraf agent
  18. External plugin – via ExecD plugin
      • Plugin runs in its own process
      • Avoids the need for review by the Telegraf team
      • Supports the same API as an internal plugin
      • Can be used for non-Go plugins
      • Can be used for licensed software plugins
      • Can be used for any type of plugin (input, output, processor, aggregator)
  19. External plugin architecture
  20. Sample ExecD Plugin configuration

          [[inputs.execd]]
            command = ["telegraf-smartctl", "-d", "/dev/sda"]
            signal = "none"
            restart_delay = "10s"
            data_format = "influx"

  21. Starlark – lightweight processor plugin
      • Starlark (formerly known as Skylark) is intended for use as a configuration language
      • Starlark is a dialect of Python
      • Write your script and the Starlark plugin will execute it
      • Execution cannot access the file system, network, or system resources
  22. Sample Starlark Plugin configuration

          [[processors.starlark]]
            source = '''
          def apply(metric):
              for k, v in metric.fields.items():
                  if type(v) == "float":
                      metric.fields[k] = v * 10
              return metric
          '''

  23. Flux .from()
  24. Flux .from(): getting data from multiple sources
      • influxdb.from()
      • csv.from()
      • sql.from()
      • socket.from()
      • prometheus.scrape()
      • http.get()
      • bigtable.from() (experimental)
  25. Flux or Telegraf? It depends.
      [Diagram labels: InfluxDB; Flux .from() with data sources; Telegraf agents with data at the edge]
  26. In summary
      Use any combination of:
      • Telegraf plugins
      • Client APIs
      • Native generation of line protocol
      • Flux .from()
  27. Thank you.
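
Slides 11 and 12 describe source systems that "speak line protocol" natively. As a rough illustration of what that involves, here is a minimal Python sketch that formats one point as a line protocol string. The helper name `to_line_protocol` and the sample measurement, tag, and field names are hypothetical and are not part of any InfluxDB client library. The sketch covers only the common escaping cases and does not handle special float values such as NaN; the client libraries listed on slide 8 take care of these details for you.

```python
from typing import Mapping, Optional, Union

FieldValue = Union[bool, int, float, str]


def _escape(text: str, chars: str) -> str:
    """Backslash-escape every character of `chars` that occurs in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value: FieldValue) -> str:
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integer fields carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # String fields are double-quoted; backslash and double quote are escaped.
    return '"' + _escape(value, '\\"') + '"'


def to_line_protocol(
    measurement: str,
    tags: Mapping[str, str],
    fields: Mapping[str, FieldValue],
    timestamp_ns: Optional[int] = None,
) -> str:
    """Hypothetical helper: build '<measurement>,<tags> <fields> <timestamp>'."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    line = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys give a stable, predictable order
        line += "," + _escape(key, ",= ") + "=" + _escape(tags[key], ",= ")
    line += " " + ",".join(
        _escape(k, ",= ") + "=" + _format_field(v) for k, v in fields.items()
    )
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


print(to_line_protocol("cpu", {"host": "server01"}, {"usage_idle": 98.5},
                       1600000000000000000))
# prints: cpu,host=server01 usage_idle=98.5 1600000000000000000
```

A program that prints lines like this to standard output could, in principle, be wired in as an ExecD external plugin with `data_format = "influx"`, as in the sample configuration on slide 20.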

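The Starlark sample on slide 22 multiplies every float field by 10. Because Starlark is a dialect of Python, the same logic can be tried out in plain Python before putting it into a Telegraf configuration. The sketch below assumes a simplified `Metric` class as a hypothetical stand-in for the metric object that Telegraf passes to `apply()`; the real object is provided by the Starlark processor and has more to it than this.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Metric:
    """Hypothetical stand-in for the metric Telegraf hands to apply()."""
    name: str
    tags: Dict[str, str] = field(default_factory=dict)
    fields: Dict[str, Union[bool, int, float, str]] = field(default_factory=dict)


def apply(metric: Metric) -> Metric:
    # list() takes a snapshot of the items so the dict can be updated in the loop.
    for k, v in list(metric.fields.items()):
        # Starlark's `type(v) == "float"` check becomes isinstance() in plain Python.
        if isinstance(v, float):
            metric.fields[k] = v * 10
    return metric


m = apply(Metric("sensor", {"site": "a"}, {"temp": 2.5, "count": 3, "state": "ok"}))
print(m.fields)
# prints: {'temp': 25.0, 'count': 3, 'state': 'ok'}
```

Only the float field changes; integer, boolean, and string fields pass through untouched, which matches the type check in the slide's script.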