
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Using Telegraf and Flux | InfluxDays Virtual Experience NA 2020


  1. 1. Samantha Wang Product Manager, InfluxData Best Practices on How to Transform Your Data Using Telegraf and Flux
  2. 2. © 2020 InfluxData. All rights reserved. Why do we transform data?
          ● Better organization
          ● Facilitates compatibility between applications, systems, and types of data
          Problems:
          ● Resource-intensive
          ● Lack of expertise and carelessness
  3. 3. Goals:
          1. How to transform your data to fit the optimal state of your monitoring environment
          2. Tools for transformation BEFORE data ingestion
             a. The power of Telegraf processors and aggregators
          3. Tools for transformation AFTER data ingestion
             a. The power of Flux transformation queries
  4. 4. Basic data Concepts of InfluxDB
  5. 5. [Diagram] What you’re collecting → Data ingestion → Database!! → Visualizations / Analytics, with two "transform!" steps marked along the pipeline
  7. 7. InfluxDB Line Protocol

          weather,location=us-midwest temperature=82,humidity=71 1465839830100400200

          measurement: weather
          tag(s): location=us-midwest
          field(s): temperature=82,humidity=71
          timestamp: 1465839830100400200
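To make the anatomy of the example point concrete, here is a minimal Python sketch that splits it into measurement, tags, fields, and timestamp. The function name `parse_line` is hypothetical, and the sketch assumes no escaped spaces or commas and only numeric field values; it is not how InfluxDB itself parses line protocol.

```python
def parse_line(line):
    """Split a simple line-protocol point into its four parts.

    Assumption: no escaped spaces/commas and only numeric field values,
    which holds for the weather example but not for line protocol in general.
    """
    head, field_part, timestamp = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {
        key: float(value)
        for key, value in (pair.split("=", 1) for pair in field_part.split(","))
    }
    return {
        "measurement": measurement,
        "tags": tags,
        "fields": fields,
        "timestamp": int(timestamp),
    }


point = parse_line(
    "weather,location=us-midwest temperature=82,humidity=71 1465839830100400200"
)
```

The measurement and tag set share the first space-separated token, which is why the sketch splits on spaces first and on commas second.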
  8. 8. BEFORE load: Transforming data with Telegraf
  9. 9. Telegraf
          Input: CPU, Mem, Disk, Docker, Kubernetes, /metrics, Kafka, MySQL
          Process: transform, decorate, filter
          Aggregate: mean, min, max, count, variance, stddev
          Output: InfluxDB, File, Kafka, CloudWatch
  10. 10. Why manipulate data with Telegraf?
          ● Convert tags ↔ fields or data types
          ● Rename field names or values
          ● Aggregate data
          ● Clean up or filter data
          ● Perform data transformation as close to the device as possible
          ● Improve performance
  11. 11. Convert Data: processors.converter

          [[processors.converter]]
            [processors.converter.tags]
              measurement = []
              string = []
              integer = []
              unsigned = []
              boolean = []
              float = []

            [processors.converter.fields]
              measurement = []
              tag = []
              string = []
  12. 12. Convert tags ↔ fields

          Convert port tag to a string field:

          [[processors.converter]]
            [processors.converter.tags]
              string = ["port"]

          - apache,port=80,server=debian-stretch-apache BytesPerReq=0
          + apache,server=debian-stretch-apache port="80",BytesPerReq=0

          Convert port field to a tag:

          [[processors.converter]]
            [processors.converter.fields]
              tag = ["port"]

          - apache,server=debian-stretch-apache port="80",BytesPerReq=0
          + apache,port=80,server=debian-stretch-apache BytesPerReq=0
  13. 13. Convert data types

          Convert all scboard_* fields to an integer:

          [[processors.converter]]
            [processors.converter.fields]
              integer = ["scboard_*"]

          - apache scboard_open=100,scboard_reading=0,scboard_sending=1,scboard_starting=0,scboard_waiting=49
          + apache scboard_open=100i,scboard_reading=0i,scboard_sending=1i,scboard_starting=0i,scboard_waiting=49i
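The glob-based type conversion on slide 13 can be pictured with a short Python sketch. This only illustrates the idea of matching field keys against a pattern such as `scboard_*` and casting the matches. The helper name `convert_fields_to_int` and the sample field set are made up for illustration, and this is not Telegraf's implementation.

```python
from fnmatch import fnmatchcase


def convert_fields_to_int(fields, patterns):
    """Return a copy of `fields` with glob-matched values converted to int."""
    converted = {}
    for key, value in fields.items():
        if any(fnmatchcase(key, pattern) for pattern in patterns):
            converted[key] = int(value)
        else:
            # Keys that match no pattern pass through unchanged.
            converted[key] = value
    return converted


# Illustrative input: two scoreboard fields plus one unrelated field.
before = {"scboard_open": 100.0, "scboard_waiting": 49.0, "BytesPerReq": 0.0}
after = convert_fields_to_int(before, ["scboard_*"])
```

Only the `scboard_*` keys change type; `BytesPerReq` stays a float because it matches no pattern.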
  14. 14. Mapping: processors.enum

          Enum Processor:
          ● Value mappings for metric tags or fields
          ● Common use case is to map status codes

          [[processors.enum]]
            [[processors.enum.mapping]]
              tag = "StorageType"
              dest = "StorageName"
              [processors.enum.mapping.value_mappings]
                ".1.3.6.1.2.1.25.2.1.1" = "Other"
                ".1.3.6.1.2.1.25.2.1.2" = "RAM"
                ".1.3.6.1.2.1.25.2.1.3" = "Virtual Memory"
                ".1.3.6.1.2.1.25.2.1.4" = "Fixed Disk"
                ".1.3.6.1.2.1.25.2.1.5" = "Removable Disk"
                ".1.3.6.1.2.1.25.2.1.6" = "Floppy Disk"
                ".1.3.6.1.2.1.25.2.1.7" = "Compact Disc"
                ".1.3.6.1.2.1.25.2.1.8" = "RAM Disk"

          - snmp StorageType=".1.3.6.1.2.1.25.2.1.6" 1502489900000000000
          + snmp StorageName="Floppy Disk",StorageType=".1.3.6.1.2.1.25.2.1.6" 1502489900000000000
  15. 15. Transform tags and fields: processors.regex

          Regex Processor:
          ● Change tags and manipulate field values with regex patterns
          ● Can replace an existing field OR produce new tags and fields while maintaining existing ones
  16. 16. processors.regex example

          [[processors.regex]]
            namepass = ["nginx"]

            [[processors.regex.tags]]
              key = "resp_code"
              pattern = "^(\\d)\\d\\d$"
              replacement = "${1}xx"

            [[processors.regex.fields]]
              key = "request"
              pattern = "^/api(?P<method>/[\\w/]+)\\S*"
              replacement = "${method}"
              result_key = "method"

            [[processors.regex.fields]]
              key = "request"
              pattern = ".*category=(\\w+).*"
              replacement = "${1}"
              result_key = "search_category"

          - nginx,resp_code=200 request="/api/search/?category=plugins&q=regex&sort=asc",referrer="-",ident="-",http_version=1.1,client_ip="127.0.0.1" 1519652321000000000
          + nginx,resp_code=2xx request="/api/search/?category=plugins&q=regex&sort=asc",method="/search/",search_category="plugins",referrer="-",ident="-",http_version=1.1,client_ip="127.0.0.1" 1519652321000000000
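The three rewrites on slide 16 can be checked outside Telegraf with Python's `re` module. This is a rough sketch: the patterns are written with their backslash escapes (`\d`, `\w`, `\S`), Go's `${1}` replacement syntax becomes `\g<1>` in Python, and the helper `apply_regex` is hypothetical.

```python
import re


def apply_regex(value, pattern, replacement):
    """Return the rewritten value if the pattern matches, else None."""
    if re.search(pattern, value) is None:
        return None
    return re.sub(pattern, replacement, value)


request = "/api/search/?category=plugins&q=regex&sort=asc"

# Tag rewrite: collapse a three-digit status code into its class.
resp_code = apply_regex("200", r"^(\d)\d\d$", r"\g<1>xx")

# Field rewrites: each result would be stored under its own result_key.
method = apply_regex(request, r"^/api(?P<method>/[\w/]+)\S*", r"\g<method>")
search_category = apply_regex(request, r".*category=(\w+).*", r"\g<1>")
```

Because the two field rules write to a separate `result_key`, the original `request` field is left untouched, which matches the "+" line on the slide.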
  17. 17. Math & Logic Operations: processors.starlark
          ● Dialect of Python
          ● The Starlark specification (google/starlark-go) has details about the syntax and available functions
          ● Use cases:
            ○ Math operations
            ○ Logic operations
            ○ Some string operations
            ○ Add or remove tags/fields/metrics

          [[processors.starlark]]
            ## Source of the Starlark script.
            source = '''
          def apply(metric):
              return metric
          '''

            ## File containing a Starlark script.
            # script = "/usr/local/bin/myscript.star"
  18. 18. Basic Math - IoT Example w/ Modbus Plugin: Calculate power from voltage & current

          P = I * V

          [[processors.starlark]]
            source = '''
          def apply(metric):
              I = metric.fields['current']
              V = metric.fields['voltage']
              metric.fields['power'] = I * V
              return metric
          '''

          - modbus.InputRegisters,host=localhost current=550,frequency=60,voltage=12.93223 1554079521000000000
          + modbus.InputRegisters,host=localhost current=550,frequency=60,voltage=12.93223,power=7112.7265 1554079521000000000
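Because Starlark is a dialect of Python, the `apply` function from slide 18 can be tried in plain Python before it goes into a Telegraf config. The `Metric` class below is a hypothetical stand-in for the object Telegraf passes in. It only models the `fields` dictionary the script touches.

```python
class Metric:
    """Hypothetical stand-in for the metric object Telegraf passes to apply()."""

    def __init__(self, name, tags, fields):
        self.name = name
        self.tags = tags
        self.fields = fields


def apply(metric):
    # Same body as the Starlark script on the slide: P = I * V
    I = metric.fields['current']
    V = metric.fields['voltage']
    metric.fields['power'] = I * V
    return metric


m = apply(Metric(
    "modbus.InputRegisters",
    {"host": "localhost"},
    {"current": 550.0, "frequency": 60.0, "voltage": 12.93223},
))
```

After the call, `m.fields` carries a new `power` entry of roughly 7112.7265 alongside the three original fields.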
  19. 19. Other useful transformations
          Clone - creates a copy of each metric to preserve the original metric and allow modifications in the copied metric
          Date - adds the metric timestamp as a human-readable tag
          Dedup - filters metrics whose field values are exact repetitions of the previous values
          Ifname - looks up network interface names using SNMP
          Parser - parses defined fields containing the specified data format and creates new metrics based on the contents of the field
          Pivot - rotates single-valued metrics into a multi-field metric
          Rename - renames InfluxDB measurements, fields, and tags
          Strings - maps certain Go string functions onto InfluxDB measurement, tag, and field values
          TopK - filter designed to get the top series over a period of time
  20. 20. And aggregators!
          BasicStats - count, max, min, mean, s2 (variance), and stdev for a set of values, emitting the aggregate every ‘period’ seconds
          Final - emits the last metric of a contiguous series
          Merge - merges metrics together and generates line protocol with multiple fields per line
          MinMax - aggregates min and max values of each field it sees
          ValueCounter - counts the occurrence of values in fields and emits the counter once every ‘period’ seconds
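As a rough picture of what a BasicStats-style aggregate contains, the sketch below summarizes the values seen for one field during one period. It uses Python's `statistics` module and assumes `s2` is the sample variance (dividing by n - 1); that choice and the function name `basic_stats` are assumptions, not a description of Telegraf's internals.

```python
import statistics


def basic_stats(values):
    """Summarize the values collected for one field during one period.

    Needs at least two values, because sample variance is undefined for one.
    """
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "s2": statistics.variance(values),  # sample variance (n - 1), assumed
        "stdev": statistics.stdev(values),
    }


summary = basic_stats([1.0, 2.0, 3.0, 4.0])
```

Emitting one summary per period instead of every raw point is what lets an aggregator reduce the volume written to the database.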
  21. 21. AFTER load: Transforming data with Flux
  22. 22. Why Flux? Flux is InfluxData’s functional data scripting language, designed for querying, analyzing, and acting on data. It offers many capabilities that are not available in InfluxQL. https://docs.influxdata.com/influxdb/v2.0/query-data/get-started/
  23. 23. map() functions

          map() operates on one row at a time...

              map(fn: (r) => ({ _value: r._value * r._value }))

          ...and the output row contains only the columns you map explicitly, unless you use "with" to extend the existing record:

              map(fn: (r) => ({ r with newColumn: r._value * 2 }))

          Things to do with map():
          - Basic mathematical operations
          - Conditional logic
          - Adding new columns and tables
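The difference between the two map() forms on slide 23 is easy to see with rows modeled as Python dictionaries. This is only an analogy: `map_rows` is a hypothetical helper, the sample row is made up, and the sketch ignores how Flux treats group-key columns.

```python
def map_rows(rows, fn):
    """Apply fn to each row independently, like Flux's map()."""
    return [fn(r) for r in rows]


rows = [{"_time": "t1", "host": "a", "_value": 3}]

# Without `with`: only the columns named in the new record survive.
squared = map_rows(rows, lambda r: {"_value": r["_value"] * r["_value"]})

# With `with`: the original record is copied and extended.
extended = map_rows(rows, lambda r: {**r, "newColumn": r["_value"] * 2})
```

In the first result the `_time` and `host` columns are gone, while the second keeps every original column and adds `newColumn`.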
  24. 24. Mathematical Operations

          import "math"

          Flux Math package → result
          Basic Math: math.pi * 6.12 ^ 2.0 → 117.66646788461355
          Basic Math: r._value + r._value → 2 * r._value
          Round: math.floor(x: 1.22) → 1
          Trig: math.cos(x: 3.14) → -0.9999987317275396

              |> map(fn: (r) => ({r with pizza: math.pi * r._value ^ 2 }))
  25. 25. String manipulations and data shaping

          import "strings"

          Flux String package → result
          Concatenation: strings.joinStr(arr: ["rain on me", "rain", "rain"], v: ", ") → "rain on me, rain, rain"
          Splitting: strings.split(v: "rain on me", t: " ") → ["rain", "on", "me"]
          Substring: strings.substring(v: "Lady Gaga", start: 5, end: 8) → "Gaga"
          Case conversion: strings.toUpper(v: "chromatica") → "CHROMATICA"
          Searching: strings.containsStr(v: "it’s coming down on me", substr: "down") → true
          Searching: strings.countStr(v: "rain gain pain", substr: "ain") → 3

              |> map(fn: (r) => ({r with content: strings.replace(v: r.content, t: "Mariah Carey", u: "Lady Gaga", i: 3)}))
  26. 26. Conditional logic: if/then/else
          ● Conditionally transform column values with map()

              if r._value >= 29 then "playoffs!!!"
              else if r._value < 29 and r._value >= 26 then "so close"
              else "better luck next year"
  27. 27. Conditional logic: if/then/else

              |> map(fn: (r) => ({
                  r with
                  postseason:
                      if r._value >= 29 then "playoffs!!!"
                      else if r._value < 29 and r._value >= 26 then "so close"
                      else "better luck next year"
                  })
              )

          League  Division  Name       _field  _value  postseason
          AL      West      Angels     Wins    26      so close
          AL      West      Astros     Wins    29      playoffs!!!
          AL      West      Athletics  Wins    36      playoffs!!!
          AL      East      Blue Jays  Wins    32      playoffs!!!
          AL      Central   Indians    Wins    35      playoffs!!!
          AL      West      Mariners   Wins    27      so close
          AL      East      Orioles    Wins    25      better luck next year
          AL      West      Rangers    Wins    22      better luck next year
          AL      East      Rays       Wins    40      playoffs!!!
          AL      East      Red Sox    Wins    24      better luck next year
          AL      Central   Royals     Wins    26      so close
          AL      Central   Tigers     Wins    23      better luck next year
          AL      Central   Twins      Wins    36      playoffs!!!
          AL      Central   White Sox  Wins    35      playoffs!!!
          AL      East      Yankees    Wins    33      playoffs!!!
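The threshold logic behind the postseason column can be checked against a few rows of the table with a small Python sketch. The function name `postseason` and the four-team sample are chosen for illustration; the thresholds and labels come from the slide.

```python
def postseason(wins):
    """Label a win total using the thresholds from the slide."""
    if wins >= 29:
        return "playoffs!!!"
    elif wins >= 26:  # wins < 29 is already implied by the first branch
        return "so close"
    else:
        return "better luck next year"


# A few rows from the table, keyed by team name.
wins = {"Angels": 26, "Astros": 29, "Orioles": 25, "Rays": 40}
labels = {team: postseason(w) for team, w in wins.items()}
```

Since the branches are evaluated in order, the explicit `r._value < 29` test in the Flux version is redundant but harmless.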
  28. 28. Telegraf and Flux both sound super great!
  29. 29. When to do something in Telegraf or Flux?
          Telegraf:
          ● Tag/field manipulation for data routing
          ● Cleaning up data
          ● Permanent transformations to improve performance on the query side
          Flux:
          ● Optimizing data for query and analysis
          ● Creating new columns or tables
          ● Flexibility
  30. 30. Thank You!!
          Telegraf Processors: https://docs.influxdata.com/telegraf/latest/plugins/
          Flux docs: https://docs.influxdata.com/flux
