Fluentd 101

Introduction about Fluentd, version 2017
Open Source Summit Japan 2017 #OSSummit

Published in: Software

  1. FLUENTD 101: BOOTSTRAP OF UNIFIED LOGGING
     Open Source Summit Japan 2017, Fluentd Mini Summit / June 1, 2017
     Satoshi Tagomori (@tagomoris), Treasure Data, Inc.
  2. Satoshi "Moris" Tagomori (@tagomoris)
     Fluentd, MessagePack-Ruby, Norikra, ...
     Treasure Data, Inc.
  3. What is Fluentd?
  4. "Fluentd is an open source data collector for unified logging layer." https://www.fluentd.org/
  5. "Unified Logging Layer"? "Fluentd decouples data sources from backend systems by providing a unified logging layer in between."
  9. AN EXTENSIBLE & RELIABLE DATA COLLECTION TOOL
     • Simple Core w/ Plugin System + Various Plugins
     • Buffering, Retries, Failover
  10. Configuration (v0.14)

      # logs from a file
      <source>
        @type tail
        path /var/log/httpd.log
        pos_file /tmp/pos_file
        tag web.access
        <parse>
          @type apache2
        </parse>
      </source>

      # logs from client libraries
      <source>
        @type forward
        port 24224
      </source>

      # store logs to ES and HDFS
      <match web.*>
        @type copy
        <store>
          @type elasticsearch
          logstash_format true
        </store>
        <store>
          @type webhdfs
          host namenode.local
          port 50070
          path /path/on/hdfs
          <format>
            @type json
          </format>
        </store>
      </match>
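The `<match web.*>` line in the configuration above routes events by tag. The sketch below illustrates that idea in Python, assuming only the two wildcard forms documented for match patterns: `*` for exactly one tag part and `**` for zero or more tag parts. The function names are hypothetical, and other pattern forms (such as `{a,b}` alternatives) are deliberately left out; this is not Fluentd's own matcher.

```python
from typing import List


def tag_matches(pattern: str, tag: str) -> bool:
    """Return True if a dot-separated tag matches a match pattern.

    Simplified illustration: supports literal parts, '*' (one part)
    and '**' (zero or more parts) only.
    """
    return _match(pattern.split("."), tag.split("."))


def _match(pat: List[str], parts: List[str]) -> bool:
    if not pat:
        # Pattern exhausted: it matches only if the tag is exhausted too.
        return not parts
    head, rest = pat[0], pat[1:]
    if head == "**":
        # '**' may consume zero or more tag parts.
        return any(_match(rest, parts[i:]) for i in range(len(parts) + 1))
    if not parts:
        return False
    if head == "*" or head == parts[0]:
        # '*' consumes exactly one part; a literal must be equal.
        return _match(rest, parts[1:])
    return False
```

With this sketch, `tag_matches("web.*", "web.access")` is true, while a tag with an extra part such as `web.access.error` needs `web.**` to match.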
  11. Implementation / Performance
      • Fluentd is written in Ruby
      • C extension libraries for performance-critical parts
        • cool.io: Asynchronous I/O
        • msgpack: Serialization/Deserialization for MessagePack
      • Plugins in Ruby: easy to join the community
      • Scaling across CPU cores
        • multiprocess plugin (up to v0.12)
        • multi-process workers (v0.14 and later)
  12. Package / Deployment
      • Fluentd is released on RubyGems.org
      • rpm/deb packages
        • td-agent by Treasure Data
        • Fluentd + some widely used plugins
        • td-agent2: Fluentd v0.12
        • td-agent3 (beta now): Fluentd v0.14 (or v1)
        • + msi package for Windows
      • Docker images
        • https://hub.docker.com/r/fluent/fluentd/
  13. For Users in the Enterprise Sector
      More security features and some others
      https://fluentd.treasuredata.com/
  14. Plugin System
  15. 3rd party input plugins: dstat, df, AMQP, munin, jvmwatcher, SQL
  16. 3rd party output plugins: Graphite
  17. [Diagram: plugin types in Fluentd v0.12 (Input, Parser, Filter, Formatter, Buffer, Output) and in Fluentd v0.14 (the same types plus Storage and Helper), grouped as "input-ish" and "output-ish"]
  18. Plugins
      • Built-in plugins
        • tail, forward, file, exec, exec_filter, copy, ...
      • 3rd party plugins (755 plugins as of May 19)
        • fluent-plugin-xxx via rubygems.org
        • webhdfs, kafka, elasticsearch, redshift, bigquery, ...
      • Plugin script
        • .rb script files in /etc/fluent/plugin
  19. Ecosystem
  20. Events
  21. Events: Structured Logs
      tag: ec_service.shopping_cart
      timestamp: 2017-03-30 16:35:37 +0100
      record:
        {
          "container_id": "bfdd5b9....",
          "container_name": "/infallible_mayer",
          "source": "stdout",
          "event": "put an item to cart",
          "item_id": 101,
          "items": 10,
          "client": "web"
        }
  22. Event: tag, timestamp and record
      • Tag
        • A dot-separated string
        • Shows what the event is / where the event comes from
      • Timestamp
        • An integer Unix time (up to v0.12)
        • A structured timestamp with nanosecond precision (v0.14 and later)
      • Record
        • Key-value pairs
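The three parts of an event can be modeled as a small data structure. The following is a minimal sketch, assuming a seconds-plus-nanoseconds pair for the timestamp; the class names `EventTime` and `Event` are chosen here for illustration and are not Fluentd's Ruby classes. The sample values come from the structured-log example in slide 21.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Any, Dict

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass(frozen=True)
class EventTime:
    """Timestamp with nanosecond resolution (illustrative model)."""
    sec: int
    nsec: int = 0

    @classmethod
    def from_datetime(cls, dt: datetime) -> "EventTime":
        # Integer arithmetic avoids float rounding; dt must be timezone-aware.
        delta = dt - EPOCH
        return cls(delta.days * 86400 + delta.seconds, delta.microseconds * 1000)


@dataclass(frozen=True)
class Event:
    tag: str                 # dot-separated string
    time: EventTime          # when the event happened
    record: Dict[str, Any]   # key-value pairs


dt = datetime(2017, 3, 30, 16, 35, 37, tzinfo=timezone(timedelta(hours=1)))
event = Event(
    tag="ec_service.shopping_cart",
    time=EventTime.from_datetime(dt),
    record={"event": "put an item to cart", "item_id": 101, "items": 10},
)
```

An integer-only timestamp, as in v0.12, corresponds to using `event.time.sec` alone.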
  23. [Diagram: event flow. Data Source -> input plugin (read / receive raw data; parser plugin: parse data into key-values, parse timestamp from record, add tags) -> events -> router -> events -> output plugin with buffering (format plugin -> formatted data -> buffer plugin) -> write / send -> Data Destination]
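The flow in slide 23 can be sketched end to end in a few lines. This is an illustration under stated assumptions, not Fluentd code: the input is taken to be JSON lines with a `time` key, and `parse_line`, `format_event` and `BufferedOutput` are hypothetical names standing in for the parser, formatter and buffered output stages.

```python
import json
from typing import Any, Dict, List, Tuple

Event = Tuple[str, int, Dict[str, Any]]  # (tag, timestamp, record)


def parse_line(raw: str, tag: str, time_key: str = "time") -> Event:
    """Parser stage: raw text -> key-values, timestamp taken from the record, tag added."""
    record = json.loads(raw)
    timestamp = int(record.pop(time_key))
    return (tag, timestamp, record)


def format_event(event: Event) -> str:
    """Formatter stage: event -> one line of formatted data."""
    tag, timestamp, record = event
    return json.dumps({"tag": tag, "time": timestamp, **record}, sort_keys=True)


class BufferedOutput:
    """Output stage: collects formatted data into a chunk, then writes the chunk."""

    def __init__(self, chunk_limit: int) -> None:
        self.chunk_limit = chunk_limit
        self.chunk: List[str] = []
        self.written: List[List[str]] = []  # stands in for the data destination

    def emit(self, event: Event) -> None:
        self.chunk.append(format_event(event))
        if len(self.chunk) >= self.chunk_limit:
            self.flush()

    def flush(self) -> None:
        if self.chunk:
            self.written.append(self.chunk)
            self.chunk = []
```

Feeding three parsed lines into `BufferedOutput(chunk_limit=2)` writes one chunk of two lines and leaves the third in the buffer until the next flush.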
  24. Buffers and Retries
  25. Buffer & Retry for Micro-Try&Error
      [Diagram: a stream of events is turned into batches; on Error, each batch is retried]
  26. Controlled Recovery from Long Outage
      [Diagram: Buffer (on-disk or in-memory) holding queued chunks during an Error; an "Overloaded!!" destination on plain recovery, versus recovery + flow control]
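One common way to avoid overloading a destination that has just come back is to space retries with exponential backoff up to a cap. The sketch below shows that general technique; the parameter names and values are illustrative and are not claimed to be Fluentd's option names or defaults.

```python
from typing import List


def retry_waits(initial_wait: float, backoff_base: float,
                max_interval: float, attempts: int) -> List[float]:
    """Seconds to wait before each retry: exponential growth, capped."""
    return [
        min(initial_wait * backoff_base ** n, max_interval)
        for n in range(attempts)
    ]


# Example: start at 1 s, double each time, never wait longer than 60 s.
waits = retry_waits(initial_wait=1.0, backoff_base=2.0, max_interval=60.0, attempts=8)
```

While the waits grow, chunks stay queued in the buffer, so a long outage costs buffer space rather than data.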
  27. Last Resort: Secondary Output
      [Diagram: queued chunks that still hit an Error are handed to a secondary output]
  28. Configuration: Secondary Output (v0.14)

      # store logs to ES, or file if it goes down
      <match web.*>
        @type elasticsearch
        logstash_format true
        <secondary>
          @type secondary_file
          path /data/backup/web
        </secondary>
      </match>
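The idea behind `<secondary>` can be sketched as "try the primary a bounded number of times, then fall back". The code below is a simplified illustration: `flush_chunk`, `retry_limit` and the callables are hypothetical, and the exact rule Fluentd uses to decide when to switch to the secondary is not reproduced here.

```python
from typing import Callable, List


def flush_chunk(chunk: List[str],
                primary: Callable[[List[str]], None],
                secondary: Callable[[List[str]], None],
                retry_limit: int = 3) -> str:
    """Write a chunk to the primary output; fall back to the secondary.

    Returns the name of the output that accepted the chunk.
    """
    for _ in range(retry_limit):
        try:
            primary(chunk)
            return "primary"
        except OSError:
            continue  # a real system would wait before the next attempt
    secondary(chunk)
    return "secondary"


# Example: the primary destination is down, so the chunk ends up in the backup.
backup: List[List[str]] = []


def broken_primary(chunk: List[str]) -> None:
    raise OSError("destination unreachable")


result = flush_chunk(["line 1", "line 2"], broken_primary, backup.append)
```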
  29. Forwarding Data via Network
  30. Forward Plugin: Forwarding Data via Network
      • Built-in plugin, TCP port 24224 (by default)
        • with heartbeat protocol (default UDP in v0.12, TCP/TLS in v0.14)
      • Transfers data via TCP
        • From Fluentd to Fluentd
        • From logger libraries to Fluentd
        • From Fluent-bit to Fluentd
        • From Docker logging driver to Fluentd
      • Standard protocol
        • https://github.com/fluent/fluentd/wiki/Forward-Protocol-Specification-v1
        • (Spec v0 in Fluentd v0.12 or earlier -> Spec v1 in Fluentd v0.14)
  31. Forward: Features
      • Load Balancing / High Availability
        • Servers with weights
        • Standby servers
        • DNS round robin
      • Controlling Data Transfer Semantics
        • at-most-once / at-least-once transfer
      • Spec v1 Features
        • TLS support
        • Simple authentication/authorization
        • Efficient data transfer: Gzip compression (v0.14)
  32. Forward Plugin: Load Balancing
      Balance Data with Configured Weight
      [Diagram: four destination servers; three with weight: 60 (default) and one with weight: 20]
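To make "balance data with configured weight" concrete, the sketch below distributes sends over the four servers from slide 32 using smooth weighted round-robin. This is a generic balancing algorithm swapped in for illustration; it is not claimed to be how the forward plugin chooses servers, and the server names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def weighted_round_robin(weights: Dict[str, int], n: int) -> List[str]:
    """Pick n servers so that each is chosen in proportion to its weight."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks: List[str] = []
    for _ in range(n):
        # Every server gains its weight; the current leader is picked
        # and pays back the total, which spreads picks evenly over time.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=lambda name: current[name])
        current[chosen] -= total
        picks.append(chosen)
    return picks


# Weights as in the slide: three servers at 60 (default) and one at 20.
servers = {"server1": 60, "server2": 60, "server3": 60, "server4": 20}
counts = Counter(weighted_round_robin(servers, 200))
```

Over 200 sends (the sum of the weights), each server receives a share equal to its weight, so the server with weight 20 gets one third of the traffic of each of the others.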
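  33. Forward Plugin: Handling Server Failure
      Detecting Server Down/Up using Heartbeat
      [Diagram: one destination server in Error; data goes to the remaining servers]
  34. Forward Plugin: Standby Server
      Detecting Server Down/Up using Heartbeat
      [Diagram: a server marked standby: true takes over when an active server is in Error]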
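The failure-handling and standby slides can be summarized as a selection rule: send only to servers the heartbeat considers up, and bring in a standby server only when a regular server is down. The sketch below is a simplified illustration of that rule; the `Server` class and `usable_servers` function are hypothetical, and the details of how Fluentd promotes standby nodes may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    weight: int = 60
    standby: bool = False
    available: bool = True  # as last reported by the heartbeat


def usable_servers(servers: List[Server]) -> List[Server]:
    """Regular servers that are up, plus standbys covering the lost weight."""
    regular = [s for s in servers if not s.standby]
    usable = [s for s in regular if s.available]
    lost_weight = sum(s.weight for s in regular if not s.available)
    for s in servers:
        if lost_weight <= 0:
            break
        if s.standby and s.available:
            usable.append(s)
            lost_weight -= s.weight
    return usable


cluster = [Server("a"), Server("b"), Server("spare", standby=True)]
```

While `a` and `b` are both up, `spare` receives nothing; once the heartbeat marks `a` as down, `spare` joins `b` in the rotation until `a` returns.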
  35. Forward Plugin: Without ACK (at-most-once)
      [Diagram: forward output (buffer) -> forward input -> any output (buffer), shown once without any trouble and once with trouble in the destination's buffers]
      Events will be lost in this case :(
  36. Forward Plugin: With ACK (at-least-once)
      Forward output: require_ack_response: true
      [Diagram: forward output sends a buffer chunk with its chunk id to forward input; the destination returns an ACK with the chunk id]
  37. Forward Plugin: With ACK (at-least-once)
      Forward output: require_ack_response: true
      [Diagram: forward output sends a buffer chunk with its chunk id; the ACK is missing, so it retries the chunk with the same chunk id against another destination]
      Forward output ensures that buffers are transferred to a living destination :D
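The at-least-once behaviour enabled by `require_ack_response` can be simulated in a few lines: the sender keeps a chunk until it sees an ACK carrying the same chunk id, and resends otherwise. This is a toy model, assuming an in-process receiver that can be told to lose ACKs; the class and function names are hypothetical and no network or real protocol framing is involved.

```python
from typing import Dict, List, Optional, Tuple


class Receiver:
    """Stands in for a forward input; can lose the first few ACKs."""

    def __init__(self, acks_to_drop: int = 0) -> None:
        self.stored: List[Tuple[str, str]] = []
        self._acks_to_drop = acks_to_drop

    def receive(self, chunk_id: str, payload: str) -> Optional[str]:
        self.stored.append((chunk_id, payload))
        if self._acks_to_drop > 0:
            self._acks_to_drop -= 1
            return None  # data arrived, but the ACK never reaches the sender
        return chunk_id


def send_at_least_once(chunks: List[Tuple[str, str]], receiver: Receiver,
                       max_attempts: int = 3) -> Dict[str, int]:
    """Send every chunk until it is ACKed; return attempts used per chunk id."""
    attempts: Dict[str, int] = {}
    for chunk_id, payload in chunks:
        for attempt in range(1, max_attempts + 1):
            if receiver.receive(chunk_id, payload) == chunk_id:
                attempts[chunk_id] = attempt
                break
        else:
            raise RuntimeError(f"chunk {chunk_id} was never acknowledged")
    return attempts


receiver = Receiver(acks_to_drop=1)
attempts = send_at_least_once([("c1", "first"), ("c2", "second")], receiver)
```

Because the first ACK is lost, chunk `c1` is delivered twice: nothing is lost, but the destination may see duplicates, which is the trade-off of at-least-once delivery.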
  38. Fluentd has a lot of good stuff for logging. Discover more on docs.fluentd.org! Happy Logging! @tagomoris
