Fluentd Overview, Now and Then

#fluentdmeetup

Published in: Software

  1. 1. Fluentd Overview, Now and Then Satoshi Tagomori (@tagomoris) Fluentd meetup in Matsue #fluentdmeetup
  2. 2. Satoshi "Moris" Tagomori (@tagomoris) Fluentd, MessagePack-Ruby, Norikra, ... Treasure Data, Inc.
  3. 3. Fluentd overview
  4. 4. What’s Fluentd? An extensible & reliable data collection tool • Simple core + variety of plugins • Buffering, HA (failover), secondary output, etc. • Like syslogd in streaming manner
  5. 5. Log collection with traditional logrotate + rsync Log Server Application Server A File FileFile Hard to analyze!! Complex text parsers Application Server C File FileFile Application Server B File FileFile High latency!! Must wait for a day
  6. 6. Streaming way with Fluentd Log Server Application Server A File FileFile Application Server C File FileFile Application Server B File FileFile Low latency! Seconds or minutes Easy to analyze!! Parsed and formatted
  7. 7. M x N problem for data integration LOG script to parse data cron job for loading filtering script syslog script Tweet- fetching script aggregation script aggregation script script to parse data rsync server
  8. 8. LOG A solution: centralized log collection service M + N
  9. 9. Fluentd Architecture
  10. 10. Internal Architecture (simplified) Plugin Input Filter Buffer Output Plugin Plugin Plugin 2012-02-04 01:33:51 myapp.buylog{ “user”:”me”, “path”: “/buyItem”, “price”: 150, “referer”: “/landing” } Time Tag Record
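
The time / tag / record triple on this slide can be sketched in Python as follows. This is only an illustration of the data shape, not Fluentd's internal representation; the `Event` and `make_event` names are hypothetical, and the timestamp is assumed to be in UTC because the slide does not state a time zone.

```python
from datetime import datetime, timezone
from typing import Any, Dict, NamedTuple


class Event(NamedTuple):
    tag: str                 # dot-separated routing key, e.g. "myapp.buylog"
    time: int                # seconds since the Unix epoch
    record: Dict[str, Any]   # JSON-like payload


def make_event(tag: str, timestamp: str, record: Dict[str, Any]) -> Event:
    # Assumption: the timestamp string is in UTC.
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    epoch = int(parsed.replace(tzinfo=timezone.utc).timestamp())
    return Event(tag, epoch, record)


event = make_event(
    "myapp.buylog",
    "2012-02-04 01:33:51",
    {"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"},
)
```
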
  11. 11. Architecture: Input Plugins HTTP+JSON (in_http) File tail (in_tail) Syslog (in_syslog) … Receive logs Or pull logs from data sources In non-blocking manner Plugin Input
  12. 12. Filter Architecture: Filter Plugins Transform logs Filter out unnecessary logs Enrich logs Plugin Encrypt personal data Convert IP to countries Parse User-Agent …
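
A filter's three jobs (drop, transform, enrich) might look like this in plain Python. It is a sketch under assumptions: the field names (`path`, `email`, `user_agent`, `is_mobile`) are hypothetical, a one-way SHA-256 hash stands in for the "encrypt personal data" step, and a substring check stands in for real User-Agent parsing.

```python
import hashlib
from typing import Any, Dict, Optional


def filter_record(record: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    # Filter out unnecessary logs: returning None drops the event.
    if record.get("path") == "/health":
        return None

    out = dict(record)  # copy, so the caller's record is not mutated

    # Transform: mask personal data (hashing used here instead of encryption).
    if "email" in out:
        out["email"] = hashlib.sha256(out["email"].encode("utf-8")).hexdigest()

    # Enrich: add a derived field (a crude stand-in for User-Agent parsing).
    out["is_mobile"] = "Mobile" in out.get("user_agent", "")
    return out
```
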
  13. 13. Buffer Architecture: Buffer Plugins Plugin Improve performance Provide reliability Provide thread-safety Memory (buf_memory) File (buf_file)
  14. 14. Buffer Architecture: Buffer Plugins Chunk Plugin Improve performance Provide reliability Provide thread-safety Input Output Chunk Chunk
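
The chunk idea can be sketched roughly as below: input appends to a current chunk, full chunks move to a queue, and output takes chunks off the queue only after a successful write. The class and parameter names are hypothetical and the real buffer plugins are considerably more involved.

```python
from collections import deque
from typing import Callable, Deque


class QueueFullError(Exception):
    """Raised when no more chunks can be queued (flow control)."""


class ChunkBuffer:
    def __init__(self, chunk_limit: int, queue_limit: int) -> None:
        self.chunk_limit = chunk_limit      # max bytes per chunk
        self.queue_limit = queue_limit      # max number of queued chunks
        self.current = bytearray()          # chunk still being filled
        self.queue: Deque[bytes] = deque()  # chunks waiting to be written

    def append(self, data: bytes) -> None:
        # Close the current chunk first if this data would overflow it.
        if self.current and len(self.current) + len(data) > self.chunk_limit:
            self.enqueue()
        self.current += data

    def enqueue(self) -> None:
        if len(self.queue) >= self.queue_limit:
            raise QueueFullError("too many queued chunks")
        self.queue.append(bytes(self.current))
        self.current = bytearray()

    def flush_one(self, write: Callable[[bytes], None]) -> bool:
        # Write the oldest chunk; it stays queued if write() raises.
        if not self.queue:
            return False
        write(self.queue[0])
        self.queue.popleft()
        return True
```
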
  15. 15. Architecture: Output Plugins Output Write or send event logs Plugin File (out_file) Amazon S3 (out_s3) kafka (out_kafka_buffered) …
  16. 16. Retry Error Retry Batch Stream Error Retry Retry Divide & Conquer for retry
  17. 17. Divide & Conquer for recovery Buffer (on-disk or in-memory) Error Overloaded!! recovery recovery + flow control queued chunks
  18. 18. Example Use Cases
  19. 19. Streaming from Apache/Nginx to Elasticsearch in_tail /var/log/access.log /var/log/fluentd/buffer buf_file
  20. 20. Error Handling and Recovery in_tail /var/log/access.log /var/log/fluentd/buffer buf_file Buffering for any outputs Retrying automatically With exponential wait and persistence on a disk and secondary output
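
Retrying with exponential wait and falling back to a secondary output could be sketched like this. The doubling schedule, the optional cap and all function names are assumptions made for illustration; Fluentd's actual retry schedule and parameters are not shown on the slide.

```python
import time
from typing import Callable, List, Optional, Sequence


def retry_waits(initial: float, retries: int,
                max_wait: Optional[float] = None) -> List[float]:
    # Each wait is double the previous one, optionally capped at max_wait.
    waits = []
    for attempt in range(retries):
        wait = initial * (2 ** attempt)
        if max_wait is not None:
            wait = min(wait, max_wait)
        waits.append(wait)
    return waits


def flush_with_retry(chunk: bytes,
                     primary: Callable[[bytes], None],
                     secondary: Callable[[bytes], None],
                     waits: Sequence[float],
                     sleep: Callable[[float], None] = time.sleep) -> str:
    # First attempt is immediate; later attempts wait longer each time.
    for wait in [0] + list(waits):
        if wait:
            sleep(wait)
        try:
            primary(chunk)
            return "primary"
        except Exception:
            continue
    # All retries failed: hand the chunk to the secondary output.
    secondary(chunk)
    return "secondary"
```
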
  21. 21. Tailing & parsing files Read a log file (events.log from your app, tracked with a pos file) Custom regexp Custom parser in Ruby Supported built-in formats: • apache • apache_error • apache2 • nginx • json • csv • tsv • ltsv • syslog • multiline • none
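
The "custom regexp" option amounts to turning one text line into a record through named capture groups. A minimal Python sketch follows; the pattern is an illustrative one for a common access-log layout, not the expression behind any built-in Fluentd format.

```python
import re
from typing import Any, Dict, Optional

# Illustrative pattern: host, user, time, request line, status code, size.
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d{3}) (?P<size>\d+|-)$'
)


def parse_line(line: str) -> Optional[Dict[str, Any]]:
    match = LINE_RE.match(line)
    if match is None:
        return None  # line does not fit the format
    record: Dict[str, Any] = match.groupdict()
    # Convert numeric fields so downstream storage can aggregate them.
    record["code"] = int(record["code"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record
```
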
  22. 22. Out to Multiple Locations Routing based on tags Copy to multiple storages buffer access.log in_tail
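
Tag-based routing can be sketched as pattern matching over dot-separated tag parts. The sketch assumes the usual wildcard meaning (`*` for exactly one part, `**` for zero or more parts) and leaves out other pattern syntax; `tag_matches`, `route` and the output names are hypothetical.

```python
from typing import List, Sequence, Tuple


def _match(pattern: Sequence[str], parts: Sequence[str]) -> bool:
    if not pattern:
        return not parts
    head, rest = pattern[0], pattern[1:]
    if head == "**":
        # "**" may consume zero or more tag parts.
        return any(_match(rest, parts[i:]) for i in range(len(parts) + 1))
    if not parts:
        return False
    if head == "*" or head == parts[0]:
        return _match(rest, parts[1:])
    return False


def tag_matches(pattern: str, tag: str) -> bool:
    return _match(pattern.split("."), tag.split("."))


def route(tag: str, rules: List[Tuple[str, List[str]]]) -> List[str]:
    # First matching rule wins; a rule may copy to several outputs.
    for pattern, outputs in rules:
        if tag_matches(pattern, tag):
            return outputs
    return []


rules = [
    ("nginx.access", ["elasticsearch", "s3"]),  # copy to multiple storages
    ("app.**", ["hdfs"]),
]
```
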
  23. 23. Example configuration for real time batch combo
  24. 24. Data partitioning by time on HDFS / S3 access.log buffer Custom file formatter Slice files based on time 2016-01-01/01/access.log.gz 2016-01-01/02/access.log.gz 2016-01-01/03/access.log.gz … in_tail
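
Slicing by time boils down to deriving the output path from each event's timestamp. A small sketch that reproduces the path layout on the slide, assuming hourly slices in UTC and a hypothetical `slice_path` helper:

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Tuple


def slice_path(epoch: float, name: str = "access.log") -> str:
    # One directory per day, one sub-directory per hour (UTC assumed).
    t = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return t.strftime("%Y-%m-%d/%H/") + name + ".gz"


def partition(events: Iterable[Tuple[float, str]]) -> Dict[str, List[str]]:
    # Group (time, line) pairs by the file they belong to.
    slices: Dict[str, List[str]] = defaultdict(list)
    for epoch, line in events:
        slices[slice_path(epoch)].append(line)
    return dict(slices)
```
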
  25. 25. 3rd party input plugins dstat df AMQP munin jvmwatcher SQL
  26. 26. 3rd party output plugins Graphite
  27. 27. Real World Use Cases
  28. 28. Microsoft Operations Management Suite uses Fluentd: "The core of the agent uses an existing open source data aggregator called Fluentd. Fluentd has hundreds of existing plugins, which will make it really easy for you to add new data sources." Syslog Linux Computer Operating System Apache MySQL Containers omsconfig (DSC) PS DSC Providers OMI Server (CIM Server) omsagent Firewall/proxy OMSService Upload Data (HTTPS) Pull configuration (HTTPS)
  29. 29. Atlassian "At Atlassian, we've been impressed by Fluentd and have chosen to use it in Atlassian Cloud's logging and analytics pipeline." Kinesis Elasticsearch cluster Ingestion service
  30. 30. Amazon web services The architecture of Fluentd (Sponsored by Treasure Data) is very similar to Apache Flume or Facebook’s Scribe. Fluentd is easier to install and maintain and has better documentation and support than Flume and Scribe. Types of Data / Store / Collect Transactional • Database reads & writes (OLTP) • Cache Search • Logs • Streams File • Log files (/var/log) • Log collectors & frameworks Stream • Log records • Sensors & IoT data Web Apps IoT Applications Logging Mobile Apps Database Search File Storage Stream Storage
  31. 31. Container and Logging
  32. 32. The Container Era (Server Era → Container Era) • Service Architecture: Monolithic → Microservices • System Image: Mutable → Immutable • Managed By: Ops Team → DevOps Team • Local Data: Persistent → Ephemeral • Log Collection: syslogd / rsync → ? • Metrics Collection: Nagios / Zabbix → ?
  33. 33. The Container Era (Server Era → Container Era) • Service Architecture: Monolithic → Microservices • System Image: Mutable → Immutable • Managed By: Ops Team → DevOps Team • Local Data: Persistent → Ephemeral • Log Collection: syslogd / rsync → ? • Metrics Collection: Nagios / Zabbix → ? How should log & metrics collection be done in The Container Era?
  34. 34. Problems
  35. 35. The traditional logrotate + rsync on containers Log Server Application Container A File FileFile Hard to analyze!! Complex text parsers Application Container C File FileFile Application Container B File FileFile High latency!! Must wait for a day Ephemeral!! Could be lost at any time
  36. 36. Server 1 Container A Application Container B Application Server 2 Container C Application Container D Application Kafka elasticsearch HDFS Container Container Container Container Small & many containers make storages overloaded Too many connections from micro containers!
  37. 37. Server 1 Container A Application Container B Application Server 2 Container C Application Container D Application Kafka elasticsearch HDFS Container Container Container Container System images are immutable Too many connections from micro containers! Embedding destination IPs in ALL Docker images makes management hard
  38. 38. How to collect logs from Docker containers
  39. 39. Text logging with --log-driver=fluentd Server Container App Fluentd STDOUT / STDERR docker run --log-driver=fluentd --log-opt fluentd-address=localhost:24224 { “container_id”: “ad6d5d32576a”, “container_name”: “myapp”, “source”: “stdout” }
  40. 40. Metrics collection with fluent-logger Server Container App Fluentd fluent-logger library: from fluent import sender from fluent import event sender.setup('app.events', host='localhost') event.Event('purchase', { 'user_id': 21, 'item_id': 321, 'value': '1' }) tag = app.events.purchase { “user_id”: 21, “item_id”: 321, “value”: 1 }
  41. 41. Shared data volume and tailing Server Container App Fluentd <source> @type tail path /mnt/nginx/logs/access.log pos_file /var/log/fluentd/access.log.pos format nginx tag nginx.access </source> /mnt/nginx/logs
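
The role of `pos_file` in the configuration above is to remember how far the log has been read, so that a restart resumes instead of re-reading. A simplified Python sketch of that idea follows; `read_new_lines` is hypothetical, it stores only a byte offset, and it treats a shrinking file as a rotation, which is cruder than what a real tail input does.

```python
import os
from typing import List


def read_new_lines(path: str, pos_file: str) -> List[str]:
    # Load the last saved byte offset, if any.
    offset = 0
    if os.path.exists(pos_file):
        with open(pos_file) as f:
            text = f.read().strip()
            offset = int(text) if text else 0

    # Assumption: a file shorter than the offset was rotated or truncated.
    if os.path.getsize(path) < offset:
        offset = 0

    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Consume complete lines only; a partial last line waits for next time.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()

    with open(pos_file, "w") as f:
        f.write(str(offset + end))
    return lines
```
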
  42. 42. Logging methods for each purpose • Collecting log messages > --log-driver=fluentd • Application metrics > fluent-logger • Access logs, logs from middleware > Shared data volume • System metrics (CPU usage, Disk capacity, etc.) > Fluentd’s input plugins (Fluentd pulls those data periodically)
  43. 43. Deployment Patterns
  44. 44. Server 1 Container A Application Container B Application Server 2 Container C Application Container D Application Kafka elasticsearch HDFS Container Container Container Container Primitive deployment… Too many connections from many containers! Embedding destination IPs in ALL Docker images makes management hard
  45. 45. Server 1 Container A Application Container B Application Fluentd Server 2 Container C Application Container D Application Fluentd Kafka elasticsearch HDFS Container Container Container Container destination is always localhost from app’s point of view Source aggregation decouples config from apps
  46. 46. Server 1 Container A Application Container B Application Fluentd Server 2 Container C Application Container D Application Fluentd active / standby / load balancing Destination aggregation makes storages scalable for high traffic Aggregation server(s)
  47. 47. Aggregation servers • Logging directly from microservices makes log storages overloaded. > Too many RX connections > Too frequent import API calls • Aggregation servers make the logging infrastructure more reliable and scalable. > Connection aggregation > Buffering for less frequent import API calls > Data persistency during downtime > Automatic retry at recovery from downtime
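
"Buffering for less frequent import API calls" can be illustrated with a toy aggregator that batches records before calling the storage's import API. The `Aggregator` class and its parameters are hypothetical; the point is only the drop in the number of calls.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


class Aggregator:
    def __init__(self, import_api: Callable[[List[Record]], None],
                 batch_size: int) -> None:
        self.import_api = import_api   # one call per batch, not per record
        self.batch_size = batch_size
        self.pending: List[Record] = []

    def receive(self, record: Record) -> None:
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        # If the import raises, the records stay pending for a later retry.
        self.import_api(self.pending)
        self.pending = []
```

With 250 records and a batch size of 100, the storage sees three import calls instead of 250.
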
  48. 48. Fluentd ♡ Container • Fluentd model fits container based systems > This is why Treasure Data joined CNCF > TD wants to improve cloud native ecosystem • Fluentd, Prometheus, Docker and Kubernetes collaboration is good for modern systems • Easy to scale and easy to maintain • Fluentd logging driver in Docker • fluent-plugin-prometheus to send application metrics to Prometheus • EFK for log visualization in Kubernetes
  49. 49. Fluentd v0.14 and Later
  50. 50. v0.14 • v0.14.0: Released on May 31, 2016 • v0.14.1: Released on Jun 30, 2016 • New Features • New Plugin APIs, Plugin Helpers & Plugin Storage • Time with Nanosecond resolution • ServerEngine based Supervisor • Windows support
  51. 51. New Plugin APIs • Input/Output plugin APIs w/ well-controlled lifecycle • stop, shutdown, close, terminate • New Buffer API for delayed commit of chunks • parallel/async "commit" operation for chunks • 100% Compatible w/ v0.12 plugins • compatibility layer for traditional APIs • it will be supported throughout the v1.x versions
  52. 52. v0.12 buffer design • Input / Filter emit events (Tag, Time, Record) through the Router to Output • Buffer: chunks per key (key:foo, key:bar, key:baz), each limited by buffer_chunk_limit • enqueue: exceed flush_interval or buffer_chunk_limit • Queue: chunks waiting for output, limited by buffer_queue_limit • Key pattern: BufferedOutput: empty string or specified key; ObjectBufferedOutput: tag; TimeSlicedOutput: time slice
  53. 53. v0.14 buffer design
  54. 54. Plugin Storage & Helpers • Plugin Storage: new plugin type for plugins • provides key-value storage for plugins • to persist intermediate state of plugins • built-in plugins (planned): in-memory, local file • pluggable: 3rd party plugin to store data to Redis? • Plugin Helpers: • collections of utility methods for plugins • making threads, sockets, network servers, ... • fully integrated with test drivers to run test code after the setup phase of helpers (e.g., after created threads have started)
  55. 55. v0.12 plugins Parser Input Buffer Output Formatter Filter “output-ish” “input-ish”
  56. 56. v0.14 plugins Parser Input Buffer Output Formatter Filter “output-ish” “input-ish” Storage Helper
  57. 57. Time with nanosecond • For sub-second systems: Elasticsearch, InfluxData, etc. • Fluent::EventTime • behaves as Integer (used as time in v0.12) • has methods to get sub-second resolution • is serialized into msgpack using Ext type • Fluentd core can handle both Integer and EventTime as time • compatible with older versions and software in the ecosystem (e.g., fluent-logger, Docker logging driver)
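
The idea of a time value that still behaves as an integer but carries nanoseconds can be mimicked in Python as below. This is not the Ruby `Fluent::EventTime` class; the 8-byte payload layout (32-bit big-endian seconds followed by 32-bit big-endian nanoseconds) is an assumption based on the forward protocol's description of the Ext type and should be checked against the specification.

```python
import struct


class EventTime:
    def __init__(self, sec: int, nsec: int = 0) -> None:
        self.sec = sec    # seconds since the Unix epoch
        self.nsec = nsec  # nanoseconds within that second

    def __int__(self) -> int:
        # Integer view: what code written for second resolution sees.
        return self.sec

    def __float__(self) -> float:
        return self.sec + self.nsec / 1e9

    def to_ext_payload(self) -> bytes:
        # Assumed layout: two unsigned 32-bit big-endian integers.
        return struct.pack(">II", self.sec, self.nsec)

    @classmethod
    def from_ext_payload(cls, payload: bytes) -> "EventTime":
        sec, nsec = struct.unpack(">II", payload)
        return cls(sec, nsec)
```
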
  58. 58. ServerEngine based Supervisor • Replacing supervisor process with ServerEngine • it has SocketManager to share listening sockets between 2 or more worker processes • Replacing Fluentd's processing model from fork to spawn • to support Windows environment
  59. 59. Windows support • Fluentd and core plugins work on Windows • several companies have already used the v0.14.0.pre version in production • We will send patches to popular plugins if they don’t work on Windows • Use HTTP RPC instead of signals
  60. 60. v0.14.x - v1 • v0.14.x (some versions in 2016) • Symmetric multi-core processing • Counter API • TLS/authentication/authorization support (merging secure forward) • https://github.com/fluent/fluentd/issues/1000 • v1 (4Q in 2016 or 1Q in 2017) • Stable version for new APIs / features • Fully compatible with v0.12 • exclude v0 config syntax and detach_process
  61. 61. Symmetric multi core processing • 2 or more workers share a configuration file • and share listening sockets via PluginHelper • under a supervisor process (ServerEngine) • Multi core scalability for huge traffic • one input plugin for a tcp port, some filters and one (or some) output plugin • buffer paths are managed automatically by Fluentd core
  62. 62. Worker Supervisor Worker Worker Worker Supervisor Worker Worker Supervisor Supervisor Using fluent-plugin-multiprocess v0.14
  63. 63. Counter API • APIs to increment/decrement values • shared by some processes • persisted on disk backed by Storage API • Useful for collecting metrics or stats filters
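
Since the Counter API was still a plan at the time of this talk, the following is only a guess at the shape of the idea: named values that can be incremented or decremented and that survive restarts by being written to disk. `FileCounter` and its methods are hypothetical, and the sketch does no locking, so it does not cover sharing between processes.

```python
import json
import os
from typing import Dict


class FileCounter:
    def __init__(self, path: str) -> None:
        self.path = path
        self.values: Dict[str, int] = {}
        # Restore persisted values, if a previous run left any.
        if os.path.exists(path):
            with open(path) as f:
                self.values = json.load(f)

    def incr(self, key: str, amount: int = 1) -> int:
        self.values[key] = self.values.get(key, 0) + amount
        self._save()
        return self.values[key]

    def decr(self, key: str, amount: int = 1) -> int:
        return self.incr(key, -amount)

    def _save(self) -> None:
        # Write to a temporary file, then rename, to avoid torn writes.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.values, f)
        os.replace(tmp, self.path)
```
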
  64. 64. TLS/Authn/Authz support for forward plugin • secure-forward will be merged into built-in forward • TLS w/ at-least-once semantics • Simple authentication/authorization w/ non-SSL forwarding • Authentication and Authorization providers • Who can connect to input plugins? What tags are permitted for clients? • New plugin types (3rd party authors can write them) • Mainly for in/out forward, but available from others
  65. 65. Benchmark (1 CPU usage) at 100,000 msgs/sec, v0.14 / v0.12 • in_tail (none) + out_forward: 70% / 66% • in_forward + flowcounter_simple: 11% / 11% • in_forward + tdlog: 43% / 38% ※ Use EC2 c3.8xlarge ※ Not fully optimized yet
  66. 66. Treasure Agent 3.0 (td-agent 3) • fluentd v0.14 • Ruby 2.3 and latest core components • Environments • Add msi Windows package • Remove CentOS 5, Ubuntu 10.04 support • Release date is not fixed…
  67. 67. Enjoy logging!
  68. 68. H.A. configuration (high availability) Retry automatically Exponential retry wait Persistent on a disk buffer Automatic fail-over Load balancing access.log in_tail
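
Load balancing with automatic fail-over can be sketched as round-robin over active nodes with standby nodes as a last resort. The `ForwardOutput` class below is a hypothetical illustration, not the forward plugin's implementation, and it assumes delivery failures surface as `ConnectionError`.

```python
from typing import Callable, List, Sequence, Tuple

Node = Tuple[str, Callable[[bytes], None]]  # (name, deliver function)


class ForwardOutput:
    def __init__(self, actives: Sequence[Node],
                 standbys: Sequence[Node] = ()) -> None:
        self.actives = list(actives)
        self.standbys = list(standbys)
        self._next = 0  # index of the active node to try first

    def send(self, chunk: bytes) -> str:
        n = len(self.actives)
        # Load balancing: rotate the starting node on every send.
        order: List[Node] = [self.actives[(self._next + i) % n] for i in range(n)]
        if n:
            self._next = (self._next + 1) % n
        # Fail-over: try remaining actives, then standbys, in order.
        for name, deliver in order + self.standbys:
            try:
                deliver(chunk)
                return name
            except ConnectionError:
                continue
        raise ConnectionError("no node accepted the chunk")
```
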
