
Fluentd v0.14 Plugin API Details

Overview and details of the changes to the Fluentd v0.14 plugin APIs

Published in: Software

  1. 1. Fluentd v0.14 Plugin API Details Fluentd meetup 2016 Summer Jun 1, 2016 Satoshi "Moris" Tagomori (@tagomoris)
  2. 2. Satoshi "Moris" Tagomori (@tagomoris) Fluentd, MessagePack-Ruby, Norikra, ... Treasure Data, Inc.
  3. 3. Topics • Why Fluentd v0.14 has a new API set for plugins • Compatibility of v0.12 plugins/configurations • Plugin APIs: Input, Filter, Output & Buffer • Storage Plugin, Plugin Helpers • New Test Drivers for plugins • Plans for v0.14.x & v1
  4. 4. Why Fluentd v0.14 has a New API set for plugins?
  5. 5. Fluentd v0.12 Plugins • No support from Fluentd core for writing plugins • plugins create threads, sockets, timers and event loops by themselves • writing tests is very hard and messy with sleeps • Fragmented implementations • Output, BufferedOutput, ObjectBufferedOutput and TimeSlicedOutput • Mixture of configuration parameters from output & buffer • Uncontrolled plugin instance lifecycle (no "super" in start/shutdown) • Imperfect buffering control and useless configurations • the reason why fluent-plugin-forest exists and is widely used
  6. 6. Fluentd v0.12 Plugins • Insufficient buffer chunking control • only by size, not by number of events in chunks • Forced synchronous buffer flushing • no way to flush-and-commit chunks asynchronously • Ultimate freedom for using mix-ins • everything overrides Plugin#emit ... (the only entry point for events to plugins) • no valid hook points to get metrics or anything else • Bad Ruby coding rules and practices • too many classes at "Fluent::*" in fluent/plugin, no "require", ...
  7. 7. And many others!
  8. 8. Compatibility of v0.12 plugins/configurations
  9. 9. Compatibility of plugins • v0.12 plugins are subclasses of Fluent::* • Fluent::Input, Fluent::Filter, Fluent::Output, ... • Compatibility layers for v0.12 plugins in v0.14 • Fluent::Compat::Klass -> Fluent::Klass (e.g., Input, Output, ...) • it provides transformation of: • namespaces, configuration parameters • internal APIs, argument objects • IT SHOULD WORK, except for :P • 3rd party buffer plugins, parts of test code • "Engine.emit"
  10. 10. Compatibility of configurations • v0.14 plugins have another set of parameters • many old-fashioned parameters are removed • "buffer_type", "num_threads", "timezone", "time_slice_format", "buffer_chunk_limit", "buffer_queue_limit", ... • Plugin helper "compat_parameters" • transforms parameters between v0.12-style configuration and v0.14 plugins (v0.12 parameters are converted to v0.14 ones internally)
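The parameter conversion described on this slide can be pictured with a small plain-Ruby sketch. It does not use Fluentd itself: `convert_v012_config` and the three-entry mapping table are hypothetical names and an illustrative subset chosen for this example, not the helper's actual implementation or full table.

```ruby
# Illustrative subset of v0.12 -> v0.14 buffer parameter names.
# Assumption: this table is only a sample, not the complete mapping.
V012_TO_V014_BUFFER_PARAMS = {
  "buffer_type"        => "@type",
  "num_threads"        => "flush_thread_count",
  "buffer_chunk_limit" => "chunk_limit_size",
}.freeze

# Splits a flat v0.12-style config hash into plugin parameters and a
# nested "buffer" section, renaming the keys listed in the table above.
def convert_v012_config(conf)
  plugin_params = {}
  buffer_params = {}
  conf.each do |key, value|
    if V012_TO_V014_BUFFER_PARAMS.key?(key)
      buffer_params[V012_TO_V014_BUFFER_PARAMS[key]] = value
    else
      plugin_params[key] = value
    end
  end
  # Only add a buffer section when at least one buffer parameter was given.
  plugin_params["buffer"] = buffer_params unless buffer_params.empty?
  plugin_params
end
```

Calling `convert_v012_config("path" => "/tmp/out", "buffer_type" => "file")` keeps `path` at the top level and moves the buffer setting into a nested section, which mirrors how a flat v0.12 configuration ends up as a `<buffer>` section in v0.14.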
  11. 11. FAQ: Can we create plugins like this? * it uses v0.14 API * it runs on Fluentd v0.12 Impossible :P
  12. 12. Overview of v0.14 Plugin classes
  13. 13. v0.14 plugin classes • All files MUST be in `fluent/plugin/*.rb` (in gems) • or just a "*.rb" file in directory specified by "-r" • All classes MUST be under Fluent::Plugin • All plugins MUST be subclasses of Fluent::Plugin::Base • All plugins MUST call `super` in methods overriding default implementation (e.g., #configure, #start, #shutdown, ...)
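Why `super` is mandatory can be shown without Fluentd at all. The sketch below uses stand-in classes (`SketchBase` and `SketchInput` are hypothetical names, not Fluentd classes) to show a base class that tracks lifecycle state only when subclasses call `super`.

```ruby
# Stand-in for a plugin base class that owns the lifecycle state.
# Assumption: names and state flags are invented for this illustration.
class SketchBase
  def initialize
    @configured = false
    @started = false
  end

  def configure(conf)
    @configured = true
  end

  def start
    raise "configure must run before start" unless @configured
    @started = true
  end

  def shutdown
    @started = false
  end

  def started?
    @started
  end
end

class SketchInput < SketchBase
  def configure(conf)
    super                        # lets the base class record its own state
    @interval = conf.fetch("interval", 60)
  end

  def start
    super                        # without this, started? would stay false
    @resource = :opened
  end

  def shutdown
    @resource = nil              # release plugin resources first
    super                        # then let the base class finish shutdown
  end
end
```

If `SketchInput#configure` skipped `super`, the later `start` call would raise, which is the kind of lifecycle control the v0.12 API could not enforce.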
  14. 14. Classes hierarchy (v0.12) Fluent::Input F::Filter F::Output BufferedOutput Object Buffered Time Sliced Multi Output F::Buffer F::Parser F::Formatter 3rd party plugins
  15. 15. Classes hierarchy (v0.14) F::P::Input F::P::Filter F::P::Output Fluent::Plugin::Base F::P::Buffer F::P::Parser F::P::Formatter F::P::Storage both of buffered/non-buffered F::P:: BareOutput (not for 3rd party plugins) F::P:: MultiOutput copy roundrobin
  16. 16. Tour of New Plugin APIs: Fluent::Plugin::Input
  17. 17. Fluent::Plugin::Input • Nothing changed :) • except for overall rules • But it's much easier to write plugins than in v0.12 :) • fetch HTTP resource per specified interval • parse response body with format specified in config • emit parse result
  18. 18. Fluent::Plugin::Input
  19. 19. Tour of New Plugin APIs: Fluent::Plugin::Filter
  20. 20. Fluent::Plugin::Filter • Almost nothing changed :) • Required: #filter(tag, time, record) #=> record | nil • Optional: #filter_stream(tag, es) #=> event_stream
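The relation between the two filter methods can be sketched in plain Ruby. This is not Fluentd code: `SketchFilter` is a hypothetical class, and the event stream is modelled as a simple array of `[time, record]` pairs rather than a real event stream object.

```ruby
# A stream-level method built on top of the per-record method.
# Assumption: an event stream is represented as [[time, record], ...].
class SketchFilter
  # Returns the (possibly modified) record, or nil to drop the event.
  def filter(tag, time, record)
    return nil if record["level"] == "debug"
    record.merge("tag" => tag)
  end

  # Applies #filter to every event and keeps only non-nil results.
  def filter_stream(tag, es)
    new_es = []
    es.each do |time, record|
      filtered = filter(tag, time, record)
      new_es << [time, filtered] if filtered
    end
    new_es
  end
end
```

A plugin that only needs per-record logic implements `#filter`; overriding `#filter_stream` is worthwhile when the whole stream can be handled more efficiently in one pass.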
  21. 21. Tour of New Plugin APIs: Fluent::Plugin::Output
  22. 22. Fluent::Plugin::Output • Many things changed! • Merged Output, BufferedOutput, ObjectBufferedOutput, TimeSlicedOutput • Output plugins can be • with buffering • without buffering • both (buffering is switched on or off by configuration) • Buffers chunk events by: • byte size, interval, tag • number of records (new!) • time (by any unit (new!): 30s, 5m, 15m, 3h, ...) • any specified field in records (new!) • any combination of above (new!)
  23. 23. Variations of buffering NO MORE forest plugin!
  24. 24. Output Plugin: Methods to be implemented • Non-buffered: #process(tag, es) • Buffered synchronous: #write(chunk) • Buffered Asynchronous: #try_write(chunk) • New feature for destinations with huge latency to write chunks • Plugins must call #commit_write(chunk_id) (otherwise, #try_write will be retried) • Buffered w/ custom format: #format(tag, time, record) • Without this method, output uses standard format
  25. 25. [Flowchart: how the output mode is chosen. Inputs: which of #process, #write and #try_write the plugin implements, whether a <buffer> section exists, #prefer_buffered_processing (default true) and #prefer_delayed_commit (default true). Outcomes: error, non-buffered, sync buffered, async buffered.]
  26. 26. In other words :P • If users configure a "<buffer>" section • the plugin tries to do buffering • Else if the plugin implements both (buffered/non-buffered) • the plugin calls #prefer_buffered_processing to decide • Else the plugin does as implemented • When the plugin does buffering: if it implements both (sync/async write) • the plugin calls #prefer_delayed_commit to decide • Else the plugin does as implemented
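The decision rules on this slide can be written down as one plain-Ruby function. It is a sketch of the rules as stated here, not Fluentd's own code: `select_output_mode` and its keyword arguments are hypothetical names, and the two defaults follow the "default true" notes on the preceding flowchart slide.

```ruby
# Decides how an output plugin runs, following the slide's rules.
# implements: array containing any of :process, :write, :try_write
# Assumption: function name, arguments and error type are invented here.
def select_output_mode(implements:, buffer_section:, prefer_buffered: true, prefer_delayed_commit: true)
  can_process = implements.include?(:process)
  can_buffer  = implements.include?(:write) || implements.include?(:try_write)
  unless can_process || can_buffer
    raise ArgumentError, "plugin implements none of #process, #write, #try_write"
  end

  buffered =
    if buffer_section
      # A <buffer> section forces buffering, so a buffered method must exist.
      raise ArgumentError, "<buffer> configured but no #write or #try_write" unless can_buffer
      true
    elsif can_process && can_buffer
      prefer_buffered            # the plugin's own preference decides
    else
      can_buffer                 # do as implemented
    end
  return :non_buffered unless buffered

  delayed =
    if implements.include?(:write) && implements.include?(:try_write)
      prefer_delayed_commit      # the plugin's own preference decides
    else
      implements.include?(:try_write)
    end
  delayed ? :async_buffered : :sync_buffered
end
```

For example, a plugin implementing `#process` and `#write` that prefers non-buffered processing still becomes `:sync_buffered` as soon as the user adds a `<buffer>` section.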
  27. 27. Delayed commit (1) • high-latency #write operations lock a flush thread for a long time (e.g., ACK in forward) [Diagram: the output plugin's #write sends data to a high-latency destination and returns only after the ACK arrives; the flush thread stays locked meanwhile]
  28. 28. Delayed commit (2) • #try_write & delayed #commit_write [Diagram: #try_write sends data to the high-latency destination and returns; an async check thread receives the ACK and calls #commit_write]
  29. 29. Use cases: delayed commit • Forward protocol w/ ACK • Distributed file systems or databases • put data -> confirm to read data -> commit • Submit tasks to job queues • submit a job -> detect executed -> commit
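The bookkeeping behind delayed commit can be sketched in plain Ruby. This is a simulation under stated assumptions, not Fluentd's implementation: `DelayedCommitSketch`, `expire`, `pending?` and `retry_queue` are hypothetical names, and time is passed in explicitly so the behaviour is deterministic.

```ruby
# Tracks chunks handed to #try_write until they are committed or time out.
# Assumption: only try_write/commit_write mirror names from the slides.
class DelayedCommitSketch
  attr_reader :retry_queue

  def initialize(delayed_commit_timeout:)
    @timeout = delayed_commit_timeout
    @pending = {}      # chunk_id => time at which try_write returned
    @retry_queue = []  # chunk_ids whose commit never arrived in time
  end

  # Called by the flush thread; returns at once instead of waiting for an ACK.
  def try_write(chunk_id, now:)
    @pending[chunk_id] = now
  end

  # Called later (e.g. by an ACK-checking thread) once the write is confirmed.
  # Returns true if the chunk was still pending.
  def commit_write(chunk_id)
    !@pending.delete(chunk_id).nil?
  end

  # Moves chunks whose commit did not arrive within the timeout
  # to the retry queue and returns their ids.
  def expire(now:)
    expired = @pending.select { |_id, started| now - started >= @timeout }.keys
    expired.each { |id| @pending.delete(id) }
    @retry_queue.concat(expired)
    expired
  end

  def pending?(chunk_id)
    @pending.key?(chunk_id)
  end
end
```

The flush thread is free again as soon as `try_write` returns; a chunk leaves the pending set either through `commit_write` or by timing out and being retried.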
  30. 30. Standard chunk format • Buffering w/o #format method • Almost the same as ObjectBufferedOutput • No need to always implement #format • Implement it for performance/low-latency • Tool to dump & read buffer chunks on disk w/ standard format • To be implemented in v0.14.x :)
  31. 31. <buffer CHUNK_KEYS> • comma-separated tag, time or ANY_KEYS • Nothing specified: all events are in same chunk • flushed when chunk is full • (optional) "flush_interval" after first event in chunk • tag: events w/ same tag are in same chunks • time: buffer chunks will be split by timekey • timekey: unit of time to be chunked (1m, 15m, 3h, ...) • flushed after expiration of timekey unit + timekey_wait • ANY_KEYS: any key names in records
  32. 32. <buffer CHUNK_KEYS> (the same list as slide 31, annotated with the corresponding v0.12 classes: BufferedOutput, TimeSlicedOutput, ObjectBufferedOutput)
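How chunk keys split events into chunks can be illustrated with a short plain-Ruby sketch. It is not Fluentd code: `chunk_key_for` and `group_into_chunks` are hypothetical helpers, events are modelled as `[tag, time, record]` triples, and `timekey` is given in seconds.

```ruby
# Computes the chunk an event belongs to for a given chunk key list.
# chunk_keys: array such as ["tag", "time", "user"]
# Assumption: function names and the event representation are invented here.
def chunk_key_for(chunk_keys, tag, time, record, timekey: nil)
  key = {}
  chunk_keys.each do |name|
    case name
    when "tag"
      key[:tag] = tag
    when "time"
      raise ArgumentError, "timekey is required when chunking by time" unless timekey
      key[:timekey] = time - (time % timekey)   # start of the time window
    else
      key[name] = record[name]                  # any key name in the record
    end
  end
  key
end

# Groups events so that each distinct chunk key gets its own chunk.
def group_into_chunks(events, chunk_keys, timekey: nil)
  events.group_by do |tag, time, record|
    chunk_key_for(chunk_keys, tag, time, record, timekey: timekey)
  end
end
```

With no chunk keys every event maps to the same empty key, so all events share one chunk; adding `tag`, `time` or record keys only refines the grouping.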
  33. 33. configurations: flushing buffers • flush_mode: lazy, interval, immediate • default: lazy if "time" specified, otherwise interval • flush_interval, flush_thread_count • flush_thread_count: number of threads for flushing • delayed_commit_timeout • output plugin will retry #try_write when it expires
  34. 34. Retries, Secondary • Explicit timeout for retries: • retry_timeout: timeout not to retry anymore • retry_max_times: how many times to retry • retry_type: "periodic" w/ fixed retry_wait • retry_secondary_threshold (percentage) • output will use secondary if specified percentage of retry_timeout elapsed after first error
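The arithmetic behind `retry_secondary_threshold` can be made concrete with a small plain-Ruby sketch. `retry_target` is a hypothetical function, and the exact boundary behaviour (using `>=`) is an assumption made for this example rather than something the slides specify.

```ruby
# Returns :primary, :secondary or :give_up for a retry that happens
# `elapsed` seconds after the first error.
# retry_secondary_threshold is a percentage of retry_timeout.
# Assumption: boundary handling and names are invented for illustration.
def retry_target(elapsed, retry_timeout:, retry_secondary_threshold:, secondary_configured: true)
  return :give_up if elapsed >= retry_timeout

  switch_at = retry_timeout * retry_secondary_threshold / 100.0
  if secondary_configured && elapsed >= switch_at
    :secondary
  else
    :primary
  end
end
```

With a `retry_timeout` of 100 seconds and a threshold of 80, retries go to the primary output for the first 80 seconds, to the secondary output after that, and stop once the timeout is reached.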
  35. 35. Buffer parameters • chunk_limit_size • maximum byte size per chunk • chunk_records_limit (default: not specified) • maximum number of records per chunk • total_limit_size • maximum byte size which a buffer plugin can use • (optional) queue_length_limit: no need to specify
  36. 36. Chunk metadata • Stores various information of buffer chunks • key-values of chunking unit • number of records • created_at, modified_at • `chunk.metadata` • extract_placeholders(@path, chunk.metadata)
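The idea behind `extract_placeholders(@path, chunk.metadata)` can be sketched in plain Ruby. This is an approximation for illustration only: `ChunkMetadataSketch`, its fields and `expand_placeholders` are hypothetical, and the placeholder syntax (`${tag}`, `${key}` and strftime directives) is assumed for this example.

```ruby
# Metadata of a chunk: the key-values of its chunking unit.
# Assumption: struct name, field names and placeholder syntax are invented.
ChunkMetadataSketch = Struct.new(:timekey, :tag, :variables)

# Fills a path template from the chunk metadata.
def expand_placeholders(template, metadata)
  result = template.dup
  # Time placeholders are strftime directives evaluated at the chunk's timekey.
  result = Time.at(metadata.timekey).utc.strftime(result) if metadata.timekey
  result = result.gsub("${tag}", metadata.tag) if metadata.tag
  (metadata.variables || {}).each do |name, value|
    result = result.gsub("${#{name}}", value.to_s)
  end
  result
end
```

Because every event in a chunk shares the same chunk keys, one expansion per chunk is enough to derive a destination such as a file path or table name.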
  37. 37. Tour of New Plugin APIs: Other plugin types
  38. 38. Classes hierarchy (v0.14) F::P::Input F::P::Filter F::P::Output Fluent::Plugin::Base F::P::Buffer F::P::Parser F::P::Formatter F::P::Storage both of buffered/non-buffered F::P:: BareOutput (not for 3rd party plugins) F::P:: MultiOutput copy roundrobin
  39. 39. Classes hierarchy (v0.14) F::P::Input F::P::Filter F::P::Output Fluent::Plugin::Base F::P::Buffer F::P::Parser F::P::Formatter F::P::Storage both of buffered/non-buffered F::P:: BareOutput (not for 3rd party plugins) F::P:: MultiOutput copy roundrobin "Owned" plugins
  40. 40. "Owned" plugins • Primary plugins: Input, Output, Filter • Instantiated by Fluentd core • "Owned" plugins are owned by primary plugins • Buffer, Parser, Formatter, Storage, ... • They can refer to the owner's plugin id, logger, ... • Fluent::Plugin.new_xxx("kind", parent:@input) • "Owned" plugins can be configured by owner plugins
  41. 41. Owner plugins can control defaults of owned plugins Fluentd provides standard way to configure owned plugins
  42. 42. Tour of New Plugin APIs: Fluent::Plugin::Storage
  43. 43. Storage plugins • Pluggable Key-Value store for plugins • configurable: autosave, persistent, save_at_shutdown • get, fetch, put, delete, update (transactional) • Various possible implementations • built-in: local (json) on-disk / on-memory • possible: Redis, Consul, or anything that supports serializing/deserializing JSON-like objects • To store states of plugins: • counter values of data-counter plugin • pos data of file plugin • To load configuration dynamically for plugins: • load configurations from any file systems
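A key-value store with the operations named on this slide can be sketched in plain Ruby. This is not the Fluentd storage plugin: `StorageSketch` is a hypothetical class, its JSON file persistence only loosely imitates the built-in local storage, and `update` is made atomic with a mutex as one possible reading of "transactional".

```ruby
require "json"

# In-memory key-value store with optional JSON persistence.
# Assumption: class name, constructor and save behaviour are invented here.
class StorageSketch
  def initialize(path: nil)
    @path = path
    @mutex = Mutex.new
    @store = {}
    # Restore the previous state when a saved file exists.
    @store = JSON.parse(File.read(@path)) if @path && File.exist?(@path)
  end

  def get(key)
    @mutex.synchronize { @store[key.to_s] }
  end

  def fetch(key, default)
    @mutex.synchronize { @store.fetch(key.to_s, default) }
  end

  def put(key, value)
    @mutex.synchronize { @store[key.to_s] = value }
  end

  def delete(key)
    @mutex.synchronize { @store.delete(key.to_s) }
  end

  # Read-modify-write under one lock; the block receives the current value.
  def update(key)
    @mutex.synchronize do
      @store[key.to_s] = yield(@store[key.to_s])
    end
  end

  # Writes the whole store as JSON; a no-op for purely in-memory use.
  def save
    return unless @path
    @mutex.synchronize { File.write(@path, JSON.generate(@store)) }
  end
end
```

A plugin that tracks a file position could `put` the offset after each read and call `save` at shutdown, then pick the value up again with `get` on the next start.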
  44. 44. Tour of New Plugin APIs: Plugin Helpers
  45. 45. Plugin Helpers • No more mixins! • declare helpers to use with "helpers :name" • Utility functions to support difficult things • creating threads, timers, child processes... • created timers will be stopped automatically in the plugin's shutdown sequence • Integrated w/ New Test Drivers • tests run after helpers have started everything requested
  46. 46. Plugin Helpers Example • Thread: thread_create, thread_current_running? • Timer: timer_execute • ChildProcess: child_process_execute • command, arguments, subprocess_name, interval, immediate, parallel, mode, stderr, env, unsetenv, chdir, ... • EventEmitter: router (Output doesn't have router in v0.14 default) • Storage: storage_create • (TBD) Socket/Server for TCP/UDP/TLS, Parser, Formatter
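The automatic cleanup that helpers provide can be imitated in plain Ruby. The module below is a stand-in written for this example, not Fluentd's thread helper: it borrows the method names `thread_create` and `thread_current_running?` from the slide, while `ThreadHelperSketch`, `BasePluginSketch` and `PollingInputSketch` are hypothetical.

```ruby
# Stand-in helper: threads created through it are stopped in shutdown.
# Assumption: this module only imitates the behaviour described on the slides.
module ThreadHelperSketch
  def thread_create(name)
    @_threads ||= {}
    @_running = true
    @_threads[name] = Thread.new { yield }
  end

  def thread_current_running?
    @_running
  end

  def shutdown
    @_running = false                      # ask worker loops to finish
    (@_threads || {}).each_value(&:join)   # wait until they have stopped
    super
  end
end

class BasePluginSketch
  def start; end
  def shutdown; end
end

class PollingInputSketch < BasePluginSketch
  include ThreadHelperSketch
  attr_reader :ticks

  def start
    super
    @ticks = 0
    thread_create(:poller) do
      while thread_current_running?
        @ticks += 1
        sleep 0.01
      end
    end
  end
end
```

The plugin never joins or kills its own thread; the helper's `shutdown` does it, which is also what lets a test driver start and stop plugins predictably.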
  47. 47. Tour of New Plugin APIs: New Test Drivers
  48. 48. New Test Drivers • Instead of old drivers Fluent::Test::*TestDriver • Fluent::Test::Driver::Input, Output or Filter • fully emulates actual plugin behavior • w/ override SystemConfig • capturing emitted events & error event streams • inserting TestLogger to capture/test logs of plugins • capturing "format" result of output plugins • controlling "flush" timing of output plugins • Running tests under control • Plugin Helper integration • conditions to keep/break running tests • timeouts, number of emits/events to stop tests • automatic start/shutdown call for plugins
  49. 49. Plans for v0.14.x
  50. 50. New Features • Symmetric multi processing • to use 2 or more CPU cores! • by sharing a configuration between all processes • "detach_process" will be deprecated • forward: TLS + authentication/authorization support • secure-forward integration • Buffer supports compression & forward it • Plugin generator & template
  51. 51. New APIs • Controlling global configuration from SystemConfig • configured via <system> tag • root buffer path + plugin id: remove paths from each buffers • process total buffer size control • Counter APIs • counting everything over processes via RPC • creating metrics for a whole fluentd cluster
  52. 52. For v1
  53. 53. v1: stable version of v0.14 • v0.12 plugins will still be supported at v1.0.0 • deprecated, and will be obsoleted at v1.x • Will be obsoleted: • v0 (traditional) configuration syntax • "detach_process" feature • Q4 2016?
  54. 54. To Be Written by me :-) • As soooooooooon as possible... • Plugin developers' guide for • Updating v0.12 plugins with v0.14 APIs • Writing plugins with v0.14 APIs • Writing tests of plugins with v0.14 APIs • Users' guide for • How to use buffering in general (w/ <buffer>) • Updated plugin documents
  55. 55. Enjoy logging!