3.
Topics
• Why Fluentd v0.14 has a new API set for plugins
• Compatibility of v0.12 plugins/configurations
• Plugin APIs: Input, Filter, Output & Buffer
• Storage Plugin, Plugin Helpers
• New Test Drivers for plugins
• Plans for v0.14.x & v1
4.
Why does Fluentd v0.14 have a new API set for plugins?
5.
Fluentd v0.12 Plugins
• No support from Fluentd core for writing plugins
• plugins create their own threads, sockets, timers and event loops
• writing tests is very hard and messy, full of sleeps
• Fragmented implementations
• Output, BufferedOutput, ObjectBufferedOutput and TimeSlicedOutput
• Mixture of configuration parameters from output&buffer
• Uncontrolled plugin instance lifecycle (no "super" in start/shutdown)
• Imperfect buffering control and useless configurations
• this is why fluent-plugin-forest exists and is widely used
6.
Fluentd v0.12 Plugins
• Insufficient buffer chunking control
• only by size, without number of events in chunks
• Forced synchronous buffer flushing
• no way to flush-and-commit chunks asynchronously
• Ultimate freedom for using mix-ins
• everything overrides Plugin#emit ... (the only entry point for events into plugins)
• no valid hook points to get metrics or something else
• Bad Ruby coding rules and practices
• too many classes at "Fluent::*" in fluent/plugin, no "require", ...
9.
Compatibility of plugins
• v0.12 plugins are subclasses of Fluent::*
• Fluent::Input, Fluent::Filter, Fluent::Output, ...
• Compatibility layers for v0.12 plugins in v0.14
• Fluent::Compat::Klass -> Fluent::Klass (e.g., Input, Output, ...)
• it provides transformation of:
• namespaces, configuration parameters
• internal APIs, argument objects
• IT SHOULD WORK, except for :P
• 3rd party buffer plugin, part of test code
• "Engine.emit"
10.
Compatibility of configurations
• v0.14 plugins have another set of parameters
• many old-fashioned parameters are removed
• "buffer_type", "num_threads", "timezone", "time_slice_format", "buffer_chunk_limit", "buffer_queue_limit", ...
• Plugin helper "compat_parameters"
• transforms parameters between v0.12-style configuration and v0.14 plugins
• v0.12 configuration -> (converted internally) -> v0.14 parameters
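The kind of renaming such a conversion performs can be sketched in plain Ruby. This is a toy illustration only: the mapping table, the constant `V012_TO_V014` and the function `convert_v012_params` are assumptions made for this sketch, not the real compat_parameters helper, which covers more parameters and also converts values.

```ruby
# Illustrative mapping only (an assumption for this sketch); the real
# compat_parameters helper handles many more parameters and value formats.
V012_TO_V014 = {
  "buffer_type"        => "@type",
  "buffer_chunk_limit" => "chunk_limit_size",
  "num_threads"        => "flush_thread_count",
}.freeze

# Split a flat v0.12-style parameter hash into plugin parameters and a
# hash standing in for the v0.14 <buffer> section.
def convert_v012_params(conf)
  buffer = {}
  plugin = {}
  conf.each do |key, value|
    if V012_TO_V014.key?(key)
      buffer[V012_TO_V014[key]] = value
    else
      plugin[key] = value
    end
  end
  { "plugin" => plugin, "buffer" => buffer }
end

result = convert_v012_params(
  "path" => "/tmp/out", "buffer_type" => "file", "num_threads" => "4"
)
p result
```

Parameters that are not in the table stay with the plugin, so a v0.12 configuration keeps working while the plugin itself only reads v0.14 names.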
11.
FAQ:
Can we create plugins like this?
* it uses v0.14 API
* it runs on Fluentd v0.12
Impossible :P
13.
v0.14 plugin classes
• All files MUST be in `fluent/plugin/*.rb` (in gems)
• or just a "*.rb" file in directory specified by "-r"
• All classes MUST be under Fluent::Plugin
• All plugins MUST be subclasses of Fluent::Plugin::Base
• All plugins MUST call `super` in methods overriding the default implementation (e.g., #configure, #start, #shutdown, ...)
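Why the `super` rule matters can be shown with a toy base class. `ToyBase`, `GoodPlugin` and `BadPlugin` are hypothetical names for this sketch and do not reproduce Fluent::Plugin::Base; the point is only that lifecycle state kept in the base class is lost when a subclass skips `super`.

```ruby
# Toy base class (hypothetical, not Fluent::Plugin::Base): it records
# lifecycle state that the core can only trust if subclasses call super.
class ToyBase
  def initialize
    @started = false
  end

  def start
    @started = true
  end

  def started?
    @started
  end
end

class GoodPlugin < ToyBase
  def start
    super                 # lets the base class update its state first
    @resource = :opened
  end
end

class BadPlugin < ToyBase
  def start
    @resource = :opened   # forgot super: base state is never updated
  end
end

good = GoodPlugin.new
good.start
bad = BadPlugin.new
bad.start
puts good.started?
puts bad.started?
```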
14.
Classes hierarchy (v0.12)
• Fluent::Input
• F::Filter
• F::Output
• BufferedOutput
• ObjectBufferedOutput
• TimeSlicedOutput
• MultiOutput
• F::Buffer
• F::Parser
• F::Formatter
• 3rd party plugins
15.
Classes hierarchy (v0.14)
• Fluent::Plugin::Base
• F::P::Input
• F::P::Filter
• F::P::Output (both of buffered/non-buffered)
• F::P::BareOutput (not for 3rd party plugins)
• F::P::MultiOutput (copy, roundrobin)
• F::P::Buffer
• F::P::Parser
• F::P::Formatter
• F::P::Storage
16.
Tour of New Plugin APIs:
Fluent::Plugin::Input
17.
Fluent::Plugin::Input
• Nothing changed :)
• except for overall rules
• But it's much easier to write plugins than in v0.12 :)
• fetch an HTTP resource at the specified interval
• parse the response body with the format specified in config
• emit the parsed result
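The fetch, parse, emit cycle listed above can be sketched without Fluentd or a network. Everything here is an assumption for illustration: `poll_once` is a hypothetical function, the `fetch` and `emit` callables stand in for the HTTP client and the event router, and JSON is assumed as the configured format.

```ruby
require "json"

# Toy stand-in for an input plugin's periodic task. `fetch` and `emit` are
# injected callables (hypothetical names) so the sketch needs no network.
def poll_once(tag, fetch:, emit:, now: Time.now.to_i)
  body = fetch.call              # 1. fetch the HTTP resource
  record = JSON.parse(body)      # 2. parse the body (JSON assumed here)
  emit.call(tag, now, record)    # 3. emit the parsed result
end

emitted = []
poll_once(
  "http.status",
  fetch: -> { '{"status":"ok","latency_ms":12}' },
  emit:  ->(tag, time, record) { emitted << [tag, time, record] },
  now:   1_465_000_000
)
p emitted
```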
19.
Tour of New Plugin APIs:
Fluent::Plugin::Filter
20.
Fluent::Plugin::Filter
• Almost nothing changed :)
• Required: #filter(tag, time, record) #=> record | nil
• Optional: #filter_stream(tag, es) #=> event_stream
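The contract of the two methods can be modelled in a few lines. `ToyGrepFilter` is a hypothetical class, not Fluent::Plugin::Filter, and the event stream is modelled as a plain array of [time, record] pairs; the sketch assumes that a nil result from #filter means "drop this event".

```ruby
# Toy filter (hypothetical class, not Fluent::Plugin::Filter).
class ToyGrepFilter
  # Return the (possibly modified) record to keep it, or nil to drop it.
  def filter(tag, time, record)
    return nil if record["level"] == "debug"
    record.merge("filtered" => true)
  end

  # A stream-level method built on #filter: keep every non-nil result.
  # `es` is modelled as an array of [time, record] pairs.
  def filter_stream(tag, es)
    es.each_with_object([]) do |(time, record), out|
      filtered = filter(tag, time, record)
      out << [time, filtered] if filtered
    end
  end
end

es = [[1, { "level" => "info" }], [2, { "level" => "debug" }]]
kept = ToyGrepFilter.new.filter_stream("app.log", es)
p kept
```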
21.
Tour of New Plugin APIs:
Fluent::Plugin::Output
22.
Fluent::Plugin::Output
• Many things changed!
• Merged Output, BufferedOutput, ObjectBufferedOutput, TimeSlicedOutput
• Output plugins can be
• with buffering
• without buffering
• both (buffering is switched on or off by configuration)
• Buffers chunk events by:
• byte size, interval, tag
• number of records (new!)
• time (by any unit (new!): 30s, 5m, 15m, 3h, ...)
• any specified field in records (new!)
• any combination of above (new!)
23.
Variations of buffering
NO MORE forest plugin!
24.
Output Plugin:
Methods to be implemented
• Non-buffered: #process(tag, es)
• Buffered synchronous: #write(chunk)
• Buffered asynchronous: #try_write(chunk)
• New feature for destinations with huge latency to write chunks
• Plugins must call #commit_write(chunk_id) (otherwise, #try_write will be retried)
• Buffered w/ custom format: #format(tag, time, record)
• Without this method, output uses standard format
25.
[Flowchart: how an output plugin's mode is chosen. Inputs: whether a <buffer> section exists; which of #process, #write and #try_write the plugin implements; the result of #prefer_buffered_processing (default true); the result of #prefer_delayed_commit (default true). Outcomes: non-buffered, sync buffered, async buffered, or error.]
26.
In other words :P
• If users configure a "<buffer>" section
• the plugin tries to do buffering
• Else if the plugin implements both (buffering/non-buffering)
• the plugin calls #prefer_buffered_processing to decide
• Else the plugin does as implemented
• When the plugin does buffering
• If the plugin implements both (sync/async write)
• the plugin calls #prefer_delayed_commit to decide
• Else the plugin does as implemented
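The rules above can be written as one decision function. This is a sketch under assumptions, not Fluentd's code: `output_mode` is a hypothetical function, `implemented` is a list of method names the plugin defines, and the two `prefer_*` arguments stand in for the return values of #prefer_buffered_processing and #prefer_delayed_commit.

```ruby
# Toy decision function (hypothetical; not Fluentd's real code).
# `implemented` is the list of methods the plugin defines.
def output_mode(implemented, buffer_section:, prefer_buffered: true, prefer_delayed_commit: true)
  can_process = implemented.include?(:process)
  can_write   = implemented.include?(:write)
  can_try     = implemented.include?(:try_write)
  can_buffer  = can_write || can_try

  buffered =
    if buffer_section
      # A <buffer> section requires a buffered write method.
      raise ArgumentError, 'buffer section given, but no write/try_write' unless can_buffer
      true
    elsif can_process && can_buffer
      prefer_buffered          # plugin supports both: ask the plugin
    elsif can_process
      false
    elsif can_buffer
      true
    else
      raise ArgumentError, 'none of process, write, try_write is implemented'
    end

  return :non_buffered unless buffered
  # Both write styles available: ask the plugin; otherwise use what exists.
  return (prefer_delayed_commit ? :async_buffered : :sync_buffered) if can_write && can_try
  can_try ? :async_buffered : :sync_buffered
end

p output_mode([:process], buffer_section: false)
p output_mode([:process, :write], buffer_section: false)
p output_mode([:write, :try_write], buffer_section: true, prefer_delayed_commit: false)
```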
27.
Delayed commit (1)
• High-latency #write operations lock a flush thread for a long time (e.g., waiting for ACK in forward)
[Diagram: the output plugin calls #write and sends data to a destination w/ high latency; the flush thread stays locked until the destination sends the ACK and #write returns]
29.
Use cases: delayed commit
• Forward protocol w/ ACK
• Distributed file systems or databases
• put data -> confirm the data can be read -> commit
• Submit tasks to job queues
• submit a job -> detect executed -> commit
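The bookkeeping behind delayed commit can be modelled as follows. `ToyAsyncOutput` is a hypothetical class: its `try_write` takes a chunk id and a payload for simplicity (the API listed earlier takes a chunk object), and `chunks_to_retry` is an invented helper showing which chunks would be retried if no commit arrives.

```ruby
# Toy model of delayed commit (hypothetical class and method bodies).
class ToyAsyncOutput
  attr_reader :pending, :committed

  def initialize
    @pending = {}      # chunk_id => payload, sent but not yet confirmed
    @committed = []
  end

  # Returns right after handing the chunk to the destination, so the
  # flush thread is free; the chunk stays pending until confirmed.
  def try_write(chunk_id, payload)
    @pending[chunk_id] = payload
  end

  # Called later, e.g. when an ACK arrives from the destination.
  def commit_write(chunk_id)
    @committed << chunk_id if @pending.delete(chunk_id)
  end

  # Chunks never committed are the ones that would be retried.
  def chunks_to_retry
    @pending.keys
  end
end

out = ToyAsyncOutput.new
out.try_write("chunk-1", "data A")
out.try_write("chunk-2", "data B")
out.commit_write("chunk-1")   # ACK received for chunk-1 only
p out.committed
p out.chunks_to_retry
```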
30.
Standard chunk format
• Buffering w/o #format method
• Almost the same as ObjectBufferedOutput
• No need to always implement #format
• Implement it for performance/low-latency
• Tool to dump & read buffer chunks on disk w/ standard format
• To be implemented in v0.14.x :)
31.
<buffer CHUNK_KEYS>
• comma-separated list of tag, time or ANY_KEYS
• Nothing specified: all events go into the same chunk
• flushed when the chunk is full
• (optional) "flush_interval" after the first event in the chunk
• tag: events w/ the same tag go into the same chunk
• time: buffer chunks will be split by timekey
• timekey: unit of time to be chunked (1m, 15m, 3h, ...)
• flushed after expiration of timekey unit + timekey_wait
• ANY_KEYS: any key names in records
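How chunk keys partition events can be sketched with a plain `group_by`. The helpers `chunk_key_for` and `chunk_events` are hypothetical, events are modelled as [tag, time, record] triples, and `timekey` is given in seconds; real buffers also apply size limits and flush timing, which this sketch ignores.

```ruby
# Toy chunking (hypothetical helpers): group events by the configured keys.
# Each event is [tag, time, record]; `timekey` is in seconds.
def chunk_key_for(event, chunk_keys, timekey: 3600)
  tag, time, record = event
  chunk_keys.map do |key|
    case key
    when "tag"  then tag
    when "time" then time - (time % timekey)   # start of the timekey window
    else record[key]                           # any record field
    end
  end
end

def chunk_events(events, chunk_keys, timekey: 3600)
  events.group_by { |event| chunk_key_for(event, chunk_keys, timekey: timekey) }
end

events = [
  ["app.web", 3600, { "user" => "a" }],
  ["app.web", 3660, { "user" => "b" }],
  ["app.db",  7300, { "user" => "a" }],
]
by_none     = chunk_events(events, [])               # one chunk for everything
by_tag_time = chunk_events(events, ["tag", "time"])  # per tag and hour
by_user     = chunk_events(events, ["user"])         # per record field
p by_tag_time.keys
```

With no keys every event lands in one chunk; combining keys simply makes the grouping key a tuple.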
32.
<buffer CHUNK_KEYS> compared with v0.12
• Nothing specified: BufferedOutput in v0.12
• tag: ObjectBufferedOutput in v0.12
• time: TimeSlicedOutput in v0.12
33.
Configurations: flushing buffers
• flush_mode: lazy, interval, immediate
• default: lazy if "time" specified, otherwise interval
• flush_interval, flush_thread_count
• flush_thread_count: number of threads for flushing
• delayed_commit_timeout
• the output plugin will retry #try_write when this timeout expires without a commit
34.
Retries, Secondary
• Explicit timeout for retries:
• retry_timeout: total time after which the output stops retrying
• retry_max_times: how many times to retry
• retry_type: "periodic" w/ fixed retry_wait
• retry_secondary_threshold (percentage)
• output will use secondary once the specified percentage of retry_timeout has elapsed after the first error
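The secondary threshold is simple arithmetic on the time since the first error. The helpers `use_secondary?` and `give_up?` are hypothetical, and the numbers in the example (a 3600-second retry_timeout and an 80 percent threshold) are illustrative values, not stated defaults.

```ruby
# Toy calculation (hypothetical helpers). The threshold is treated as a
# percentage of retry_timeout, as described above.
def use_secondary?(first_error_at:, now:, retry_timeout:, threshold_percent:)
  elapsed = now - first_error_at
  elapsed >= retry_timeout * threshold_percent / 100.0
end

def give_up?(first_error_at:, now:, retry_timeout:)
  now - first_error_at >= retry_timeout
end

# Illustrative values: 3600 s timeout, 80 % threshold => switch at 2880 s.
params = { first_error_at: 0, retry_timeout: 3600 }
p use_secondary?(now: 2879, threshold_percent: 80, **params)
p use_secondary?(now: 2880, threshold_percent: 80, **params)
p give_up?(now: 3600, **params)
```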
35.
Buffer parameters
• chunk_limit_size
• maximum byte size per chunk
• chunk_records_limit (default: not specified)
• maximum number of records per chunk
• total_limit_size
• maximum byte size which a buffer plugin can use
• (optional) queue_length_limit: no need to specify
36.
Chunk metadata
• Stores various information about buffer chunks
• key-values of chunking unit
• number of records
• created_at, modified_at
• `chunk.metadata`
• extract_placeholders(@path, chunk.metadata)
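What placeholder extraction does with chunk metadata can be approximated in plain Ruby. This is a loose model, not the real extract_placeholders: `ToyMetadata`, `toy_extract_placeholders`, the `${...}` placeholder syntax and the strftime handling in UTC are all assumptions made for this sketch.

```ruby
# Toy metadata and placeholder expansion (hypothetical; loosely modelled on
# the idea of extract_placeholders, not its real implementation).
ToyMetadata = Struct.new(:timekey, :tag, :variables)

def toy_extract_placeholders(template, metadata)
  # strftime-style parts are filled from the chunk's timekey (UTC here)
  result = metadata.timekey ? Time.at(metadata.timekey).utc.strftime(template) : template.dup
  result = result.gsub("${tag}") { metadata.tag.to_s }
  (metadata.variables || {}).each do |key, value|
    result = result.gsub("${#{key}}") { value.to_s }
  end
  result
end

# 1451606400 is 2016-01-01 00:00:00 UTC.
meta = ToyMetadata.new(1_451_606_400, "app.web", { "user" => "alice" })
path = toy_extract_placeholders("/logs/${tag}/${user}/%Y%m%d.log", meta)
puts path
```

Because the values come from the chunk's own keys, every chunk resolves to exactly one destination path.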
40.
"Owned" plugins
• Primary plugins: Input, Output, Filter
• Instantiated by Fluentd core
• "Owned" plugins are owned by primary plugins
• Buffer, Parser, Formatter, Storage, ...
• They can refer to the owner's plugin id, logger, ...
• Fluent::Plugin.new_xxx("kind", parent:@input)
• "Owned" plugins can be configured by owner plugins
41.
Owner plugins can control defaults of owned plugins
Fluentd provides a standard way to configure owned plugins
42.
Tour of New Plugin APIs:
Fluent::Plugin::Storage
43.
Storage plugins
• Pluggable Key-Value store for plugins
• configurable: autosave, persistent, save_at_shutdown
• get, fetch, put, delete, update (transactional)
• Various possible implementations
• built-in: local (json) on-disk / on-memory
• possible: Redis, Consul, or anything that can serialize/deserialize JSON-like objects
• To store states of plugins:
• counter values of data-counter plugin
• pos data of file plugin
• To load configuration dynamically for plugins:
• load configurations from any file systems
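The storage operations listed above map naturally onto a small key-value class. `ToyLocalStorage` is a hypothetical stand-in for the idea of a local JSON storage, not Fluentd's implementation; the file path, the mutex-based `update` and the explicit `save` call are assumptions of this sketch.

```ruby
require "json"
require "tmpdir"

# Toy key-value storage (hypothetical class; a stand-in for the idea of a
# local JSON storage plugin, not Fluentd's implementation).
class ToyLocalStorage
  def initialize(path)
    @path = path
    @store = File.exist?(path) ? JSON.parse(File.read(path)) : {}
    @mutex = Mutex.new
  end

  def get(key)
    @mutex.synchronize { @store[key.to_s] }
  end

  def fetch(key, default)
    @mutex.synchronize { @store.fetch(key.to_s, default) }
  end

  def put(key, value)
    @mutex.synchronize { @store[key.to_s] = value }
  end

  def delete(key)
    @mutex.synchronize { @store.delete(key.to_s) }
  end

  # Read-modify-write under one lock, so concurrent updates do not interleave.
  def update(key)
    @mutex.synchronize { @store[key.to_s] = yield(@store[key.to_s]) }
  end

  # Persist the whole store as JSON.
  def save
    @mutex.synchronize { File.write(@path, JSON.generate(@store)) }
  end
end

path = File.join(Dir.mktmpdir, "storage.json")
storage = ToyLocalStorage.new(path)
storage.put(:count, 1)
storage.update(:count) { |v| v + 1 }
storage.save
reloaded = ToyLocalStorage.new(path)   # state survives a "restart"
p reloaded.get(:count)
```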
45.
Plugin Helpers
• No more mixin!
• declare to use helpers by "helpers :name"
• Utility functions to support difficult things
• creating threads, timers, child processes...
• created timers will be stopped automatically in the plugin's shutdown sequence
• Integrated w/ New Test Drivers
• tests run after helpers have started everything requested
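The idea of declaring a helper and getting automatic cleanup at shutdown can be modelled with a module and a class-level `helpers` method. All names here (`ToyTimerHelper`, `ToyPluginBase`, `TOY_HELPERS`, `ToyIntervalInput`, `timers_alive?`) are hypothetical, and the thread-per-timer design is an assumption of this sketch, not how Fluentd implements its helpers.

```ruby
# Toy model (hypothetical names; not Fluentd's real plugin helper code).
module ToyTimerHelper
  # Run the block every `interval` seconds in a background thread.
  def timer_execute(name, interval, &block)
    @_timers ||= {}
    @_timers[name] = Thread.new do
      loop do
        sleep interval
        block.call
      end
    end
  end

  # Part of the shutdown sequence: the plugin only has to call super.
  def shutdown
    (@_timers || {}).each_value do |thread|
      thread.kill
      thread.join
    end
    super
  end

  def timers_alive?
    (@_timers || {}).values.any?(&:alive?)
  end
end

TOY_HELPERS = { timer: ToyTimerHelper }.freeze

class ToyPluginBase
  # `helpers :timer` mixes the requested helper modules into the plugin.
  def self.helpers(*names)
    names.each { |name| include TOY_HELPERS.fetch(name) }
  end

  def start; end
  def shutdown; end
end

class ToyIntervalInput < ToyPluginBase
  helpers :timer
  attr_reader :ticks

  def start
    super
    @ticks = 0
    timer_execute(:poll, 0.01) { @ticks += 1 }
  end

  def shutdown
    super   # the helper stops the timer created in #start
  end
end

plugin = ToyIntervalInput.new
plugin.start
sleep 0.1
plugin.shutdown
puts plugin.timers_alive?
```

The plugin never touches threads directly, which is also what lets a test driver start and stop everything in a controlled way.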
48.
New Test Drivers
• Replace the old drivers, Fluent::Test::*TestDriver
• Fluent::Test::Driver::Input, Output or Filter
• fully emulate actual plugin behavior
• w/ overridden SystemConfig
• capturing emitted events & error event streams
• inserting TestLogger to capture/test logs of plugins
• capturing "format" result of output plugins
• controlling "flush" timing of output plugins
• Running tests under control
• Plugin Helper integration
• conditions to keep/break running tests
• timeouts, number of emits/events to stop tests
• automatic start/shutdown call for plugins
50.
New Features
• Symmetric multi processing
• to use 2 or more CPU cores!
• by sharing a configuration between all processes
• "detach_process" will be deprecated
• forward: TLS + authentication/authorization support
• secure-forward integration
• Buffers support compression, and forward can send the compressed data
• Plugin generator & template
51.
New APIs
• Controlling global configuration from SystemConfig
• configured via <system> tag
• root buffer path + plugin id: removes the need to set a path in each buffer
• control of total buffer size per process
• Counter APIs
• counting everything over processes via RPC
• creating metrics for a whole fluentd cluster
53.
v1: stable version of v0.14
• v0.12 plugins will still be supported in v1.0.0
• deprecated, and will be obsoleted in v1.x
• Will be obsoleted:
• v0 (traditional) configuration syntax
• "detach_process" feature
• Q4 2016?
54.
To Be Written by me :-)
• As soooooooooon as possible...
• Plugin developers' guide for
• Updating v0.12 plugins with v0.14 APIs
• Writing plugins with v0.14 APIs
• Writing tests of plugins with v0.14 APIs
• Users' guide for
• How to use buffering in general (w/ <buffer>)
• Updated plugin documents