Fluentd v1 and future at techtalk

Fluentd v1 and the future
Feb 15, 2018
Masahiro Nakagawa

Fluentd v0.12
• Old stable version, widely used in production
• Input, Parser, Filter, Formatter, Buffer, Output plugins
• Known issues
• Event time has only second resolution
• No Windows support
• No multi core support
• Need to improve plugin API to support more use cases

Fluentd v0.14
• Development version of v1
• Implemented new features
• New Plugin APIs
• Event Time with Nanosecond resolution
• ServerEngine based Supervisor
• Windows support
• Multicore support
• New Plugin Helpers & Plugin Storage

Fluentd v1
• v1.0 Announcement at CloudNativeCon + KubeCon NA
• Stable announcement for APIs / features
• No breaking API changes in v1.x
• Compatible with v0.12 and v0.14
• except for v0 config syntax and detach_process
• Latest version is v1.1.0: Jan 18, 2018

New Plugin APIs
• Input/Output plugin APIs w/ well-controlled lifecycle
• stop, shutdown, close, terminate
• Integrate all output plugin into Fluent::Plugin::Output
• New Buffer API for delayed commit and flexible chunking with metadata
• parallel/async "commit" operation for chunks
• For high latency case: forward’s at-least-once, issuing job, etc…
• Users can choose chunk keys by configuration for dynamic parameters
• Compatible w/ v0.12 plugins
• compatibility layer for traditional APIs
• it will be supported throughout v1.x versions

v0.12 buffer design
[Diagram] Input / Filter → (emit) → Router → (emit) → Output plugin: Buffer → Queue → Output
• An event consists of Tag, Time and Record
• Buffer holds one Chunk per key (e.g. key:foo, key:bar, key:baz)
• Key pattern:
- BufferedOutput: empty string or specified key
- ObjectBufferedOutput: tag
- TimeSlicedOutput: time slice
• Chunk size is limited by buffer_chunk_limit
• Enqueue: when a chunk exceeds flush_interval or buffer_chunk_limit
• Queue length is limited by buffer_queue_limit

Buffer keys and placeholders
• Dynamic parameters for table name, object path and more
• We can embed time, tag and any field with placeholders
<match s3.**>
  @type s3
  aws_key_id "#{ENV['AWS_ACCESS_KEY']}"
  aws_sec_key "#{ENV['AWS_SECRET_KEY']}"
  s3_bucket fluent-plugin-s3
  path test/%Y/%m/${tag}/${key}/
  <buffer time,tag,key>
    timekey 3600
  </buffer>
</match>
http://docs.fluentd.org/v1.0/articles/buffer-section
- Event sample
time: 2018-02-15 12:00:00 +0700
tag: "test"
record: {"key":"hello"}
- Generated "path"
test/2018/02/test/hello/
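The chunk-key and placeholder idea can be sketched in plain Ruby. This is only an illustration under stated assumptions, not Fluentd's implementation: `chunk_key_for` and `expand_path` are hypothetical helper names, and the UTC offset is fixed to +07:00 to match the sample event.

```ruby
require "time"

TIMEKEY = 3600 # seconds, as in `timekey 3600`

# Hypothetical helper: build the chunk key for <buffer time,tag,key>.
# Events that share the same key would land in the same chunk.
def chunk_key_for(time, tag, record)
  timekey = time.to_i - (time.to_i % TIMEKEY) # floor to the timekey window
  [timekey, tag, record["key"]]
end

# Hypothetical helper: expand strftime and ${...} placeholders in a path.
def expand_path(template, chunk_key, utc_offset: "+07:00")
  timekey, tag, key = chunk_key
  # strftime only touches %-sequences; %m is zero-padded ("02")
  path = Time.at(timekey).localtime(utc_offset).strftime(template)
  path.gsub('${tag}', tag).gsub('${key}', key)
end

event_time = Time.parse("2018-02-15 12:00:00 +0700")
chunk_key = chunk_key_for(event_time, "test", { "key" => "hello" })
puts expand_path('test/%Y/%m/${tag}/${key}/', chunk_key)
# prints: test/2018/02/test/hello/
```

Events arriving within the same hour with the same tag and the same "key" field produce the same chunk key, so they are written under one path.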

Time with nanosecond
• For sub-second systems: Elasticsearch, InfluxData, etc…
• Fluent::EventTime
• behaves as Integer for v0.12’s second unit compatibility
• has methods to get sub-second resolution
• is serialized into msgpack using the Ext type
• Fluent::Engine.now now returns EventTime, not Integer
• Fluentd core can handle both Integer and EventTime as time
• compatible with older versions and software in the ecosystem (e.g., fluent-logger, Docker logging driver)
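The "seconds plus nanoseconds, still usable as an Integer" idea can be sketched as follows. `SketchEventTime` is a hypothetical class written for this illustration, not the source of Fluent::EventTime, and the 8-byte payload (two big-endian 32-bit integers) is an assumption based on the forward protocol's EventTime extension format.

```ruby
# Hypothetical stand-in for the idea behind Fluent::EventTime.
class SketchEventTime
  attr_reader :sec, :nsec

  def initialize(sec, nsec = 0)
    @sec = sec
    @nsec = nsec
  end

  def self.from_time(time)
    new(time.to_i, time.nsec)
  end

  # Second-unit view for code that still expects an Integer.
  def to_i
    @sec
  end
  alias to_int to_i

  # Sub-second view.
  def to_f
    @sec + @nsec / 1_000_000_000.0
  end

  # Assumed Ext payload layout: seconds and nanoseconds as 32-bit big-endian.
  def to_payload
    [@sec, @nsec].pack("NN")
  end

  def self.from_payload(data)
    new(*data.unpack("NN"))
  end
end

t = SketchEventTime.new(1_518_670_800, 123_456_789)
restored = SketchEventTime.from_payload(t.to_payload)
puts t.to_i          # second resolution, as v0.12-style code expects
puts restored.nsec   # nanoseconds survive the round trip
```

Code that only calls `to_i` keeps working with second resolution, while newer code can read `nsec` when it needs more precision.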

ServerEngine based Supervisor
• ServerEngine is a framework for building robust servers
• https://github.com/treasure-data/serverengine
• Replacing supervisor process with ServerEngine
• it has SocketManager to share listening sockets between 2 or more worker processes
• Replacing Fluentd's processing model from fork to spawn
• to support Windows environment
• Log rotation support

Windows support
• Fluentd and core plugins work on Windows
• Windows service registration is also supported
• http://docs.fluentd.org/v1.0/articles/install-by-msi
• Use HTTP RPC instead of signals
• https://github.com/fluent/fluent-plugin-windows-eventlog
• We can collect Windows Event Log records :)

Symmetric multi core processing
• 2 or more workers share a configuration file
• and share listening sockets via PluginHelper
• under a supervisor process (ServerEngine)
• Multi core scalability for huge traffic
• one input plugin for a TCP port, some filters and one (or some) output plugin
• buffer paths are managed by Fluentd core. Need root_dir and @id parameters

v1’s multi process feature
[Diagram] A Supervisor process runs Worker0, Worker1 and Worker2. Each worker runs the same pipeline (forward → grep → tdlog), and the workers share one listening socket.
Configuration example
<system>
  workers 2
  root_dir /var/log/fluentd
</system>
<source>
  @type forward
</source>
<filter pattern>
  @type grep
</filter>
<match pattern>
  @type tdlog
  @id out_td
</match>
- buf_file’s path is automatically generated:
/var/log/fluentd/worker0/out_td/buffer/buffer.xxx.log
/var/log/fluentd/worker0/out_td/buffer/buffer.xxx.log.meta
- /var/log/fluentd: root_dir
- worker0: worker id
- out_td: plugin’s @id
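How the three parts combine into a buffer directory can be sketched in a few lines of Ruby. `buffer_dir` is a hypothetical helper that only mirrors the layout shown in this example; it is not how Fluentd core builds the path internally.

```ruby
# Hypothetical helper mirroring the layout in the example:
# <root_dir>/worker<N>/<plugin @id>/buffer
def buffer_dir(root_dir, worker_id, plugin_id)
  File.join(root_dir, "worker#{worker_id}", plugin_id, "buffer")
end

# With `workers 2`, each worker gets its own directory.
2.times do |worker_id|
  puts buffer_dir("/var/log/fluentd", worker_id, "out_td")
end
```

Because the worker id and the plugin's @id are both part of the path, two workers (or two output plugins) never write to the same buffer files, which is why root_dir and @id are needed.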

<worker N> directive
• To execute plugins under one process
• Good for plugins without multi-process support, like in_tail
<system>
  workers 4
</system>
<worker 0>
  <source>
    @type tail
  </source>
  <match pattern>
    @type s3
  </match>
</worker>
<source>
  @type forward
</source>
<match pattern>
  @type mongo
</match>
- in_tail/out_s3 work under worker 0
- in_forward/out_mongo work under the multiprocess environment with worker 1, worker 2, and worker 3

TLS/Authn/Authz support for forward plugin
• Support v1 forward protocol spec
• secure-forward is merged into built-in forward
• TLS w/ at-least-once semantics
• Simple authentication/authorization w/o SSL
• Different points
• secure-forward uses keep-alive, but forward doesn’t
• secure-forward uses thread per connection, but forward uses cool.io, libev based IO.
http://www.fluentd.org/blog/fluentd-v0.14.12-has-been-released

Plugin Storage & Helpers
• Plugin Storage: new plugin type for plugins
• provides key-value storage to persist intermediate state
• built-in plugins: in-memory, local file
• pluggable: 3rd party plugins can store data in other storage
• storage-redis, storage-memcached
• Plugin Helpers:
• collections of utility methods for plugins
• fully integrated with test drivers to run test code after the setup phase of helpers (e.g. tests start after threads are created)
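The Plugin Storage idea (a small key-value store that a plugin uses to persist intermediate state) can be sketched as below. `SketchFileStorage` and its `get` / `put` / `save` methods are hypothetical names chosen for this illustration; the sketch does not use or reproduce Fluentd's storage plugin API.

```ruby
require "json"
require "tmpdir"

# Hypothetical local-file key-value store; illustrates the idea only.
class SketchFileStorage
  def initialize(path)
    @path = path
    # Load previously saved state if the file already exists.
    @data = File.exist?(path) ? JSON.parse(File.read(path)) : {}
  end

  def get(key)
    @data[key.to_s]
  end

  def put(key, value)
    @data[key.to_s] = value
  end

  # Write to a temporary file and rename it, so a reader does not
  # observe a half-written file.
  def save
    tmp = "#{@path}.tmp"
    File.write(tmp, JSON.generate(@data))
    File.rename(tmp, @path)
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "storage.json")
  store = SketchFileStorage.new(path)
  store.put(:position, 1024)
  store.save
  # A new instance (e.g. after a restart) sees the saved value.
  puts SketchFileStorage.new(path).get(:position)
end
```

Swapping the file backend for another one (the slide mentions storage-redis and storage-memcached) would keep the same get / put shape while changing where the data lives.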

server helper: before
def start
  @loop = Coolio::Loop.new
  @handler = Coolio::TCPServer.new(@bind, @port, SocketUtil::TcpHandler, log,
                                   @delimiter, method(:on_message))
  @loop.attach(@handler)
  @thread = Thread.new(&method(:run))
end
def shutdown
  @loop.watchers.each { |w| w.detach }
  @loop.stop
  @handler.close
  @thread.join
end
def run
  @loop.run
rescue => e
  log.error "unexpected error", error: e
  log.error_backtrace
end
def on_message(msg, addr)
  # body
end
server helper: after
def start
  server_create(:foo_server, @port, bind: @bind) { |data, conn|
    # body
  }
end
https://docs.fluentd.org/v1.0/articles/api-plugin-helper-server

record_accessor helper
• access / delete support for nested fields
• e.g. parser’s key_name parameter uses this helper
• Provides two syntaxes for configuration
• $.field1.field2 == record["field1"]["field2"]
• $["field1"]["field2"] == record["field1"]["field2"]
ra = record_accessor_create("$.user.name")
ra.call(record) # access record["user"]["name"]
ra.delete(record) # delete record["user"]["name"]
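What the dot syntax does can be sketched without Fluentd. `SketchAccessor` is a hypothetical class for illustration only; it handles just the `$.a.b` form and is not the record_accessor helper itself.

```ruby
# Hypothetical sketch of a dot-notation accessor for nested records.
class SketchAccessor
  def initialize(param)
    raise ArgumentError, "expected $.a.b syntax" unless param.start_with?("$.")
    @keys = param[2..-1].split(".") # "$.user.name" -> ["user", "name"]
  end

  # Return the nested value, or nil if any level is missing.
  def call(record)
    record.dig(*@keys)
  end

  # Remove the nested field and return its value (nil if absent).
  def delete(record)
    parent = @keys.size > 1 ? record.dig(*@keys[0..-2]) : record
    parent.is_a?(Hash) ? parent.delete(@keys.last) : nil
  end
end

record = { "user" => { "name" => "alice", "id" => 1 } }
ra = SketchAccessor.new("$.user.name")
puts ra.call(record)   # prints: alice
ra.delete(record)
puts record.inspect    # the "name" field is gone, "id" remains
```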

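v0.12 plugins
[Diagram] Plugin types: Input, Parser, Filter, Buffer, Output, Formatter, grouped into “input-ish” and “output-ish”

v1 plugins
[Diagram] Plugin types: Input, Parser, Filter, Buffer, Output, Formatter, grouped into “input-ish” and “output-ish”, plus the new Storage and Helper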

Other helpers
• Timer: one-shot / periodic timer
• Event Loop: Low-layer event loop
• Socket: TCP/UDP/TLS support
• Formatter/Parser: Manage parser/formatter plugins
• Child Process: Manage processes for exec-like plugins
• etc…
https://docs.fluentd.org/v1.0/categories/plugin-helpers

v1.2.0
• Counter API: store metrics between processes
• Need for limit calculation in multi processes
• https://github.com/fluent/fluentd/pull/1857
• Backup feature for problematic chunks
• Improve retry mechanism for bad records
• https://github.com/fluent/fluentd/issues/1856

Focus
• Easy to use
• Stability
• Performance
• Flexibility
• Avoid fat core

v2!
• No plan…
• Remove v0.12 or earlier features

Treasure Agent 3 (td-agent 3)
• fluentd v1, Ruby 2.4, systemd support and latest components
• Latest version is 3.1.1: Dec 20, 2017
• 3.2.0 will be released in March
• Environments
• Add msi Windows package, Amazon Linux 2
• Remove CentOS 5, Ubuntu 10.04 support

Containers
• Docker
• Alpine and Debian for v0.12 and v1.x
• https://github.com/fluent/fluentd-docker-image
• Kubernetes DaemonSet
• Alpine and Debian for v0.12
• Debian for v1.x (WIP)
• https://github.com/fluent/fluentd-kubernetes-daemonset
• Need other container support?

Integrations
• Kafka
• kafka-connect-fluentd for high performance ingestion
• Prometheus
• fluent-plugin-prometheus for push / pull with Prometheus
• Integrate internal metrics with monitor_agent
• gRPC?
• Distributed tracing?

Benchmark set (WIP)
• Check configuration and performance
• Current fluentd-benchmark is not enough
• Automated test
• Various combo: ruby, fluentd, plugins
• Collect metrics: CPU, Memory, etc…
• Running on: Docker, AWS, etc…

fluent-bit
• Lightweight agent written in C
• Runs in many environments, including embedded systems with limited resources
• Pluggable architecture: Input / Parser / Filter / Buffer / Output
• fluent-bit is useful as a forwarder in front of fluentd in distributed logging
http://fluentbit.io/

Community
• Plugins / Libraries
• Thanks for maintaining the project
• Users
• Experts help new users
• Documentation
• Need feedback!
