May 13, 2014
www.treasuredata.com/
Fluentd v1 and Roadmap
Masahiro Nakagawa	

Treasure Data, Inc
1
Who are you?
• Masahiro Nakagawa	

• @repeatedly	

• Treasure Data, Inc.	

• Senior Software Engineer	

• Fluentd, td-agent, etc...	

• Dlang, MessagePack, ...
2
Structured logging
Reliable forwarding
Pluggable architecture
http://fluentd.org/
M x N → M + N
4
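[Diagram: log sources (Access logs, App logs, System logs, Databases; examples shown: Apache, Frontend, Backend, syslogd, MySQL) pass through Fluentd (buffer / buffer / routing) to destinations (Alerting, Analysis, Archiving; examples shown: Nagios, MongoDB, Hadoop, Amazon S3)]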
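The "M x N → M + N" claim can be checked with a little arithmetic. The sketch below is illustrative only: the method names are hypothetical, and the source and destination lists reuse the category labels from the diagram.

```ruby
# Every source wired directly to every destination: M x N integrations.
def direct_integrations(sources, destinations)
  sources.product(destinations).size
end

# Every source and destination wired once to a central hub: M + N integrations.
def hub_integrations(sources, destinations)
  sources.size + destinations.size
end

sources = ["access logs", "app logs", "system logs", "databases"]
destinations = ["alerting", "analysis", "archiving"]

puts "direct: #{direct_integrations(sources, destinations)}"  # direct: 12
puts "via hub: #{hub_integrations(sources, destinations)}"    # via hub: 7
```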
v10 (gem v0.10.x)
• Mainly for log forwarding	

• with good performance	

• working in production (at Nintendo, etc.)	

• Various plugins are released	

• There are 250+ plugins!	

• Mainly for CRuby
5
v11 is gone…
• Started two years ago	

• Required dramatic API changes…	

• What does ‘v11’ mean?	

• Where is ‘v10’?	

• See the following article
6
http://repeatedly.github.io/ja/2014/03/about-fluentd-v11/
v1 is coming!
• Merge useful v11 features	

• No breaking API changes

• Will go over features in later slides	

• Clear versioning and stability	

• https://github.com/fluent/fluentd/issues/251
7
Must-have features
8
New configuration
• Can use Hash, Array, and other types
• No need for “,” or similar tricks
• You can write Ruby code directly
• Enabled with the “--use-v1-config” option
• Available since v0.10.46
• Worker pragma (nice to have)
9
New parameter types
• Can write complex values without DSL!
• Can use Ruby code for configuration
10
Hash, Array, etc.:
<source>
  type my_tail
  keys ["k1", "k2", "k3"]
</source>

<match **>
  type my_filter
  add_keys {"k1" : "v1"}
</match>

Embedded Ruby code:
<match **>
  type my_filter
  env "#{ENV['KEY']}"
</match>
• Socket.gethostname
• `command`
• etc...
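A minimal sketch of how such values could be interpreted, assuming Array and Hash literals are JSON-compatible and double-quoted strings are evaluated as Ruby. The helper name parse_param is hypothetical; this is not Fluentd's actual parser.

```ruby
require "json"

# Hypothetical helper: interpret one raw configuration value.
def parse_param(raw)
  text = raw.strip
  case text
  when /\A[\[{]/
    # Array or Hash literal, assumed to be valid JSON.
    JSON.parse(text)
  when /\A".*"\z/m
    # Double-quoted string: evaluate it so that #{...} is expanded.
    # eval is for illustration only; never use it on untrusted input.
    eval(text)
  else
    # Anything else stays a plain string.
    text
  end
end

ENV["KEY"] = "secret"
p parse_param('["k1", "k2", "k3"]')    # ["k1", "k2", "k3"]
p parse_param('{"k1" : "v1"}')         # a Hash with key "k1" and value "v1"
p parse_param('"#{ENV[\'KEY\']}"')     # "secret"
```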
Filter / Label support
• No more tag-related tricks!	

• add_tag_xxx, remove_tag_xxx, etc...	

!
• Redirect events to another group	

• Much easier to group and share plugins
11
Filter
• <match> can have nested <match>
• The configuration format is not finalized yet!
12
v10:
<match access.**>
  type flowcounter
  add_tag_prefix counted
</match>

<match counted.**>
  type growthforecast
</match>

v1:
<match access.** copy>
  type flowcounter

  <match **>
    type growthforecast
  </match>
</match>
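Both versions rely on match patterns such as access.** to select events by tag. The sketch below shows one way to evaluate such a pattern, assuming the documented rule that * matches exactly one tag part and ** matches zero or more. The method names are hypothetical, and this is not Fluentd's own matcher.

```ruby
# Compare pattern parts against tag parts, one part at a time.
def parts_match?(pattern_parts, tag_parts)
  return tag_parts.empty? if pattern_parts.empty?

  head, *rest = pattern_parts
  case head
  when "**"
    # Try consuming 0, 1, 2, ... tag parts.
    (0..tag_parts.size).any? { |n| parts_match?(rest, tag_parts.drop(n)) }
  when "*"
    # Consume exactly one tag part.
    !tag_parts.empty? && parts_match?(rest, tag_parts.drop(1))
  else
    # Literal part: must be equal.
    tag_parts.first == head && parts_match?(rest, tag_parts.drop(1))
  end
end

def tag_match?(pattern, tag)
  parts_match?(pattern.split("."), tag.split("."))
end

p tag_match?("access.**", "access.web.prod")   # true
p tag_match?("counted.**", "access.web.prod")  # false
```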
Label
• <label> can contain multiple <match>
• out_redirect can forward events to <label>
13
<match access.**>
  type rewrite_tag_filter
  ...
  <match bang.**>
    type redirect
    to_label blackhole
  </match>
  ...
</match>

<label blackhole>
  <match **>
    type null
  </match>
</label>

bang’s records go away!
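The redirect-to-label idea can be modeled in a few lines. This is a toy sketch under simplifying assumptions (prefix matching only, first match wins); MiniRouter and its methods are hypothetical and do not reflect Fluentd's internals.

```ruby
# Toy model of label-based routing.
class MiniRouter
  def initialize
    @labels = Hash.new { |hash, name| hash[name] = [] }
  end

  # Register a handler inside a label (like a <match> inside a <label>).
  def add_match(label, prefix, &handler)
    @labels[label] << [prefix, handler]
  end

  # Deliver an event to the first handler in the label whose prefix fits.
  def emit(label, tag, record)
    _, handler = @labels[label].find { |prefix, _| fits?(prefix, tag) }
    handler&.call(tag, record)
  end

  private

  def fits?(prefix, tag)
    prefix == "**" || tag == prefix || tag.start_with?("#{prefix}.")
  end
end

stored = []
router = MiniRouter.new
# Like "type redirect": hand bang.* events over to the blackhole label.
router.add_match(:main, "bang") { |tag, record| router.emit(:blackhole, tag, record) }
router.add_match(:main, "**") { |tag, record| stored << [tag, record] }
# Like "type null": accept the event and drop it.
router.add_match(:blackhole, "**") { |_tag, _record| nil }

router.emit(:main, "access.web", { "path" => "/" })
router.emit(:main, "bang.web", { "path" => "/spam" })
p stored.map(&:first)  # ["access.web"]
```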
Improved plugin
• Error stream	

• with @ERROR label	

• No more global API, Engine.emit, $log, etc...	

• Log level per plugin 	

• available since v0.10.43	

• Actor (nice to have)
14
Error stream
• Can handle errors at the level of individual records
15
[Diagram: Input emits chunk1 ({"event":1, ...}, {"event":2, ...}, {"event":3, ...}), chunk2 ({"event":4, ...}, {"event":5, ...}, {"event":6, ...}), …; records marked OK reach Output, while a record marked ERROR! is sent to the error stream]
<label @ERROR>
  <match **>
    type file
    ...
  </match>
</label>
The built-in @ERROR label is used when an error occurs in “emit”
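Record-level error handling can be sketched as follows. The method process_chunk and the shape of the error entries are assumptions made for illustration, not Fluentd's API.

```ruby
# Process each record of a chunk; divert only the failing records.
def process_chunk(records, error_stream)
  records.each_with_object([]) do |record, output|
    begin
      output << yield(record)
    rescue StandardError => e
      # The rest of the chunk continues after a failure.
      error_stream << { "record" => record, "error" => e.message }
    end
  end
end

errors = []
chunk = [{ "event" => 1 }, { "event" => "two" }, { "event" => 3 }]
ok = process_chunk(chunk, errors) { |r| { "event" => Integer(r["event"]) * 10 } }

p ok.size      # 2
p errors.size  # 1
```

Here Integer("two") raises, so only the second record ends up in the error stream.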
Nice to have features
16
ServerEngine based
• Robust signal handling	

• Put a signal into Queue first	

• Built-in supervisor	

• Multiprocess support	

• No need for in_multiprocess plugin
17
Multi Process
18
Separate stream pipelines in one instance!
[Diagram: one Supervisor managing three Workers]
<worker>
  input tail
  output mongo
</worker>
<worker>
  input forward
  output webhdfs
</worker>
<worker>
  input foo
  output bar
</worker>
Zero downtime restart
• SocketManager shares resources with workers
19
Supervisor
TCP
1. Listen to TCP socket
Zero downtime restart
• SocketManager shares resources with workers
20
Worker
Supervisor
heartbeat
TCP
TCP
1. Listen to TCP socket	

2. Pass its socket to worker
Zero downtime restart
• SocketManager shares resources with workers
21
Worker
Supervisor
Worker
TCP
TCP
1. Listen to TCP socket
2. Pass its socket to worker
3. Do the same when a worker restarts, while keeping the TCP socket open
heartbeat
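The three steps can be sketched with plain sockets. In this simplified model, threads stand in for worker processes and the method names are hypothetical; it only illustrates that a client connecting during a worker restart is still served, because the listening socket stays open.

```ruby
require "socket"

# Worker side: accept one connection on the shared socket, reply, and exit.
def start_worker(name, listener)
  Thread.new do
    client = listener.accept
    client.puts("handled by #{name}")
    client.close
  end
end

# Client side: connect, read one line, disconnect.
def request(port)
  socket = TCPSocket.new("127.0.0.1", port)
  reply = socket.gets.chomp
  socket.close
  reply
end

def restart_demo
  # 1. The supervisor listens once and keeps the socket for its lifetime.
  listener = TCPServer.new("127.0.0.1", 0) # port 0 = any free port
  port = listener.addr[1]

  # 2. The socket is passed to a worker.
  worker = start_worker("worker-1", listener)
  first = request(port)
  worker.join # worker-1 goes away ("restart")

  # 3. A client connecting now waits in the backlog of the open socket
  #    until the replacement worker accepts it.
  pending = Thread.new { request(port) }
  worker = start_worker("worker-2", listener)
  second = pending.value
  worker.join
  listener.close
  [first, second]
end

puts restart_demo
```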
Actor
• Easy to write popular routines	

• Hide the implementation details

22
v10:
class TimerWatcher < Coolio::TimerWatcher
  ...
end

def start
  @loop = Coolio::Loop.new
  @timer = ...
  @loop.attach(@timer)
  @thread = ...
end

v1:
actor.every(@interval) {
  event_router.emit(...)
}
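To illustrate what a helper like actor.every could hide, here is a thread-based stand-in. MiniActor is hypothetical; the proposed Actor may be implemented quite differently (for example on top of cool.io).

```ruby
# Hypothetical stand-in for an Actor-style timer helper.
class MiniActor
  def initialize
    @threads = []
    @running = true
  end

  # Run the block every `interval` seconds on a background thread.
  def every(interval, &block)
    @threads << Thread.new do
      while @running
        sleep(interval)
        block.call if @running
      end
    end
  end

  # Stop all timers and wait for their threads to finish.
  def stop
    @running = false
    @threads.each(&:join)
  end
end

ticks = Queue.new
actor = MiniActor.new
actor.every(0.01) { ticks << Time.now }
3.times { ticks.pop } # block until three timer callbacks have fired
actor.stop
puts "timer fired at least 3 times"
```

The plugin author only supplies the block; loop, timer, and thread management stay inside the helper.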
Others
23
JRuby and Windows
• Windows support	

• Need testers!	

• https://github.com/fluent/fluentd/tree/windows

• JRuby support	

• https://github.com/fluent/fluentd/issues/317
24
td-agent2
25
• Use Ruby 2.1.2
• Update core libraries
• msgpack, cool.io, etc.
• Use v1 config by default
• http://docs.fluentd.org/articles/config-file#v1-format
26
New website…?
Conclusion
27
• Fluentd advances to the next stage!
• v1 will be released
• No breaking API changes
• td-agent2 will be released this month
• New website with good content?
• Need patches or feedback!
