Kiyoto Tamura 
Nov 17, 2014 
RubyConf 2014 
Fluentd 
Unified Logging Layer

Fluentd unified logging layer

RubyConf 2014: Building the Unified Logging Layer with Fluentd and Ruby


  1. 1. Kiyoto Tamura Nov 17, 2014 RubyConf 2014 Fluentd Unified Logging Layer
  2. 2. whoami: Kiyoto Tamura (GitHub/Twitter: kiyoto/kiyototamura), Director of Developer Relations at Treasure Data, Inc., Fluentd maintainer
  3. 3. a ruby n00b
  4. 4. Fluentd n00b too
  5. 5. why me? Busy writing code! Just gave a talk! I’m giving a talk! Busy writing code! Busy as CTO! San Diego’s nice!
  6. 6. What’s Fluentd? An extensible & reliable data collection tool: simple core + plugins; buffering, HA (failover), load balancing, etc.; like syslogd.
  7. 7. data collection tool
  8. 8. Sources: access logs (Apache), app logs (frontend, backend), system logs (syslogd), your system. Destinations: metrics (Blueflood), analysis (MongoDB, MySQL, Hadoop), archiving (Amazon S3). Glue in between: bash scripts, ruby scripts, python scripts, log files, cron, rsync, custom loggers, other custom scripts... ✓ duplicated code for error handling... ✓ messy code for retry mechanisms...
  9. 9. (this is painful!!!)
  10. 10. Sources: access logs (Apache), app logs (frontend, backend), system logs (syslogd), your system. Destinations: metrics (Blueflood), analysis (MongoDB, MySQL, Hadoop), archiving (Amazon S3). In between: filter / buffer / route.
  11. 11. extensible
  12. 12. Core: Divide & Conquer, Buffering & Retries, Error Handling, Message Routing, Parallelism. Plugins: Read Data, Parse Data, Buffer Data, Write Data, Format Data.
  13. 13. Core (Common Concerns): Divide & Conquer, Buffering & Retries, Error Handling, Message Routing, Parallelism. Plugins (Use Case Specific): Read Data, Parse Data, Buffer Data, Write Data, Format Data.
  14. 14. reliable
  15. 15. reliable data transfer
  16. 16. Divide & Conquer & Retry: on error, retry; on another error, retry again.
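A minimal sketch of the divide, conquer and retry idea from this slide, assuming exponential backoff between attempts. `flush_with_retry` and its parameters are hypothetical names chosen for illustration; they are not part of Fluentd.

```ruby
# Hypothetical helper: write one chunk, retrying failures with exponential
# backoff. Returns the number of attempts that were needed.
def flush_with_retry(chunk, max_retries: 3, base_wait: 0.01)
  attempts = 0
  begin
    attempts += 1
    yield chunk                              # caller-supplied write to the destination
    attempts
  rescue StandardError
    raise if attempts > max_retries          # give up once the retry limit is exceeded
    sleep(base_wait * (2**(attempts - 1)))   # wait 1x, 2x, 4x ... base_wait
    retry
  end
end

# Divide: split a stream of records into small chunks.
records = (1..10).to_a
chunks = records.each_slice(4).to_a

# Conquer & retry: a flaky destination that fails the first attempt per chunk.
seen = Hash.new(0)
delivered = []
attempt_counts = chunks.map do |chunk|
  flush_with_retry(chunk) do |c|
    seen[c] += 1
    raise IOError, "temporary failure" if seen[c] == 1
    delivered.concat(c)
  end
end
```

Because each chunk is retried on its own, a failure only delays the chunk that hit it rather than the whole stream.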
  17. 17. reliable process
  18. 18. This?
  19. 19. Or this?
  20. 20. M x N → M + N. Sources: access logs (Apache), app logs (frontend, backend), system logs (syslogd), databases. Destinations: alerting (Nagios), analysis (MongoDB, MySQL, Hadoop), archiving (Amazon S3). In between: buffer / filter / route.
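The M x N → M + N claim can be checked with a little arithmetic. The sketch below assumes the five sources and five destinations named on the slide and counts integrations both ways; the variable names are illustrative only.

```ruby
# Sources and destinations as listed on the slide.
sources      = %w[apache frontend backend syslogd databases]  # M = 5
destinations = %w[nagios mongodb mysql hadoop s3]             # N = 5

# Without a shared layer, every source needs its own link to every destination.
point_to_point = sources.product(destinations).size  # M x N

# With a unified logging layer, each system connects once to the layer.
via_layer = sources.size + destinations.size          # M + N

puts "point-to-point: #{point_to_point}, via layer: #{via_layer}"
```

With these lists that is 25 integrations versus 10, and the gap widens as either side grows.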
  21. 21. use cases
  22. 22. Simple Forwarding
  23. 23.
      # logs from a file
      <source>
        type tail
        path /var/log/httpd.log
        format apache2
        tag backend.apache
      </source>

      # logs from client libraries
      <source>
        type forward
        port 24224
      </source>

      # store logs to MongoDB
      <match backend.*>
        type mongo
        database fluent
        collection test
      </match>
  24. 24. Less Simple Forwarding
  25. 25. Lambda Architecture
  26. 26.
      # logs from a file
      <source>
        type tail
        path /var/log/httpd.log
        format apache2
        tag backend.apache
      </source>

      # logs from client libraries
      <source>
        type forward
        port 24224
      </source>

      # store logs to ES and HDFS
      <match backend.*>
        type copy
        <store>
          type elasticsearch
          logstash_format true
        </store>
        <store>
          type webhdfs
          host namenode
          port 50070
          path /path/on/hdfs/
        </store>
      </match>
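The `<match backend.*>` directive routes events by tag. The sketch below is a simplified stand-in for that matching, assuming only two wildcard rules: `*` matches exactly one dot-separated tag part and `**` matches zero or more parts. `tag_match?` and `match_parts` are hypothetical helpers, not Fluentd's implementation.

```ruby
# Hypothetical helper: does a dot-separated tag match a pattern?
def tag_match?(pattern, tag)
  match_parts(pattern.split("."), tag.split("."))
end

def match_parts(pat, parts)
  return parts.empty? if pat.empty?   # pattern used up: tag must be used up too
  head, *rest = pat
  case head
  when "**"
    # Try consuming 0, 1, 2, ... tag parts with the double wildcard.
    (0..parts.size).any? { |n| match_parts(rest, parts.drop(n)) }
  when "*"
    # A single wildcard consumes exactly one tag part.
    !parts.empty? && match_parts(rest, parts.drop(1))
  else
    # A literal part must match exactly.
    parts.first == head && match_parts(rest, parts.drop(1))
  end
end

puts tag_match?("backend.*", "backend.apache")        # matched by the directive
puts tag_match?("backend.*", "web.access")            # different first part
puts tag_match?("backend.*", "backend.apache.error")  # one part too many for "*"
puts tag_match?("backend.**", "backend.apache.error") # "**" spans several parts
```

Under these rules an event only reaches the `<match>` block when its tag fits the pattern, so the tag set in a `<source>` and the pattern in `<match>` have to agree.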
  27. 27. CEP for Stream Processing
  28. 28. Container Logging
  29. 29. Fluentd on Kubernetes
  30. 30. architecture
  31. 31. Internal Architecture: Input → Parser → Buffer → Output → Formatter
  32. 32. Internal Architecture: Input → Parser → Buffer → Output → Formatter, grouped as “input-ish” and “output-ish”
  33. 33. Input plugins: HTTP+JSON (in_http), File tail (in_tail), Syslog (in_syslog), ... ✓ Receive logs ✓ Or pull logs from data sources ✓ Non-blocking
  34. 34. Input plugins
      module Fluent
        class NewTailInput < Input
          Plugin.register_input('tail', self)

          def initialize
            super
            @paths = []
            @tails = {}
          end
        end
        # Little more code
      end
  35. 35. Input plugins
      module Fluent
        class NewTailInput < Input
          Plugin.register_input('tail', self)

          def initialize
            super
            @paths = []
            @tails = {}
          end

          config_param :path, :string
          config_param :tag, :string
          config_param :rotate_wait, :time, :default => 5
          config_param :pos_file, :string, :default => nil
          config_param :read_from_head, :bool, :default => false
          config_param :refresh_interval, :time, :default => 60

          attr_reader :paths

          def configure(conf)
            super

            @paths = @path.split(',').map {|path| path.strip }
            if @paths.empty?
              raise ConfigError, "tail: 'path' parameter is required on tail input"
            end

            unless @pos_file
              $log.warn "'pos_file PATH' parameter is not set to a 'tail' source."
              $log.warn "this parameter is highly recommended to save the position to resume tailing."
            end

            configure_parser(conf)
            configure_tag

            @multiline_mode = conf['format'] == 'multiline'
            @receive_handler = if @multiline_mode
                                 method(:parse_multilines)
                               else
                                 method(:parse_singleline)
                               end
          end

          def configure_parser(conf)
            @parser = TextParser.new
            @parser.configure(conf)
          end

          def configure_tag
            if @tag.index('*')
              @tag_prefix, @tag_suffix = @tag.split('*')
              @tag_suffix ||= ''
            else
              @tag_prefix = nil
              @tag_suffix = nil
            end
          end

          def start
            if @pos_file
              @pf_file = File.open(@pos_file, File::RDWR|File::CREAT, DEFAULT_FILE_PERMISSION)
              @pf_file.sync = true
              @pf = PositionFile.parse(@pf_file)
            end

            @loop = Coolio::Loop.new
            refresh_watchers

            @refresh_trigger = TailWatcher::TimerWatcher.new(@refresh_interval, true, log, &method(:refresh_watchers))
            @refresh_trigger.attach(@loop)
            @thread = Thread.new(&method(:run))
          end

          def shutdown
            @refresh_trigger.detach if @refresh_trigger && @refresh_trigger.attached?

            stop_watchers(@tails.keys, true)
            @loop.stop rescue nil # when all watchers are detached, `stop` raises RuntimeError. We can ignore this exception.
            @thread.join
            @pf_file.close if @pf_file
          end

          def expand_paths
            date = Time.now
            paths = []
            @paths.each { |path|
              path = date.strftime(path)
              if path.include?('*')
                paths += Dir.glob(path)
              else
                # When file is not created yet, Dir.glob returns an empty array. So just add when path is static.
                paths << path
              end
            }
            paths
          end

          # in_tail with '*' path doesn't check rotation file equality at refresh phase.
          # So you should not use '*' path when your logs will be rotated by another tool.
          # It will cause log duplication after updated watch files.
          # In such case, you should separate log directory and specify two paths in path parameter.
          # e.g. path /path/to/dir/*,/path/to/rotated_logs/target_file
          def refresh_watchers
            target_paths = expand_paths
            existence_paths = @tails.keys

            unwatched = existence_paths - target_paths
            added = target_paths - existence_paths
      700 lines!
  36. 36. Input plugins
      module Fluent
        class TcpInput < SocketUtil::BaseInput
          Plugin.register_input('tcp', self)

          config_set_default :port, 5170
          # syslog family adds "\n" to each message and this seems to be the only way to split messages in a tcp stream
          config_param :delimiter, :string, :default => "\n"

          def listen(callback)
            log.debug "listening tcp socket on #{@bind}:#{@port}"
            Coolio::TCPServer.new(@bind, @port, SocketUtil::TcpHandler, log, @delimiter, callback)
          end
        end
      end
  37. 37. Input plugins
      class BaseInput < Fluent::Input
        # some code

        def on_message(msg, addr)
          @parser.parse(msg) { |time, record|
            unless time && record
              log.warn "pattern not match: #{msg.inspect}"
              return
            end

            record[@source_host_key] = addr[3] if @source_host_key
            Engine.emit(@tag, time, record)
          }
          # some code
        end
  38. 38. Input plugins
      class BaseInput < Fluent::Input
        # some code

        def on_message(msg, addr)
          @parser.parse(msg) { |time, record|
            unless time && record
              log.warn "pattern not match: #{msg.inspect}"
              return
            end

            record[@source_host_key] = addr[3] if @source_host_key
            Engine.emit(@tag, time, record)
          }
          # some code
        end
  39. 39. Parser plugins: JSON, Regexp, Apache/Nginx/Syslog, CSV/TSV, etc. ✓ Parse into JSON ✓ Common formats out of the box ✓ v0.10.46 and above
  40. 40. Parser plugins
      <source>
        type tcp
        tag tcp.data
        format /^(?<field_1>\d+) (?<field_2>\w+)/
      </source>
  41. 41. Parser plugins
      def call(text)
        m = @regexp.match(text)
        # some code

        time = nil
        record = {}

        m.names.each {|name|
          if value = m[name]
            case name
            when "time"
              time = @mutex.synchronize { @time_parser.parse(value) }
            else
              record[name] = if @type_converters.nil?
                               value
                             else
                               convert_type(name, value)
                             end
            end
          end
        }
        # some code
      end
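The named-capture approach in the parser above can be tried outside Fluentd. This is a minimal standalone sketch: `PATTERN` extends the slide's regexp with an assumed leading `time` field, and `parse_line` is a hypothetical helper that mirrors the loop over `m.names` without type conversion or locking.

```ruby
require "time"

# Assumed log format: an ISO 8601 timestamp, a number, then a word.
PATTERN = /^(?<time>\S+) (?<field_1>\d+) (?<field_2>\w+)/

# Hypothetical helper: returns [unix_time, record], or nil when the line
# does not match the pattern.
def parse_line(text, regexp = PATTERN)
  m = regexp.match(text)
  return nil unless m

  time = nil
  record = {}
  m.names.each do |name|
    value = m[name]
    next unless value
    if name == "time"
      time = Time.iso8601(value).to_i  # the "time" capture becomes the event time
    else
      record[name] = value             # every other capture becomes a record field
    end
  end
  [time, record]
end

time, record = parse_line("2014-11-17T10:00:00Z 42 hello")
puts record.inspect
```

Each capture group name becomes a key in the record, which is why the regexp in the configuration doubles as a schema for the parsed event.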
  42. 42. Buffer plugins: Memory (buf_memory), File (buf_file). ✓ Improve performance ✓ Provide reliability ✓ Provide thread-safety
  43. 43. Buffer plugins ✓ Chunk = adjustable unit of data ✓ Buffer = Queue of chunks. Input → chunk, chunk, chunk → Output
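A toy model of "Buffer = Queue of chunks", assuming a chunk is simply an array capped at a fixed number of records. `SketchBuffer` and `chunk_limit` are hypothetical names; real buffer plugins add thread-safety and persistence that this sketch leaves out.

```ruby
# Hypothetical in-memory buffer: records accumulate in the current chunk,
# full chunks are queued, and flushing hands queued chunks to the output.
class SketchBuffer
  attr_reader :queue

  def initialize(chunk_limit:)
    @chunk_limit = chunk_limit  # the "adjustable unit of data"
    @current = []
    @queue = []
  end

  # Called on the input side for every record.
  def emit(record)
    @current << record
    enqueue if @current.size >= @chunk_limit
  end

  # Move the current chunk to the queue, if it holds anything.
  def enqueue
    return if @current.empty?
    @queue << @current
    @current = []
  end

  # Called on the output side: drain the queue one chunk at a time.
  def flush
    enqueue
    while (chunk = @queue.shift)
      yield chunk
    end
  end
end

buffer = SketchBuffer.new(chunk_limit: 3)
7.times { |i| buffer.emit({ "n" => i }) }

written = []
buffer.flush { |chunk| written << chunk.map { |r| r["n"] } }
puts written.inspect
```

Writing a chunk at a time lets the output batch its work, and a failed write only needs to retry the chunk that failed.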
  44. 44. Output plugins: File (out_file), Amazon S3 (out_s3), MongoDB (out_mongo), ... ✓ Write to external systems ✓ Buffered & Non-buffered ✓ 200+ plugins
  45. 45. Output plugins
      class FileOutput < TimeSlicedOutput
        Plugin.register_output('file', self)
        # some code

        def write(chunk)
          path = generate_path(chunk)
          FileUtils.mkdir_p File.dirname(path)

          case @compress
          when nil
            File.open(path, "a", DEFAULT_FILE_PERMISSION) {|f|
              chunk.write_to(f)
            }
          when :gz
            File.open(path, "a", DEFAULT_FILE_PERMISSION) {|f|
              gz = Zlib::GzipWriter.new(f)
              chunk.write_to(gz)
              gz.close
            }
          end

          return path # for test
        end
        # more code
  46. 46. Formatter plugins: JSON, CSV/TSV, “single value”. ✓ Format output ✓ Only partially supported for now ✓ v0.10.49 and above
  47. 47. Formatter plugins
      class SingleValueFormatter
        include Configurable

        config_param :message_key, :string, :default => 'message'
        config_param :add_newline, :bool, :default => true

        def format(tag, time, record)
          text = record[@message_key].to_s
          text << "\n" if @add_newline
          text
        end
      end
  48. 48. Internal Architecture: Input → Parser → Buffer → Output → Formatter
  49. 49. Adding Filter in v0.12! Input → Parser → Filter → Buffer → Output → Formatter
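To make the new Filter stage concrete, here is a hedged sketch of where it sits: between parsing and buffering, each filter sees an event and either returns a (possibly modified) record or drops it. The lambdas and `apply_filters` are hypothetical and only illustrate the idea; they are not the v0.12 plugin API.

```ruby
# Hypothetical filters: each takes (tag, time, record) and returns a record,
# or nil to drop the event.
drop_debug = ->(_tag, _time, record) { record["level"] == "debug" ? nil : record }
add_host   = ->(_tag, _time, record) { record.merge("host" => "web01") }

# Run a record through the filters in order, stopping as soon as one drops it.
def apply_filters(filters, tag, time, record)
  filters.reduce(record) do |rec, filter|
    break nil if rec.nil?
    filter.call(tag, time, rec)
  end
end

events = [
  { "level" => "info",  "msg" => "ok" },
  { "level" => "debug", "msg" => "noise" }
]

# Only surviving records would continue on to the buffer and output stages.
kept = events.map { |r| apply_filters([drop_debug, add_host], "app.log", 0, r) }.compact
puts kept.inspect
```

Putting this step before the buffer means dropped or rewritten events never take up space in a chunk.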
  50. 50. Roadmap (Nov 2014 to May 2015): v0.12 (filter, label); v0.14 (plugin API, ServerEngine); v1.0!? (we can use help!)
  51. 51. goodies
  52. 52. fluentd-ui
  53. 53. Treasure Agent • Treasure Data distribution of Fluentd • including Ruby, core libraries and QA’ed 3rd party plugins • rpm/deb/dmg • 2.1.2 is released TODAY with fluentd-ui
  54. 54. fluentd-forwarder • Forwarding agent written in Go • mainly for Windows support • less mature than Fluentd • Bundles TCP input/output and TD output • No plugin mechanism
  55. 55. Thank you! kiyoto@treasuredata.com @kiyototamura
