Fluentd - Set Up Once, Collect More


Fluentd meetup @ Rackspace San Francisco
2014-02-19

Published in: Software, Technology


  1. Set Up Once, Collect More. Sadayuki Furuhashi, Founder & Software Architect, Treasure Data, Inc.
  2. Self-introduction > Sadayuki Furuhashi (github/twitter: @frsyuki) > Treasure Data, Inc., Founder & Software Architect > Open source projects: MessagePack (efficient object serializer); Fluentd (data collection tool); ServerEngine (Ruby framework to build multiprocess servers); LS4 (distributed object storage system, suspended); kumofs (distributed key-value data store, suspended)
  3. What’s Fluentd? An extensible & reliable data collection tool
  4. What’s Fluentd? An extensible & reliable data collection tool: simple core + plugins; buffering, HA (failover), load balancing, etc.; like syslogd
  5. [Diagram] Inputs: access logs, app logs, and system logs (Apache, frontend, backend, syslogd, your system). Middle: filter / buffer / routing. Outputs: metrics, analysis, and archiving (Blueflood, MongoDB, MySQL, Hadoop, Amazon S3).
  9. Input Plugins, Output Plugins, Buffer Plugins, (Filter Plugins)
  10. Fluentd configuration:

          # logs from a file
          <source>
            type tail
            path /var/log/httpd.log
            format apache2
            tag web.access
          </source>

          # logs from client libraries
          <source>
            type forward
            port 24224
          </source>

          # store logs to MongoDB and S3
          <match **>
            type copy
            <store>
              type mongo
              host mongo.example.com
              capped
              capped_size 200m
            </store>
            <store>
              type s3
              path archive/
            </store>
          </match>
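The `<match **>` directive in the configuration routes events by tag. As a rough sketch of how such wildcard patterns could be evaluated, assuming `*` stands for exactly one dot-separated tag part and `**` for zero or more parts. The helper names `tag_match?` and `parts_match?` are hypothetical and are not part of Fluentd's API:

```ruby
# Simplified model of tag pattern matching (an assumption, not Fluentd's code).
# pats and parts are arrays of dot-separated components.
def parts_match?(pats, parts)
  return parts.empty? if pats.empty?

  head, *rest = pats
  case head
  when '**'
    # '**' may consume zero or more tag parts
    (0..parts.length).any? { |n| parts_match?(rest, parts.drop(n)) }
  when '*'
    # '*' consumes exactly one tag part
    !parts.empty? && parts_match?(rest, parts.drop(1))
  else
    # a literal component must match the next tag part exactly
    parts.first == head && parts_match?(rest, parts.drop(1))
  end
end

def tag_match?(pattern, tag)
  parts_match?(pattern.split('.'), tag.split('.'))
end

puts tag_match?('**', 'web.access')            # catch-all pattern
puts tag_match?('web.*', 'web.access')         # one part after "web"
puts tag_match?('web.*', 'web.access.error')   # two parts after "web"
```

Under this model, `**` accepts every tag, which is why the single `<match **>` block is enough to send all collected logs to both stores.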
  11. Fluentd in Treasure Data. [Diagram labels] API servers: Rails app, Ruby app, Fluentd. Worker servers: Rails app, Ruby app, Fluentd. Queue: PerfectQueue. Collection paths: fluent-logger-ruby + in_forward; script + in_exec; out_forward to the watch server.
  12. Fluentd in Treasure Data. [Diagram labels] Fluentd on the watch server, ✓ streaming aggregation; out_metricsense to Librato Metrics for realtime analysis; out_tdlog to Treasure Data for historical analysis.
  13. Internal Architecture: Input Plugin -> Buffer Plugin -> Output Plugin. An event has three parts: time (2012-02-04 01:33:51), tag (myapp.buylog), and record ({ "user": "me", "path": "/buyItem", "price": 150, "referer": "/landing" }).
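The time / tag / record triple on this slide can be modelled in a few lines of Ruby. This is only a sketch: `Event` is a hypothetical struct rather than a Fluentd class, and the timestamp is assumed to be UTC because the slide gives no time zone.

```ruby
require 'json'

# Hypothetical stand-in for a Fluentd event: a tag, a timestamp, and a record.
Event = Struct.new(:tag, :time, :record)

event = Event.new(
  'myapp.buylog',
  Time.utc(2012, 2, 4, 1, 33, 51), # assumed UTC
  { 'user' => 'me', 'path' => '/buyItem', 'price' => 150, 'referer' => '/landing' }
)

# Render the event in the order the slide uses: time, tag, then the record as JSON.
line = "#{event.time.strftime('%Y-%m-%d %H:%M:%S')} #{event.tag} #{JSON.generate(event.record)}"
puts line
```

The tag is what routing decisions are made on, while the record is free-form structured data.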
  14. Architecture :: Input plugins ✓ Receive logs ✓ Or pull logs from data sources ✓ in a non-blocking manner. Examples: HTTP+JSON (in_http), File tail (in_tail), Syslog (in_syslog), ...
  15. Architecture :: Output plugins ✓ Write or send event logs. Examples: File (out_file), Amazon S3 (out_s3), MongoDB (out_mongo), ...
  16. Architecture :: Buffer plugins ✓ Improve performance ✓ Provide reliability ✓ Provide thread-safety. Examples: Memory (buf_memory), File (buf_file)
  17. Architecture :: Buffer plugins ✓ Improve performance ✓ Provide reliability ✓ Provide thread-safety. [Diagram] Input -> chunk, chunk, chunk -> Output
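To make the chunk idea concrete, here is a minimal sketch of a buffer that collects formatted events into chunks and hands each full chunk to an output in one call. It is a simplified model under stated assumptions, not Fluentd's buffer API: the class name `ChunkedBuffer`, the `chunk_limit` parameter, and the flush callback are all hypothetical.

```ruby
# Hypothetical, simplified model of a chunked buffer; not Fluentd's buffer API.
class ChunkedBuffer
  def initialize(chunk_limit, &flush)
    @chunk_limit = chunk_limit # maximum bytes per chunk (assumed parameter)
    @flush = flush             # called once per completed chunk
    @chunk = +''
    @mutex = Mutex.new         # lets several input threads append safely
  end

  # Append one formatted event; hand the chunk to the output once it is full.
  def append(data)
    full = nil
    @mutex.synchronize do
      @chunk << data
      if @chunk.bytesize >= @chunk_limit
        full = @chunk
        @chunk = +''
      end
    end
    @flush.call(full) if full
  end

  # Flush whatever is left, for example on shutdown.
  def close
    rest = @mutex.synchronize do
      chunk = @chunk
      @chunk = +''
      chunk
    end
    @flush.call(rest) unless rest.empty?
  end
end

written = []
buffer = ChunkedBuffer.new(16) { |chunk| written << chunk }
%w[aaaa bbbb cccc dddd eeee].each { |rec| buffer.append("#{rec}\n") }
buffer.close
puts written.length # number of writes the output performed
```

Five events reach the output in two writes here, which is the performance point on the slide; the mutex stands in for the thread-safety point. Reliability (retries and on-disk persistence) is not modelled in this sketch.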
  18. [Diagram] Apache writes /var/log/access.log; Fluentd reads it with in_tail and buffers with buf_file in /var/log/fluentd/buffer
  19. [Diagram] Apache writes /var/log/access.log; Fluentd reads it with in_tail and buffers with buf_file in /var/log/fluentd/buffer ✓ retrying automatically, ✓ with exponential wait, ✓ and persistence on a disk.
  20. [Diagram] Apache writes /var/log/access.log; Fluentd reads it with in_tail and buffers with buf_file in /var/log/fluentd/buffer; outputs: Amazon S3, Hadoop ✓ buffering for any outputs, ✓ with exponential wait, ✓ and persistence on a disk.
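The "retrying automatically, with exponential wait" behaviour can be sketched as a small helper. This is an illustration only: `with_retries`, `max_retries`, `initial_wait`, and the injectable `sleeper` are hypothetical names, the doubling factor is an assumption, and the on-disk persistence from the slide is not modelled.

```ruby
# Hypothetical helper: retry a block, doubling the wait after each failure.
def with_retries(max_retries:, initial_wait:, sleeper: ->(seconds) { sleep(seconds) })
  wait = initial_wait
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    raise if attempts > max_retries # give up and re-raise the last error
    sleeper.call(wait)
    wait *= 2                       # exponential wait: 1, 2, 4, ...
    retry
  end
end

# Usage: a destination that is unavailable for the first three attempts.
waits = []
result = with_retries(max_retries: 5, initial_wait: 1, sleeper: ->(s) { waits << s }) do |attempt|
  raise IOError, 'destination unavailable' if attempt < 4

  "written on attempt #{attempt}"
end
puts result
puts waits.inspect
```

Passing a recording `sleeper` keeps the example fast; with the default, the helper would really sleep between attempts.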
  21. [Diagram] Applications, log files, HTTP, etc. feed several Fluentd instances, which forward to further Fluentd instances and monitor them by heartbeat.
  22. [Diagram] Applications, log files, HTTP, etc. feed several Fluentd instances, which forward to further Fluentd instances and monitor them by heartbeat. ✓ load balancing or active-backup
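One way to picture "load balancing or active-backup" is a selector that rotates over the nodes currently believed alive and falls back to standby nodes only when no primary is left. The sketch below is an assumption-laden model, not Fluentd's forwarding code: `Node`, `NodeSelector`, and the `alive` flag (which heartbeats would update in a real deployment) are hypothetical.

```ruby
# Hypothetical node model; `alive` would be maintained by heartbeats in practice.
Node = Struct.new(:name, :standby, :alive)

class NodeSelector
  def initialize(nodes)
    @nodes = nodes
    @counter = 0
  end

  # Round-robin over live primaries; use live standbys only if no primary is alive.
  def pick
    primaries = @nodes.select { |n| n.alive && !n.standby }
    candidates = primaries.empty? ? @nodes.select { |n| n.alive && n.standby } : primaries
    raise 'no nodes available' if candidates.empty?

    node = candidates[@counter % candidates.length]
    @counter += 1
    node
  end
end

a = Node.new('a', false, true)
b = Node.new('b', false, true)
backup = Node.new('backup', true, true)
selector = NodeSelector.new([a, b, backup])

picked = 4.times.map { selector.pick.name } # load balancing across a and b
a.alive = false
b.alive = false
failover = selector.pick.name               # active-backup: standby takes over
puts picked.inspect
puts failover
```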
  23. out_cassandra:

          class CassandraOutput < Fluent::BufferedOutput
            Fluent::Plugin.register_output('cassandra', self)

            require 'cassandra'

            config_param :keyspace, :string
            config_param :columnfamily, :string
            config_param :host, :string, :default => 'localhost'
            config_param :port, :int, :default => 9160

            def start
              super
              @connection = Cassandra.new(@keyspace, "#{@host}:#{@port}")
            end

            # Called per event: serialize the record into the buffer chunk.
            def format(tag, time, record)
              record['tag'] = tag
              record['time'] = time
              record.to_msgpack
            end

            # Called per chunk: insert every buffered record into Cassandra.
            def write(chunk)
              chunk.msgpack_each do |record|
                @connection.insert(@columnfamily, "#{record["tag"]}_#{record["time"]}", record)
              end
            end
          end
  24. Use cases
        - “Complex Event Processing on Ruby, Fluentd and Norikra”, TAGOMORI Satoshi, RubyKaigi 2013: http://www.slideshare.net/tagomoris/rubykaigi-2013-111130
        - “Log analysis system with Hadoop”, NHN Japan Corp., Hadoop Conference Japan 2013: http://www.slideshare.net/tagomoris/log-analysis-with-hadoop-in-livedoor-2013
        - “fluentd at slideshare”, @SylvainKalache, Fluentd meetup: http://www.slideshare.net/sylvainkalache/fluentd-at-slideshare
  25. Use cases
        - “How we use Fluentd in Treasure Data”, Sadayuki Furuhashi, Fluentd meetup at slideshare: http://www.slideshare.net/frsyuki/how-24042353
        - “Using Solr to Search and Analyze Logs”, Radu Gheorghe: http://www.slideshare.net/sematext/solr-for-indexing-and-searching-logs
        - “Free Alternative to Splunk Using Fluentd”: http://docs.fluentd.org/articles/free-alternative-to-splunk-by-fluentd
  26. Expected discussions... > Who is using Fluentd? > What are the differences compared to XYZ? > Is there a plugin to send/receive data to/from XYZ? > How can my system XYZ send data to Fluentd? > Does Fluentd really work in the case of XYZ?
  27. Links http://fluentd.org/plugin/
  28. Input plugin example:

          class SomeInput < Fluent::Input
            Fluent::Plugin.register_input('myin', self)

            config_param :tag, :string

            def start
              Thread.new {
                while true
                  time = Fluent::Engine.now
                  record = {"user"=>1, "size"=>1}
                  Fluent::Engine.emit(@tag, time, record)
                end
              }
            end

            def shutdown
              ...
            end
          end

          <source>
            type myin
            tag myapp.api.heartbeat
          </source>
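The input plugin on this slide emits from a background thread and leaves `shutdown` elided. A stand-alone sketch of the same emit loop with a stop flag and a pause between events might look like the following. It does not use Fluentd: `HeartbeatEmitter`, the `interval` parameter, and the emit callback are hypothetical stand-ins for the plugin class and `Engine.emit`.

```ruby
# Hypothetical stand-alone model of an input that emits events from a thread.
class HeartbeatEmitter
  def initialize(tag, interval, &emit)
    @tag = tag
    @interval = interval # seconds between events (assumed parameter)
    @emit = emit         # stand-in for the engine's emit call
    @running = false
  end

  def start
    @running = true
    @thread = Thread.new do
      while @running
        @emit.call(@tag, Time.now.to_i, { 'user' => 1, 'size' => 1 })
        sleep @interval
      end
    end
  end

  # Ask the loop to stop and wait for the thread to finish.
  def shutdown
    @running = false
    @thread.join
  end
end

events = Queue.new
emitter = HeartbeatEmitter.new('myapp.api.heartbeat', 0.01) do |tag, time, record|
  events << [tag, time, record]
end
emitter.start
first = events.pop # blocks until the first event arrives
emitter.shutdown
puts first.inspect
```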
  29. Buffered output plugin example:

          class SomeOutput < Fluent::BufferedOutput
            Fluent::Plugin.register_output('myout', self)

            config_param :myparam, :string

            # One JSON line per event is appended to the buffer chunk.
            def format(tag, time, record)
              [tag, time, record].to_json + "\n"
            end

            # Called with a whole chunk of formatted lines.
            def write(chunk)
              puts chunk.read
            end
          end

          <match **>
            type myout
            myparam foobar
          </match>
  30. Custom tail input example:

          class MyTailInput < Fluent::TailInput
            Fluent::Plugin.register_input('mytail', self)

            def configure_parser(conf)
              ...
            end

            # Split a tab-separated line into a record.
            def parse_line(line)
              array = line.split("\t")
              record = {"user"=>array[0], "item"=>array[1]}
              time = Fluent::Engine.now
              return time, record
            end
          end

          <source>
            type mytail
          </source>
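The parsing step on this slide can be tried outside Fluentd. The function below is a stand-alone sketch: it keeps the slide's field names (`user`, `item`) but takes the current time as a hypothetical `now:` parameter instead of calling the engine, and skipping malformed lines by returning `nil` is an added assumption.

```ruby
# Hypothetical stand-alone version of the tab-separated parsing step.
def parse_line(line, now: Time.now)
  user, item = line.chomp.split("\t")
  return nil if user.nil? || item.nil? # skip lines without both fields (assumption)

  [now.to_i, { 'user' => user, 'item' => item }]
end

time, record = parse_line("alice\tbook\n", now: Time.at(0))
puts time
puts record.inspect
puts parse_line('no-tab-here').inspect
```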
