Fluentd Introduction at iPROS
Fluentd presentation slides at #iprostm

http://atnd.org/events/44556

Presentation Transcript

  • Fluentd Introduction at iPROS
    Masahiro Nakagawa, Senior Software Engineer, Treasure Data, Inc.
    Thursday, October 31, 2013
  • Who are you?
      ● Masahiro Nakagawa
          > @repeatedly / masa@treasure-data.com
      ● Treasure Data, Inc.
          > Senior Software Engineer, since 2012/11
      ● Open Source Projects
          > D programming language
          > MessagePack, Fluentd, etc.
  • Structured logging, reliable forwarding, pluggable architecture
    http://fluentd.org/
  • Agenda
      > Background
      > Overview
      > Product Comparison
      > Use cases
  • Background
  • Data Processing (diagram)
    Data source -> Collect -> Store -> Process -> Visualize
    Outputs: Reporting, Monitoring
  • Related Products (easier & shorter time)
      > Collect: ???
      > Store / Process: Cloudera, Hortonworks, Treasure Data
      > Visualize: Excel, Tableau, R
  • Before Fluentd (diagram)
    Server1, Server2, Server3, ... each run an Application and ship logs to a Log Server.
    High latency! Must wait for a day...
  • After Fluentd (diagram)
    Server1, Server2, Server3, ... each run an Application with a local Fluentd, which forwards to aggregating Fluentd nodes.
    In streaming!
  • Overview
  • In short
      > Open sourced log collector written in Ruby
      > Using rubygems ecosystem for plugins
    It's like syslogd, but uses JSON for log messages
  • Example (apache to mongo)
    Web Server -> apache.log -> tail -> Fluentd (event buffering) -> insert into MongoDB
    Lines in apache.log:
        127.0.0.1 - - [30/Oct/2013:07:26:27] "GET / ...
        127.0.0.1 - - [30/Oct/2013:07:26:30] "GET / ...
        127.0.0.1 - - [30/Oct/2013:07:26:32] "GET / ...
        127.0.0.1 - - [30/Oct/2013:07:26:40] "GET / ...
        127.0.0.1 - - [30/Oct/2013:07:27:01] "GET / ...
    Resulting event:
        2013-10-30 01:33:51 apache.log { "host": "127.0.0.1", "method": "GET", ... }
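The tail-and-parse step in this example can be sketched in plain Ruby. This is only an illustration of the idea, not Fluentd's in_tail code: the regular expression, the `parse_line` helper, the completed sample line, and the choice to read timestamps as UTC are all assumptions made for the sketch.

```ruby
require 'json'
require 'time'

# Simplified pattern for the shortened log lines shown on the slide
# (an assumption; real Apache formats carry more fields).
LINE_PATTERN = /\A(?<host>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+)/

# Turn one log line into a [tag, time, record] triple.
# Returns nil when the line does not match the pattern.
def parse_line(line, tag)
  m = LINE_PATTERN.match(line)
  return nil unless m

  # The slide's timestamps carry no zone; UTC is assumed here.
  time = Time.strptime("#{m[:time]} +0000", '%d/%b/%Y:%H:%M:%S %z').to_i
  record = { 'host' => m[:host], 'method' => m[:method], 'path' => m[:path] }
  [tag, time, record]
end

# Hypothetical complete log line; the slide truncates everything after "GET /".
sample = '127.0.0.1 - - [30/Oct/2013:07:26:27] "GET / HTTP/1.1" 200 45'
tag, time, record = parse_line(sample, 'apache.log')
puts "#{Time.at(time).utc.strftime('%Y-%m-%d %H:%M:%S')} #{tag} #{JSON.generate(record)}"
```

Each parsed line becomes a structured record, so a downstream store can work with fields instead of raw text.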
  • Event structure (log message)
      ✓ Time
          > default second unit
          > from data source or adding parsed time
      ✓ Tag
          > for message routing
      ✓ Record
          > JSON format
          > MessagePack internally
          > structured, not free-form text
  • Pluggable Architecture (diagram)
    Input -> Engine -> Buffer -> Output
      > Input (pluggable): Forward, HTTP, File tail, dstat, ...
      > Buffer (pluggable): File, Memory
      > Output (pluggable): Forward, File, MongoDB, rewrite, ...
  • Client libraries
      > Ruby, Java, Perl, PHP, Python, D, Scala, ...
    Application -> Time:Tag:Record -> Fluentd
        # Ruby
        Fluent.open("myapp")
        Fluent.event("login", {"user" => 38})
        #=> 2013-10-30 18:56:01 myapp.login {"user":38}
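To make the Time:Tag:Record idea concrete, here is a small self-contained sketch of how a client could compose an event from a tag prefix and a label. `TinyEventLogger` is a hypothetical stand-in written for this note. It is not the API of any real Fluentd client library, and it only collects events in memory instead of sending them.

```ruby
require 'json'

# Hypothetical stand-in for a client library (not a real Fluentd API).
class TinyEventLogger
  attr_reader :events

  def initialize(tag_prefix)
    @tag_prefix = tag_prefix
    @events = []
  end

  # Build a [time, tag, record] event; a real client would send it to Fluentd.
  def event(label, record, time = Time.now)
    evt = [time.to_i, "#{@tag_prefix}.#{label}", record]
    @events << evt
    evt
  end

  # Render an event in the "time tag record" form used on the slide.
  def self.render(evt)
    time, tag, record = evt
    "#{Time.at(time).utc.strftime('%Y-%m-%d %H:%M:%S')} #{tag} #{JSON.generate(record)}"
  end
end

logger = TinyEventLogger.new('myapp')
evt = logger.event('login', { 'user' => 38 }, Time.utc(2013, 10, 30, 18, 56, 1))
puts TinyEventLogger.render(evt)
# prints: 2013-10-30 18:56:01 myapp.login {"user":38}
```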
  • Configuration and operation
      ● No central / master node
          > HTTP include helps conf sharing
      ● Operation depends on your environment
          > Use your daemon management
          > Use Chef in Treasure Data
      ● Apache-like syntax and Ruby DSL
  • # read logs from a file
    <source>
      type tail
      path /var/log/httpd.log
      format apache
      tag apache.access
    </source>

    # save access logs to MongoDB
    <match apache.access>
      type mongo
      database apache
      collection log
    </match>

    # receive events via HTTP
    <source>
      type http
      port 8888
    </source>

    # save alerts to a file
    <match alert.**>
      type file
      path /var/log/fluent/alerts
    </match>

    # forward other logs to servers
    <match **>
      type forward
      <server>
        host 192.168.0.11
        weight 20
      </server>
      <server>
        host 192.168.0.12
        weight 60
      </server>
    </match>

    include http://example.com/conf
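The tag-based routing that the `<match>` sections express can be sketched as follows. This is a simplified model, not Fluentd's matcher: it assumes `*` matches exactly one tag part, `**` matches zero or more parts, and rules are tried in order with the first match winning, so the catch-all `**` goes last. The output names (`:mongo`, `:file`, `:forward`) are labels chosen for the sketch.

```ruby
# Convert a simplified match pattern into a Regexp.
# Assumed semantics: "*" = exactly one tag part, "**" = zero or more parts.
def tag_pattern_to_regexp(pattern)
  source = String.new
  pattern.split('.').each_with_index do |part, i|
    sep = i.zero? ? '' : '\.'
    piece =
      case part
      when '**' then i.zero? ? '(?:[^.]+(?:\.[^.]+)*)?' : '(?:\.[^.]+)*'
      when '*'  then sep + '[^.]+'
      else           sep + Regexp.escape(part)
      end
    source << piece
  end
  Regexp.new("\\A#{source}\\z")
end

# Rules mirror the three <match> sections; order matters (first match wins).
RULES = [
  ['apache.access', :mongo],
  ['alert.**',      :file],
  ['**',            :forward]
].map { |pattern, output| [tag_pattern_to_regexp(pattern), output] }

# Return the output label for a tag, or nil if nothing matches.
def route(tag)
  rule = RULES.find { |regexp, _output| regexp.match?(tag) }
  rule && rule[1]
end

%w[apache.access alert.disk.full myapp.login].each do |tag|
  puts "#{tag} -> #{route(tag)}"
end
```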
  • Reliability (core + plugin)
      ● Buffering
          > Use file buffer for persistent data
          > Each buffer chunk has an ID for idempotency
      ● Retrying
      ● Error handling
          > transaction, failover, etc. on forward plugin
          > secondary
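Why a chunk ID helps with retries can be shown with a toy receiver. The `Chunk` struct, `new_chunk`, and `IdempotentStore` below are hypothetical names invented for this sketch. Fluentd's real buffer and output plugins are more involved, and whether a destination deduplicates depends on the plugin.

```ruby
require 'securerandom'

# A buffer chunk: a batch of events plus a unique ID (hypothetical structure).
Chunk = Struct.new(:id, :events)

def new_chunk(events)
  Chunk.new(SecureRandom.hex(16), events)
end

# Hypothetical receiver that remembers the chunk IDs it has already stored,
# so a retried chunk is acknowledged without being written twice.
class IdempotentStore
  attr_reader :rows

  def initialize
    @seen = {}
    @rows = []
  end

  def write(chunk)
    return :duplicate if @seen.key?(chunk.id)

    @rows.concat(chunk.events)
    @seen[chunk.id] = true
    :stored
  end
end

store = IdempotentStore.new
chunk = new_chunk([{ 'user' => 38 }])
p store.write(chunk) # first delivery
p store.write(chunk) # retried delivery of the same chunk
p store.rows.size
```

With this arrangement the sender can retry freely after a lost acknowledgement, because writing the same chunk twice has the same effect as writing it once.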
  • Plugins - use rubygems
        $ fluent-gem search -rd fluent-plugin
        $ fluent-gem search -rd fluent-mixin
        $ fluent-gem install fluent-plugin-mongo
  • http://fluentd.org/plugin/
  • in_tail (Apache -> access.log -> Fluentd)
      ✓ read a log file
      ✓ custom regexp
      ✓ custom parser in Ruby
    Supported formats: apache, apache2, syslog, nginx, json, csv, tsv, ltsv
  • out_mongo (Apache -> access.log -> Fluentd -> buffer -> MongoDB)
      ✓ retry automatically
      ✓ exponential retry wait
      ✓ persistent on a file
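"Retry automatically" with an "exponential retry wait" can be sketched as a generic helper. This is a minimal illustration under assumed parameters (`initial_wait`, `max_retries`, and a doubling policy). It does not reproduce Fluentd's actual retry settings or code.

```ruby
# Call the block until it succeeds or the retries run out.
# Waits initial_wait seconds before the first retry and doubles it each time
# (an assumed policy). `sleeper` is injectable so the example does not
# actually sleep when demonstrated.
def with_retries(initial_wait: 1.0, max_retries: 5, sleeper: ->(s) { sleep(s) })
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    raise if attempts > max_retries

    sleeper.call(initial_wait * (2**(attempts - 1)))
    retry
  end
end

# Simulated flaky write: fails three times, then succeeds.
waits = []
result = with_retries(sleeper: ->(s) { waits << s }) do |attempt|
  raise IOError, 'write failed' if attempt < 4

  "stored on attempt #{attempt}"
end
puts result
p waits
```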
  • out_webhdfs (Apache -> access.log -> Fluentd -> buffer -> HDFS)
      ✓ custom text formatter
      ✓ slice files based on time
            2013-01-01/01/access.log.gz
            2013-01-01/02/access.log.gz
            2013-01-01/03/access.log.gz
            ...
      ✓ retry automatically
      ✓ exponential retry wait
      ✓ persistent on a file
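Slicing files by time amounts to deriving the destination path from each event's timestamp. The sketch below uses the path layout shown on the slide. The helper names and the use of UTC are assumptions, and the real plugin is configured rather than coded this way.

```ruby
# Path layout taken from the slide: <date>/<hour>/access.log.gz
# UTC is assumed for the hour boundaries.
def sliced_path(time, pattern = '%Y-%m-%d/%H/access.log.gz')
  time.getutc.strftime(pattern)
end

# Group [time, line] pairs by the file they would be written to.
def slice_events(events)
  events.group_by { |time, _line| sliced_path(time) }
end

events = [
  [Time.utc(2013, 1, 1, 1, 5),  'GET /a'],
  [Time.utc(2013, 1, 1, 1, 59), 'GET /b'],
  [Time.utc(2013, 1, 1, 2, 0),  'GET /c']
]
slice_events(events).each do |path, batch|
  puts "#{path}: #{batch.size} line(s)"
end
```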
  • out_copy + other plugins (Apache -> access.log -> Fluentd -> buffer -> Hadoop and Amazon S3)
      ✓ routing based on tags
      ✓ copy to multiple storages
  • out_forward (Apache -> access.log -> Fluentd -> buffer -> multiple Fluentd nodes)
      ✓ automatic fail-over
      ✓ load balancing
      ✓ retry automatically
      ✓ exponential retry wait
      ✓ persistent on a file
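Load balancing with fail-over can be illustrated with a deterministic weighted cycle. The hosts and weights are the ones from the configuration slide (20 and 60). The `Node` struct, the `available` flag, and the selection algorithm are assumptions for this sketch, and out_forward's real balancing logic may differ.

```ruby
Node = Struct.new(:host, :weight, :available)

# Pick order: each node appears in proportion to its weight (reduced by the GCD).
def weighted_cycle(nodes)
  divisor = nodes.map(&:weight).reduce(:gcd)
  nodes.flat_map { |n| [n] * (n.weight / divisor) }
end

# Choose a node for the nth send, skipping nodes marked unavailable (fail-over).
def pick_node(nodes, n)
  cycle = weighted_cycle(nodes.select(&:available))
  return nil if cycle.empty?

  cycle[n % cycle.size]
end

nodes = [
  Node.new('192.168.0.11', 20, true),
  Node.new('192.168.0.12', 60, true)
]
p (0...8).map { |n| pick_node(nodes, n).host }.tally

# Simulate a failure of the second server: all traffic moves to the first.
nodes[1].available = false
p (0...8).map { |n| pick_node(nodes, n).host }.tally
```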
  • Forward topology (diagram)
    Fluentd nodes forward to other Fluentd nodes; each hop uses send/ack.
  • Other plugins
      ● Filter, Aggregator, Converter
          > rewrite-tag-filter, sampling-filter, ...
          > *-counter, *-monitor, ...
          > record-modifier, flatten, map, typecast, ...
      ● See @tagomoris's slide
          > http://www.slideshare.net/tagomoris/fluentdmeetupfukuoka201303
  • Overall picture (diagram)
      > Inputs: Access logs (Apache), App logs (Frontend, Backend), System logs (syslogd), Databases
      > Fluentd: filter / buffer / routing
      > Outputs: Alerting (Nagios), Analysis (MongoDB, MySQL, Hadoop), Archiving (Amazon S3)
  • Other status
      ● Localizing docs into Japanese
          > https://github.com/fluent/fluentd-docs/tree/master/docs/ja
      ● Windows support
          > Started by JBAT
          > https://github.com/fluent/fluentd/tree/windows
      ● Feedback and patches are welcome!
  • v11
      ● Spec is not fixed yet
      ● Breaking source code compatibility
      ● Several improvements
          > routing label, filter, error stream, etc.
          > serverengine based: multi-process, signal, etc.
      ● http://magazine.rubyist.net/?0044FluentdV11NewFeatures
  • td-agent
      ● Open sourced distribution package of Fluentd
          > ETL part of Treasure Data
          > deb, rpm, homebrew
      ● Including useful components
          > ruby, jemalloc, fluentd
          > 3rd party gems: td, mongo, webhdfs, etc.
      ● http://packages.treasure-data.com/
  • Product Comparison
  • Flume
    Flume: distributed log collector by Cloudera
    Diagram: physical topology and logical topology (Flume Master, Flume nodes, Hadoop HDFS)
  • Network topology (diagram)
      > Flume OG: Agents send to Collectors; ack goes via the Master
      > Flume NG: Agents send to Collectors with send/ack; the Master is optional
  • Pros and Cons (Flume)
      ● Pros
          > Using central master to manage all nodes
      ● Cons
          > Java culture (Pros for Java-er?): difficult configuration and setup
          > Difficult topology
          > Mainly for Hadoop: fewer plugins?
  • Logstash
    http://logstash.net/
  • Pros and Cons (Logstash)
      ● Pros
          > Built-in ElasticSearch and Kibana
          > Bundled 140 plugins (input/filter/codec/output)
          > Works on Windows but unstable...
      ● Cons
          > Mainly for JRuby
          > Need external daemon for centralized env: Redis, RabbitMQ, etc.
  • Use cases
  • Treasure Data (diagram)
      > Components: Frontend, Worker, Job Queue, Hadoop
      > Applications push metrics to Fluentd (via local Fluentd)
      > Fluentd sums up data per minute (partial aggregation)
      > Librato Metrics for realtime analysis
      > Treasure Data for historical analysis
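Partial aggregation means summing metrics into per-minute buckets before they are forwarded, so fewer events travel downstream. A minimal sketch of that idea follows. The event shape `[time, name, value]` and the metric name are assumptions for illustration, not the format Treasure Data actually uses.

```ruby
# Sum metric values per (minute, name) bucket before forwarding them.
# Each event is assumed to be a [time, name, value] triple.
def aggregate_per_minute(events)
  sums = Hash.new(0)
  events.each do |time, name, value|
    minute = time.to_i - (time.to_i % 60) # truncate to the start of the minute
    sums[[minute, name]] += value
  end
  sums
end

events = [
  [Time.utc(2013, 10, 31, 12, 0, 5),  'requests', 3],
  [Time.utc(2013, 10, 31, 12, 0, 50), 'requests', 4],
  [Time.utc(2013, 10, 31, 12, 1, 2),  'requests', 1]
]
aggregate_per_minute(events).each do |(minute, name), total|
  puts "#{Time.at(minute).utc.strftime('%H:%M')} #{name} #{total}"
end
```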
  • Cookpad (diagram)
      > Hundreds of app servers: each Rails app sends event logs through td-agent
      > Treasure Data: daily/hourly batch to MySQL and Google Spreadsheet
      > Logs are available after several mins.
      > Feedback rankings, KPI visualization
      ✓ Unlimited scalability
      ✓ Flexible schema
      ✓ Realtime
      ✓ Less performance impact
      ✓ Over 100 RoR servers (2012/2/4)
  • LINE (diagram), by @tagomoris
      > Web Servers -> Fluentd Cluster -> Archive Storage (scribed)
      > STREAM: Fluentd Watchers -> Notifications (IRC), Graph Tools
      > webhdfs -> Hadoop Cluster: CDH4 (HDFS, YARN)
      > BATCH: hive server, Huahin Manager
      > SCHEDULED BATCH: Shib, ShibUI
      ✓ 16 nodes
      ✓ 120,000+ lines/sec
      ✓ 400Mbps at peak
      ✓ 1.5+ TB/day (raw)
    http://www.slideshare.net/tagomoris/log-analysis-with-hadoop-in-livedoor-2013
  • Other use-cases
      ● Scaleout by @choplin
          > データサイエンティスト養成読本 (a Japanese book on data scientist training)
          > http://gihyo.jp/book/2013/978-4-7741-5896-9
      ● Smartnews
          > http://developer.smartnews.be/blog/tag/fluentd/
      ● Nintendo 3DS StreetPass (ニンテンドー3DS すれちがい通信)
          > http://www.nintendo.co.jp/3ds/interview/streetpass_relay/vol1/index4.html
  • Other companies
  • Conclusion
      ● Fluentd is now a widely-used project
          > There are many use cases
          > Many contributors and plugins
      ● Keep it simple
          > Easy to use and integrate into your environment
  • support@treasure-data.com