Fluentd Introduction
at iPROS
Masahiro Nakagawa
Treasure Data, Inc.
Senior Software Engineer

Thursday, October 31, 13
Fluentd presentation slides at #iprostm

http://atnd.org/events/44556

Transcript of "Fluentd introduction at ipros"

  1. Fluentd Introduction at iPROS. Masahiro Nakagawa, Senior Software Engineer, Treasure Data, Inc.
  2. Who are you?
     > Masahiro Nakagawa
       > @repeatedly / masa@treasure-data.com
     > Treasure Data, Inc.
       > Senior Software Engineer, since 2012/11
     > Open Source Projects
       > D programming language
       > MessagePack, Fluentd, etc.
  3. Structured logging, reliable forwarding, pluggable architecture. http://fluentd.org/
  4. Agenda
     > Background
     > Overview
     > Product Comparison
     > Use cases
  5. Background
  6. Data Processing
     > Stages: Data source, Collect, Store, Process, Visualize
     > Reporting, Monitoring
  7. Related Products (easier & shorter time)
     > Collect: ???
     > Store / Process: Cloudera, Hortonworks, Treasure Data
     > Visualize: Excel, Tableau, R
  8. (no text on this slide)
  9. Before Fluentd
     > Server1, Server2, Server3, each running an Application
     > Logs end up on a Log Server
     > High latency! Must wait for a day...
  10. After Fluentd
      > Server1, Server2, Server3, each running an Application and a Fluentd
      > In streaming! Events flow on to further Fluentd nodes
  11. Overview
  12. In short
      > Open source log collector written in Ruby
      > Uses the RubyGems ecosystem for plugins
      > It's like syslogd, but uses JSON for log messages
  13. Example (apache to mongo)
      > Web Server writes apache.log:
          127.0.0.1 - - [30/Oct/2013:07:26:27] "GET / ...
          127.0.0.1 - - [30/Oct/2013:07:26:30] "GET / ...
          127.0.0.1 - - [30/Oct/2013:07:26:32] "GET / ...
          127.0.0.1 - - [30/Oct/2013:07:26:40] "GET / ...
          127.0.0.1 - - [30/Oct/2013:07:27:01] "GET / ...
          ...
      > Fluentd tails the file and emits events:
          2013-10-30 01:33:51 apache.log { "host": "127.0.0.1", "method": "GET", ... }
      > Fluentd: event buffering, then insert
  14. Event structure (log message)
      ✓ Time
        > second resolution by default
        > taken from the data source, or added from a parsed time
      ✓ Tag
        > for message routing
      ✓ Record
        > JSON format
        > MessagePack internally
        > structured (not unstructured text)
  15. Pluggable Architecture
      > Input: Forward, HTTP, File tail, dstat, ...
      > Engine
      > Buffer: File, Memory
      > Output: Forward, File, MongoDB, ...
      > Output: rewrite, ...
  16. Client libraries
      > Ruby, Java, Perl, PHP, Python, D, Scala, ...
      > Application sends Time:Tag:Record to Fluentd
      # Ruby
      Fluent.open("myapp")
      Fluent.event("login", {"user" => 38})
      #=> 2013-10-30 18:56:01 myapp.login {"user":38}
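The Time:Tag:Record triple from slides 14 and 16 can be sketched in plain Ruby. This is a minimal illustration, assuming the client joins its prefix and the event label with a dot. `build_event` and `format_event` are hypothetical helpers, not part of any Fluentd client library, and the MessagePack step is left out.

```ruby
require 'json'

# Hypothetical helper: joins prefix and label into a tag and keeps the
# time in whole seconds, as the "Event structure" slide describes.
def build_event(prefix, label, record, time = Time.now)
  { time: time.to_i, tag: "#{prefix}.#{label}", record: record }
end

# Render the event the way the slide prints it: time, tag, JSON record.
def format_event(event)
  stamp = Time.at(event[:time]).utc.strftime('%Y-%m-%d %H:%M:%S')
  "#{stamp} #{event[:tag]} #{JSON.generate(event[:record])}"
end

event = build_event('myapp', 'login', { 'user' => 38 },
                    Time.utc(2013, 10, 30, 18, 56, 1))
puts format_event(event)
```

The fixed timestamp is only there to make the printed line reproducible; a real client would use the current time.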
  17. Configuration and operation
      > No central / master node
        > HTTP include helps share configuration
      > Operation depends on your environment
        > Use your own daemon management
        > Treasure Data uses Chef
      > Apache-like syntax and Ruby DSL
  18. # receive events via HTTP
      <source>
        type http
        port 8888
      </source>

      # read logs from a file
      <source>
        type tail
        path /var/log/httpd.log
        format apache
        tag apache.access
      </source>

      # save alerts to a file
      <match alert.**>
        type file
        path /var/log/fluent/alerts
      </match>

      # save access logs to MongoDB
      <match apache.access>
        type mongo
        database apache
        collection log
      </match>

      # forward other logs to servers
      # (kept last: match blocks are tried in file order)
      <match **>
        type forward
        <server>
          host 192.168.0.11
          weight 20
        </server>
        <server>
          host 192.168.0.12
          weight 60
        </server>
      </match>

      include http://example.com/conf
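How the `<match>` patterns in slide 18 select an output can be sketched as follows. This is an illustration, not Fluentd's implementation: it covers only literal tag parts, `*` (exactly one part) and `**` (zero or more parts), and it assumes the first matching block wins. `ROUTES`, `tag_match?` and `route` are hypothetical names.

```ruby
# Sketch only: handles literal parts, "*" and "**".
def match_parts?(pattern_parts, tag_parts)
  return tag_parts.empty? if pattern_parts.empty?

  head, *rest = pattern_parts
  case head
  when '**'
    # "**" may swallow zero or more tag parts
    (0..tag_parts.length).any? { |n| match_parts?(rest, tag_parts.drop(n)) }
  when '*'
    !tag_parts.empty? && match_parts?(rest, tag_parts.drop(1))
  else
    tag_parts.first == head && match_parts?(rest, tag_parts.drop(1))
  end
end

def tag_match?(pattern, tag)
  match_parts?(pattern.split('.'), tag.split('.'))
end

# Patterns in the order they would appear in the config file.
ROUTES = [
  ['alert.**', :file],
  ['apache.access', :mongo],
  ['**', :forward]
].freeze

# Return the output of the first pattern that matches the tag.
def route(tag)
  _pattern, output = ROUTES.find { |pattern, _output| tag_match?(pattern, tag) }
  output
end

%w[alert.disk.full apache.access myapp.login].each do |tag|
  puts "#{tag} -> #{route(tag)}"
end
```

Because the first match wins, a catch-all `**` placed above `apache.access` would shadow it, which is why the catch-all sits at the bottom of the table.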
  19. Reliability (core + plugin)
      > Buffering
        > Use the file buffer for persistent data
        > Each buffer chunk has an ID for idempotency
      > Retrying
      > Error handling
        > transaction, failover, etc. in the forward plugin
        > secondary
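The slide does not say how the chunk ID is used, so the following is one assumed way an ID can make retries idempotent: the receiver remembers the IDs it has stored and acknowledges a repeated chunk without storing it again. `ChunkReceiver` is a hypothetical class for illustration only.

```ruby
require 'securerandom'
require 'set'

# Hypothetical receiver that remembers which chunk IDs it has stored.
class ChunkReceiver
  attr_reader :stored

  def initialize
    @seen = Set.new
    @stored = []
  end

  # Always acknowledges; stores the events only the first time an ID is seen.
  def receive(chunk_id, events)
    @stored.concat(events) if @seen.add?(chunk_id)
    true
  end
end

receiver = ChunkReceiver.new
chunk_id = SecureRandom.hex(16)
events = [{ 'user' => 38 }, { 'user' => 39 }]

receiver.receive(chunk_id, events)
receiver.receive(chunk_id, events) # sender retries after a lost ack
puts receiver.stored.length
```

With this scheme a sender can retry freely after a timeout, since resending the same chunk never duplicates records.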
  20. Plugins - use rubygems
      $ fluent-gem search -rd fluent-plugin
      $ fluent-gem search -rd fluent-mixin
      $ fluent-gem install fluent-plugin-mongo
  21. http://fluentd.org/plugin/
  22. in_tail
      > Apache access.log is read by Fluentd
      ✓ read a log file
      ✓ custom regexp
      ✓ custom parser in Ruby
      > Supported formats: apache, apache2, syslog, nginx, json, csv, tsv, ltsv
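The "custom regexp" idea can be sketched as a parser that turns one access-log line into a time and a record. The regular expression, the field names and the sample line below are assumptions made for illustration; they are not the pattern that in_tail ships with.

```ruby
require 'time'

# Assumed pattern for a common-log-style line; not in_tail's built-in regexp.
APACHE_LINE = /\A(?<host>\S+) \S+ (?<user>\S+) \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+)[^"]*" (?<code>\d+) (?<size>\S+)\z/

# Returns [epoch_seconds, record], or nil when the line does not match.
def parse_line(line)
  m = APACHE_LINE.match(line.chomp)
  return nil unless m

  time = Time.strptime(m[:time], '%d/%b/%Y:%H:%M:%S %z')
  record = {
    'host' => m[:host],
    'user' => m[:user],
    'method' => m[:method],
    'path' => m[:path],
    'code' => m[:code].to_i
  }
  [time.to_i, record]
end

# Invented sample line in the style of the example slide.
sample = '127.0.0.1 - - [30/Oct/2013:07:26:27 +0000] "GET / HTTP/1.1" 200 777'
time, record = parse_line(sample)
puts Time.at(time).utc
puts record
```

Lines that do not match return `nil`, so a caller can decide whether to skip them or route them elsewhere.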
  23. out_mongo
      > Apache access.log, Fluentd (buffer)
      ✓ retry automatically
      ✓ exponential retry wait
      ✓ persistent on a file
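"Retry automatically" with an "exponential retry wait" can be sketched as below. The base wait, the doubling factor, the cap and the retry limit are hypothetical values chosen for the example, not Fluentd's defaults, and `with_retries` is an illustrative helper rather than plugin code.

```ruby
# Hypothetical parameters: first wait of 1 second, doubling each time, capped.
def retry_wait(attempt, base: 1.0, factor: 2.0, max_wait: 300.0)
  [base * (factor**attempt), max_wait].min
end

# Runs the block until it succeeds or max_retries is exhausted.
# `sleeper` is injectable so the sketch can run without really sleeping.
def with_retries(max_retries: 5, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield
  rescue StandardError
    raise if attempt >= max_retries

    sleeper.call(retry_wait(attempt))
    attempt += 1
    retry
  end
end

waits = []
calls = 0
result = with_retries(sleeper: ->(seconds) { waits << seconds }) do
  calls += 1
  raise 'temporary failure' if calls < 4

  :written
end
puts "result=#{result} waits=#{waits.inspect}"
```

Here the write fails three times and succeeds on the fourth call, so three waits are recorded, each twice as long as the previous one.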
  24. out_webhdfs
      > Apache access.log, Fluentd (buffer), HDFS
      ✓ custom text formatter
      ✓ slice files based on time
          2013-01-01/01/access.log.gz
          2013-01-01/02/access.log.gz
          2013-01-01/03/access.log.gz
          ...
      ✓ retry automatically
      ✓ exponential retry wait
      ✓ persistent on a file
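Slicing files by time amounts to deriving the output path from each event's timestamp. The sketch below assumes hourly slices in UTC and a path pattern modelled on the file names on the slide. `SLICE_FORMAT`, `slice_path` and `group_by_slice` are hypothetical names, not out_webhdfs options.

```ruby
# Assumed path pattern, modelled on the slide's file names.
SLICE_FORMAT = '%Y-%m-%d/%H/access.log.gz'

# Path of the hourly file an event with this time belongs to (UTC assumed).
def slice_path(time)
  time.getutc.strftime(SLICE_FORMAT)
end

# Group [time, line] pairs by the file they would be written to.
def group_by_slice(events)
  events.group_by { |time, _line| slice_path(time) }
        .transform_values { |pairs| pairs.map(&:last) }
end

events = [
  [Time.utc(2013, 1, 1, 1, 10), 'GET /'],
  [Time.utc(2013, 1, 1, 1, 50), 'GET /about'],
  [Time.utc(2013, 1, 1, 2, 5), 'GET /contact']
]
group_by_slice(events).each do |path, lines|
  puts "#{path}: #{lines.length} line(s)"
end
```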
  25. out_copy + other plugins
      > Apache access.log, Fluentd (buffer), then Hadoop and Amazon S3
      ✓ routing based on tags
      ✓ copy to multiple storages
  26. out_forward
      > Apache access.log, Fluentd (buffer), tag apache, forwarded to further Fluentd nodes
      ✓ automatic fail-over
      ✓ load balancing
      ✓ retry automatically
      ✓ exponential retry wait
      ✓ persistent on a file
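Load balancing with fail-over, combined with the `weight` values from the configuration slide, can be sketched as a weighted random choice among servers that are currently reachable. This is an assumed strategy for illustration, not a description of how out_forward selects nodes. `Server` and `pick_server` are hypothetical names.

```ruby
Server = Struct.new(:host, :weight, :alive)

# Pick one live server at random, in proportion to its weight.
# Returns nil when no server is alive.
def pick_server(servers, rng = Random.new)
  live = servers.select(&:alive)
  return nil if live.empty?

  point = rng.rand(live.sum(&:weight))
  live.each do |server|
    return server if point < server.weight

    point -= server.weight
  end
end

servers = [
  Server.new('192.168.0.11', 20, true),
  Server.new('192.168.0.12', 60, true)
]

rng = Random.new(42)
counts = Hash.new(0)
8000.times { counts[pick_server(servers, rng).host] += 1 }
puts counts

# Fail-over: once a server is marked dead, all traffic goes to the other.
servers[1].alive = false
puts pick_server(servers, rng).host
```

With weights 20 and 60, roughly a quarter of the picks should land on the first host while both are alive.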
  27. Forward topology
      > Fluentd nodes forward to other Fluentd nodes, with send/ack between them
  28. Other plugins
      > Filter, Aggregator, Converter
        > rewrite-tag-filter, sampling-filter, ...
        > *-counter, *-monitor, ...
        > record-modifier, flatten, map, typecast, ...
      > See @tagomoris's slide
        > http://www.slideshare.net/tagomoris/fluentdmeetupfukuoka201303
  29. (diagram) Fluentd in the middle: filter / buffer / routing
      > Inputs
        > Access logs: Apache
        > App logs: Frontend, Backend
        > System logs: syslogd
        > Databases
      > Outputs
        > Alerting: Nagios
        > Analysis: MongoDB, MySQL, Hadoop
        > Archiving: Amazon S3
  30. Other status
      > Localizing docs into Japanese
        > https://github.com/fluent/fluentd-docs/tree/master/docs/ja
      > Windows support
        > Started by JBAT
        > https://github.com/fluent/fluentd/tree/windows
        > Feedback and patches are welcome!
  31. v11
      > Spec is not fixed yet
      > Breaks source code compatibility
      > Several improvements
        > routing label, filter, error stream, etc.
        > serverengine based: multi-process, signal, etc.
      > http://magazine.rubyist.net/?0044FluentdV11NewFeatures
  32. td-agent
      > Open source distribution package of Fluentd
        > ETL part of Treasure Data
        > deb, rpm, homebrew
      > Includes useful components
        > ruby, jemalloc, fluentd
        > 3rd party gems: td, mongo, webhdfs, etc.
      > http://packages.treasure-data.com/
  33. Product Comparison
  34. Flume
      > Flume: distributed log collector by Cloudera
      > Physical Topology: Flume Master, Flume nodes
      > Logical Topology: Flume nodes, Hadoop HDFS
  35. Network topology
      > Flume OG: Master, Agents, Collectors
      > Flume NG: Master, Agents, Collectors
      > Labels: ack, send, Option, send/ack
  36. Pros and Cons
      > Pros
        > Uses a central master to manage all nodes
      > Cons
        > Java culture (Pros for Java-er?)
        > Difficult configuration and setup
        > Difficult topology
        > Mainly for Hadoop; fewer plugins?
  37. Logstash http://logstash.net/
  38. Pros and Cons
      > Pros
        > Built-in ElasticSearch and Kibana
        > Bundled 140 plugins (input/filter/codec/output)
        > Works on Windows but unstable...
      > Cons
        > Mainly for JRuby
        > Needs an external daemon for a centralized environment: Redis, RabbitMQ, etc.
  39. Use cases
  40. Treasure Data
      > Components: Frontend, Job Queue, Worker, Hadoop
      > Applications push metrics to Fluentd (via local Fluentd)
      > Fluentd sums up data by minute (partial aggregation)
      > Treasure Data for historical analysis
      > Librato Metrics for realtime analysis
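Partial aggregation can be sketched as summing metric values into one bucket per metric name and minute before passing them on. The per-minute window and the event shape (epoch seconds, name, value) are assumptions for the example; the slide does not describe the actual plugin or data format used.

```ruby
# Sum metric values per (name, minute) so that only one record per
# metric and minute has to be sent downstream.
# Each event is assumed to be [epoch_seconds, metric_name, value].
def aggregate_by_minute(events)
  sums = Hash.new(0)
  events.each do |time, name, value|
    minute = time - (time % 60) # truncate to the start of the minute
    sums[[name, minute]] += value
  end
  sums
end

base = Time.utc(2013, 10, 31, 12, 0, 0).to_i
events = [
  [base + 5, 'jobs.finished', 2],
  [base + 40, 'jobs.finished', 3],
  [base + 65, 'jobs.finished', 1],
  [base + 70, 'jobs.failed', 1]
]

aggregate_by_minute(events).each do |(name, minute), total|
  puts "#{Time.at(minute).utc.strftime('%H:%M')} #{name} #{total}"
end
```

Four raw events collapse into three per-minute records here; with many application servers the reduction is what keeps the downstream metrics service lightly loaded.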
  41. Cookpad
      > Hundreds of app servers: Rails app + td-agent on each, td-agent sends event logs
      > Diagram labels: MySQL, Daily/Hourly Batch, Treasure Data, Google Spreadsheet
      > Logs are available after several mins.
      > Feedback rankings, KPI visualization
      > Unlimited scalability, flexible schema, realtime, less performance impact
      ✓ Over 100 RoR servers (2012/2/4)
  42. LINE
      > Diagram labels: Web Servers, Archive Storage (scribed), Fluentd Cluster, Fluentd Watchers, webhdfs, Hadoop Cluster CDH4 (HDFS, YARN), hive server, Huahin Manager, Shib, ShibUI, Graph Tools, Notifications (IRC)
      > Flows: STREAM, BATCH, SCHEDULED BATCH
      ✓ 16 nodes
      ✓ 120,000+ lines/sec
      ✓ 400Mbps at peak
      ✓ 1.5+ TB/day (raw)
      > http://www.slideshare.net/tagomoris/log-analysis-with-hadoop-in-livedoor-2013 by @tagomoris
  43. Other use-cases
      > Scaleout by @choplin
        > データサイエンティスト養成読本 (a training book for data scientists)
        > http://gihyo.jp/book/2013/978-4-7741-5896-9
      > Smartnews
        > http://developer.smartnews.be/blog/tag/fluentd/
      > Nintendo 3DS StreetPass
        > http://www.nintendo.co.jp/3ds/interview/streetpass_relay/vol1/index4.html
  44. Other companies
  45. Conclusion
      > Fluentd is now a widely used project
        > There are many use cases
        > Many contributors and plugins
      > Keep it simple
        > Easy to use and to integrate into your environment
  46. support@treasure-data.com
  47. support@treasure-data.com