Centralized + Unified Logging 
Gabor Kozma / gabo@ustream.tv / @kozmag82
Everybody wants to write logs! 
✓ Application Logs (frontend / backend) 
➢ php, java, ruby, python, bash 
✓ Access Logs 
➢ apache, nginx, tomcat, jetty 
✓ System Logs 
➢ syslog, hardware error log 
✓ Database Logs 
➢ history, transaction
Centralized Logging
Central Logging Architecture 
✓ Collection 
➢ file, syslog, database 
✓ Transport 
➢ chukwa, heka, syslog, logstash, flume, fluentd, 
kafka, nsq, nxlog, other custom solutions 
Typical: syslog-ng, rsyslog 
✓ Storage / Store 
➢ Amazon S3, Glacier, NAS ...
Central Logging Architecture 
✓ Analysis (You need a way to analyze them!) 
➢ Apache Hadoop + HDFS + MapReduce jobs 
■ Hive, Pig, HBase, Impala ... 
➢ Elasticsearch + Graylog2 / Kibana 
➢ MongoDB + MapReduce / Aggregation Framework 
➢ Graphite, Statsd + Dashboards 
✓ Alerting (Errors almost always indicate a problem!) 
➢ Airbrake/Errbit, Sentry, Honeybadger, Nagios, 
Zabbix, Open/PagerDuty
Unified Logging Layer
Unified Logging Layer 
✓ Ubiquity 
➢ Many different log formats 
➢ Many different sources and destinations 
➢ You must optimize for most use cases! 
✓ Rigidity vs. Flexibility 
➢ Apache Thrift, Apache Avro, Protocol Buffers, JSON / 
BSON, MessagePack
Unified Logging Layer 
✓ Reliability and Scalability 
➢ Scalable 
➢ Support retryable data transfer 
➢ Sync / Async data transfer 
➢ Push- or pull-based system 
✓ Extensibility 
➢ Support new input / output 
■ You don’t have to modify anything else.
Fluentd - Pluggable architecture 
✓ Input, Output, Buffer, Parser, Formatter 
300+ plugins
Fluentd - Minimum resource requirements 
✓ Written in a combination of C and Ruby 
✓ 1 node 
✓ 30-40 MB RAM 
✓ 1 CPU core 
13,000 events / sec
Fluentd - Built-in Reliability 
✓ Buffer 
➢ file or memory 
✓ Retrying 
✓ Error handling 
➢ transaction, failover, secondary node support 
(heartbeat)
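
The buffer / retry / secondary idea above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Fluentd's implementation: the class name `BufferedOutput`, its parameters, and the use of `OSError` as the failure signal are all assumptions made for this sketch.

```python
import time
from collections import deque


class BufferedOutput:
    """Toy buffered output: queue chunks, retry with backoff, then fall back."""

    def __init__(self, primary, secondary, retry_limit=3, retry_wait=0.01):
        self.primary = primary          # callable(chunk); raises OSError on failure
        self.secondary = secondary      # callable(chunk); last-resort destination
        self.retry_limit = retry_limit  # attempts against the primary per chunk
        self.retry_wait = retry_wait    # initial wait in seconds, doubled each retry
        self.queue = deque()            # in-memory buffer of pending chunks

    def emit(self, chunk):
        """Buffer a chunk; nothing is sent until flush() is called."""
        self.queue.append(chunk)

    def flush(self):
        """Deliver every buffered chunk, using the secondary when retries run out."""
        while self.queue:
            chunk = self.queue[0]
            if not self._try_primary(chunk):
                self.secondary(chunk)
            self.queue.popleft()

    def _try_primary(self, chunk):
        wait = self.retry_wait
        for _ in range(self.retry_limit):
            try:
                self.primary(chunk)
                return True
            except OSError:
                time.sleep(wait)
                wait *= 2               # exponential backoff between attempts
        return False
```

A chunk leaves the queue only after the primary or the secondary has accepted it, which is the property the "Built-in Reliability" slide is pointing at.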
Fluentd - Event structure (log) 
✓ Time 
➢ One-second resolution 
➢ Supplied by the data source, or parsed from the log line 
✓ Tag 
➢ for message routing 
✓ Record 
➢ JSON format 
■ MessagePack internally :) 
■ unstructured
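
A minimal Python sketch of the time / tag / record triple described above. The names `Event`, `make_event` and `to_json` are made up for this illustration; the `[tag, time, record]` array layout follows the message form of Fluentd's forward protocol, shown here as JSON rather than MessagePack to stay within the standard library.

```python
import json
import time
from typing import Any, Dict, NamedTuple


class Event(NamedTuple):
    tag: str                 # dot-separated routing key, e.g. "apache2.access.raw"
    time: int                # Unix timestamp in whole seconds
    record: Dict[str, Any]   # free-form key/value payload


def make_event(tag, record, timestamp=None):
    """Use the source's timestamp when given, otherwise stamp with the current time."""
    if timestamp is None:
        timestamp = time.time()
    return Event(tag, int(timestamp), dict(record))


def to_json(event):
    """Serialise the event as a [tag, time, record] array."""
    return json.dumps([event.tag, event.time, event.record], sort_keys=True)
```

For example, `to_json(make_event("apache2.access.raw", {"code": "200"}, 1400000000.9))` drops the fractional second and yields `["apache2.access.raw", 1400000000, {"code": "200"}]`.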
Fluentd - Useful plugins 
✓ Output 
➢ stdout, file, forest, graphite, mongo, mysql, 
elasticsearch, splunk, null, s3, geoip, webhdfs 
✓ Input 
➢ syslog, tail, http, udp, tcp, scribe 
✓ Buffer 
➢ memory, file 
✓ Formatter and/or Parser 
➢ ltsv, json, multiline
Examples
Fluentd - Examples 
<source> 
  # Follow the Apache vhost access log and parse each line with a regex 
  type tail 
  format /^(?<host>[^ ]*):(?<port>[^ ]*) (?<ip>[^ ]*) (?<user>[^ ]*) (?<remotelog>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^"]*)" ""(?<agent>[^"]*)"")?(?: "(?<referer>[^"]*)" "(?<agent>[^"]*)")?$/ 
  path /var/log/apache2/other_vhosts_access.log.* 
  # Remember the read position across restarts 
  pos_file /var/log/fluent/apache2.other_vhosts_access.log.pos 
  time_format %d/%b/%Y:%H:%M:%S %z 
  tag apache2.access.raw 
  read_from_head true 
</source>
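
To try a pattern like the one above outside Fluentd, it can be ported to Python. This is a sketch under assumptions: Python's `re` needs `(?P<name>...)` and rejects duplicate group names, so only one referer / agent alternative is kept, and `SAMPLE` is an invented log line in Apache's vhost-combined layout, not data from the slides.

```python
import re
from datetime import datetime

# Python port of the access-log pattern (single referer/agent alternative).
LINE_RE = re.compile(
    r'^(?P<host>[^ ]*):(?P<port>[^ ]*) (?P<ip>[^ ]*) (?P<user>[^ ]*) (?P<remotelog>[^ ]*) '
    r'\[(?P<time>[^\]]*)\] "(?P<method>\S+)(?: +(?P<path>[^ ]*) +\S*)?" '
    r'(?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)

# Same format string as the time_format parameter in the config.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Hypothetical sample line, made up for this example.
SAMPLE = (
    'www.example.com:80 192.0.2.10 - - [10/Oct/2014:13:55:36 +0200] '
    '"GET /index.html HTTP/1.1" 200 2326 "http://www.example.com/" "Mozilla/5.0"'
)


def parse_line(line):
    """Return (unix_time, record) for a matching line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    record = m.groupdict()
    parsed = datetime.strptime(record.pop("time"), TIME_FORMAT)
    return int(parsed.timestamp()), record
```

As in Fluentd, the `time` capture is removed from the record and becomes the event time; every other named group ends up as a record key.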
Fluentd - Examples 
<match apache2.*.raw> 
  # Drop the remotelog field, add the local hostname, retag as *.reformed 
  type record_reformer 
  enable_ruby false 
  renew_record false 
  remove_keys remotelog 
  tag ${tag_prefix[-2]}.reformed 
  <record> 
    hostname ${hostname} 
  </record> 
</match> 
<match apache2.*.reformed> 
  # Look up the client IP in the GeoIP database, retag as *.reformed.geoip 
  type geoip 
  geoip_lookup_key ip 
  geoip_database /usr/share/GeoIP/GeoIPCity.dat 
  <record> 
    geo_city ${city['ip']} 
    ... 
    geo_region ${region['ip']} 
  </record> 
  add_tag_suffix .geoip 
  flush_interval 5s 
</match>
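
The tag rewriting in the two blocks above can be traced with a short Python sketch. The helper names are made up for this illustration; `tag_prefix` mimics how the `${tag_prefix[N]}` placeholder of record_reformer is understood to expand (element N is the first N+1 tag parts joined by dots), so treat it as an assumption rather than the plugin's code.

```python
def tag_prefix(tag):
    """List of cumulative prefixes: 'a.b.c' -> ['a', 'a.b', 'a.b.c']."""
    parts = tag.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


def reform_tag(tag):
    """Equivalent of `tag ${tag_prefix[-2]}.reformed`: replace the last tag part."""
    return tag_prefix(tag)[-2] + ".reformed"


def add_tag_suffix(tag, suffix):
    """Equivalent of `add_tag_suffix .geoip`: append the suffix to the tag."""
    return tag + suffix


raw = "apache2.access.raw"
reformed = reform_tag(raw)                   # "apache2.access.reformed"
enriched = add_tag_suffix(reformed, ".geoip")  # "apache2.access.reformed.geoip"
```

Each stage emits under a new tag so that the next `<match>` block, and only that block, picks the event up.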
Fluentd - Examples 
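<match apache2.access.reformed.geoip> 
  # Forward enriched events to the aggregator nodes, buffering on disk 
  type forward 
  flush_interval 5s 
  buffer_type file 
  buffer_queue_limit 512 
  buffer_chunk_limit 100M 
  buffer_path /opt/fluent/buffer/apache2/ 
  <server> 
    name hostname 
    host xxx.xxx.xxx.xxx 
    weight 10 
  </server> 
  ... 
  # Standby server: used only when the active servers are unavailable 
  <server> 
    name hostname 
    host xxx.xxx.xxx.xxx 
    standby 
  </server> 
  # Chunks that still fail after retrying are written to local files 
  <secondary> 
    type file 
    path /var/log/fluent/forward-failed/apache2/ 
  </secondary> 
</match>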
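
How `weight` and `standby` interact can be illustrated with a simplified Python model. This is not the out_forward algorithm: the `Server` class, the `available` flag (standing in for the heartbeat check) and `pick_server` are assumptions made for this sketch, and the default weight of 60 is assumed from the out_forward documentation.

```python
import random
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    weight: int = 60        # assumed default; the config above sets 10 explicitly
    standby: bool = False   # standby servers are used only as a fallback
    available: bool = True  # stands in for the result of the heartbeat check


def pick_server(servers, rng=random):
    """Pick an available active server by weight; fall back to standby servers."""
    active = [s for s in servers if s.available and not s.standby]
    pool = active or [s for s in servers if s.available and s.standby]
    if not pool:
        return None  # nothing reachable: this is where <secondary> would take over
    return rng.choices(pool, weights=[s.weight for s in pool], k=1)[0]
```

With one active and one standby server, the standby is returned only after the active server is marked unavailable; when both are down the function returns `None`, which corresponds to the case handled by the `<secondary>` file output.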
Fluentd - Examples 
<match apache2.access.**> 
  type copy 
  <store> 
    type file 
    path /opt/fluent/apache2/access 
    time_format %Y%m%dT%H%M%S%z 
    flush_interval 60s 
    append true 
    compress gzip 
    utc 
    num_threads 4 
    ... 
    ... 
  </store> 
  <store> 
    type datacounter 
    ... 
  </store> 
  <store> 
    type graphite 
    ... 
  </store> 
</match>
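
The `*` and `**` wildcards in the match patterns, and the fan-out done by `copy`, can be modelled in Python as follows. The function names are invented for this sketch, and only the two wildcards are handled (Fluentd's `{a,b}` alternation is left out).

```python
def tag_matches(pattern, tag):
    """True if a dot-separated tag matches a pattern using '*' and '**'."""
    return _match(pattern.split("."), tag.split("."))


def _match(pat, parts):
    if not pat:
        return not parts
    head, rest = pat[0], pat[1:]
    if head == "**":
        # "**" may consume zero or more tag parts
        return any(_match(rest, parts[i:]) for i in range(len(parts) + 1))
    if not parts:
        return False
    if head == "*" or head == parts[0]:
        # "*" consumes exactly one tag part
        return _match(rest, parts[1:])
    return False


def copy_output(stores, tag, time, record):
    """Hand the same event to every store, each with its own copy of the record."""
    for store in stores:
        store(tag, time, dict(record))
```

Note that `apache2.access.**` also matches `apache2.access.raw` and `apache2.access.reformed`. Fluentd evaluates `<match>` blocks in order and uses the first one that fits, so a broad pattern like this belongs after the more specific ones (or on a separate aggregator node).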
Fluentd - Testimonials
Questions? 
http://www.fluentd.org 
http://docs.fluentd.org/ 
http://fluentular.herokuapp.com/ 
https://rubygems.org/search?query=fluent-plugin- 
http://msgpack.org/
