fluent-plugin-beats at Elasticsearch meetup #14

Introduction of fluent-plugin-beats

Published in: Technology

  1. 1. Fluentd meets Beats Elasticsearch meetup #14 - Jan 7, 2015
  2. 2. Who are you? • Masahiro Nakagawa • github: @repeatedly • Treasure Data Inc. • Fluentd / td-agent developer • Fluentd Enterprise support • I love OSS :) • D Language, MessagePack, The organizer of several meetups, etc…
  3. 3. Beats • Agent for each purpose by Elastic • https://www.elastic.co/products/beats • official: topbeat, filebeat, packetbeat • 3rd party: dockerbeat, nginxbeat, etc… • Beats support several outputs: elasticsearch, logstash, stdout, etc. • The logstash output uses the lumberjack protocol, so we can use it for communicating with Beats.
  4. 4. Fluentd • Pluggable streaming event collector • Lightweight, robust and flexible • Lots of plugins on rubygems • Used by AWS, GCP, MS and more companies • Resources • http://www.fluentd.org/ • Webinar: https://www.youtube.com/watch?v=6uPB_M7cbYk
  5. 5. fluent-plugin-beats • Input plugin for Elastic Beats • https://github.com/repeatedly/fluent-plugin-beats • Uses the lumberjack protocol to handle events • Tested with topbeat, filebeat, packetbeat • Beats use the same event format, so it should work with 3rd-party Beats.
  6. 6. Configuration example
     <source>
       @type beats
       metadata_as_tag
       #format nginx # for filebeat
       #bind 0.0.0.0
       #port 5044
       #max_connections 10
       #tag beat.event
     </source>
     <match *beat>
       @type copy
       <store>
         @type elasticsearch_dynamic
         logstash_format true
         logstash_prefix ${tag_parts[0]}
         type_name ${record['type']}
       </store>
       <store>
         @type tdlog # for backup
       </store>
     </match>
     https://github.com/repeatedly/fluent-plugin-beats#configuration
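
To make the placeholders in the slide 6 configuration concrete, here is a minimal Python sketch of how `logstash_prefix ${tag_parts[0]}` and `type_name ${record['type']}` could resolve for one event. It assumes that `metadata_as_tag` makes the Beat name (for example `topbeat`) the Fluentd tag, and that `logstash_format true` produces index names of the form `prefix-YYYY.MM.DD`. The helper names, the sample record, and the date are hypothetical; this is not the plugin's actual code.

```python
from datetime import date


def index_name(tag: str, day: date) -> str:
    """Approximate the index chosen by `logstash_prefix ${tag_parts[0]}`."""
    tag_parts = tag.split(".")         # Fluentd tags are dot-separated
    prefix = tag_parts[0]              # first tag part, e.g. "topbeat"
    return f"{prefix}-{day:%Y.%m.%d}"  # assumed logstash-style daily index


def type_name(record: dict) -> str:
    """Approximate `type_name ${record['type']}`: read the event's own type."""
    return record["type"]


# Hypothetical event, assumed to be tagged with its Beat name.
tag = "topbeat"
record = {"type": "system", "load": {"load1": 0.42}}

print(index_name(tag, date(2020, 3, 5)))
print(type_name(record))
```

Under these assumptions, each Beat writes to its own daily index, and the document type follows the `type` field that the Beat put in the event.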
  7. 7. Result
  8. 8. Note: Performance • Tested on a single MacBook Pro, not 2 machines: 2.6 GHz Intel Core i7, 16 GB 1600 MHz DDR3 • Benchmark: read 100,000 nginx logs and count by flowcounter_simple
     fluentd with in_tail: 80,000 events/sec
     fluent-agent-hydra: 100,000+ events/sec
     filebeat: 18,000 events/sec
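
As a rough reading of the slide 8 figures, the sketch below converts each reported rate into the time needed to process the 100,000-line benchmark input. The rates are taken from the slide, and the "100,000+" figure is treated as exactly 100,000, so that row is an upper bound on time. The function name is hypothetical, and the results are plain arithmetic, not additional measurements.

```python
# Rates reported on the slide, in events per second.
RATES = {
    "fluentd with in_tail": 80_000,
    "fluent-agent-hydra": 100_000,  # slide says "100,000+", so this is a floor
    "filebeat": 18_000,
}
TOTAL_EVENTS = 100_000  # nginx log lines read in the benchmark


def seconds_to_process(total_events: int, events_per_sec: float) -> float:
    """Time needed to process `total_events` at a steady rate."""
    return total_events / events_per_sec


for name, rate in RATES.items():
    print(f"{name}: {seconds_to_process(TOTAL_EVENTS, rate):.2f} s")

# How many times faster fluentd with in_tail is than filebeat.
print(f"ratio: {RATES['fluentd with in_tail'] / RATES['filebeat']:.1f}x")
```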
  9. 9. Why is filebeat slow?
     1. Lumberjack protocol doesn’t focus on throughput • lumberjack sends/receives an ack on each record
        [Diagram: Lumberjack protocol; labels: data frame, Publish events, ack, ack]
     2. Beats framework is slow? [Issue #587] • filebeat is slower than logstash-forwarder
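
The first point on slide 9, that waiting for an ack on each record limits throughput, can be illustrated with a toy model. In this sketch the sender spends a fixed time per event and then blocks for one round trip per ack. The model, the function name, and the timings (10 µs per event, 100 µs per round trip) are all hypothetical assumptions for illustration; they are not measurements of filebeat or of any lumberjack implementation.

```python
def events_per_sec(send_cost_s: float, round_trip_s: float, events_per_ack: int) -> float:
    """Throughput when the sender blocks for one ack per `events_per_ack` events."""
    time_per_batch = events_per_ack * send_cost_s + round_trip_s
    return events_per_ack / time_per_batch


SEND_COST_S = 10e-6    # hypothetical time to serialize and write one event
ROUND_TRIP_S = 100e-6  # hypothetical time spent waiting for an ack

per_record = events_per_sec(SEND_COST_S, ROUND_TRIP_S, events_per_ack=1)
batched = events_per_sec(SEND_COST_S, ROUND_TRIP_S, events_per_ack=100)

print(f"ack per record:     {per_record:,.0f} events/sec")
print(f"ack per 100 events: {batched:,.0f} events/sec")
```

In this model the round trip dominates when every record is acked, so amortizing one ack over many events raises throughput until the per-event cost becomes the limit.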
  10. 10. Conclusion • Beats are useful for collecting various metrics • fluent-plugin-beats can handle Beats events and route them to Elasticsearch properly • Thanks, fluent-plugin-elasticsearch plugin ;) • Note that filebeat is slow, so it is not a good fit for high-volume environments • Use fluentd or fluent-agent-hydra instead
