Successfully reported this slideshow.
Your SlideShare is downloading. ×

Fluentd and Kafka

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Upcoming SlideShare
Apache Kafka
Apache Kafka
Loading in …3
×

Check these out next

1 of 14 Ad

More Related Content

Slideshows for you (20)

Similar to Fluentd and Kafka (20)

Advertisement

More from N Masahiro (20)

Recently uploaded (20)

Advertisement

Fluentd and Kafka

  1. 1. Fluentd and Kafka Hadoop / Spark Conference Japan 2016
 Feb 8, 2016
  2. 2. Who are you? • Masahiro Nakagawa • github: @repeatedly • Treasure Data Inc. • Fluentd / td-agent developer • Fluentd Enterprise support • I love OSS :) • D Language, MessagePack, The organizer of several meetups, etc…
  3. 3. Fluentd • Pluggable streaming event collector • Lightweight, robust and flexible • Lots of plugins on rubygems • Used by AWS, GCP, MS and more companies • Resources • http://www.fluentd.org/ • Webinar: https://www.youtube.com/watch?v=6uPB_M7cbYk
  4. 4. Popular case App Push Push Forwarder Aggregator Destination
  5. 5. • Distributed messaging system • Producer - Broker - Consumer pattern • Pull model, replication, etc
 
 
 
 
 
 Apache Kafka App Push Pull Producer Broker DestinationConsumer
  6. 6. Push vs Pull • Push: • Easy to transfer data to multiple destinations • Hard to control stream ratio in multiple streams • Pull: • Easy to control stream flow / ratio • Should manage consumers correctly
  7. 7. There are 2 ways • fluent-plugin-kafka • kafka-fluentd-consumer
  8. 8. fluent-plugin-kafka • Input / Output plugin for kafka • https://github.com/htgc/fluent-plugin-kafka • in_kafka, in_kafka_group, out_kafka, out_kafka_buffered • Pros • Easy to use and output support • Cons • Performance is not primary
  9. 9. Configuration example <source> @type kafka topics web,system format json add_prefix kafka. # more options </source> <match kafka.**> @type kafka_buffered output_data_type msgpack default_topic metrics compression_codec gzip required_acks 1 </match> https://github.com/htgc/fluent-plugin-kafka#usage
  10. 10. kafka fluentd consumer • Stand-alone kafka consumer for fluentd • https://github.com/treasure-data/kafka-fluentd-consumer • Send cosumed events to fluentd’s in_forward • Pros • High performance and Java API features • Cons • Need Java runtime
  11. 11. Run consumer • Edit log4j and fluentd-consumer properties • Run following command:
 $ java 
 -Dlog4j.configuration=file:///path/to/log4j.properties 
 -jar path/to/kafka-fluentd-consumer-0.2.1-all.jar 
 path/to/fluentd-consumer.properties
  12. 12. Properties example fluentd.tag.prefix=kafka.event. fluentd.record.format=regexp # default is json fluentd.record.pattern=(?<text>.*) # for regexp format fluentd.consumer.topics=app.* # can use Java Rege fluentd.consumer.topics.pattern=blacklist # default is whitelist fluentd.consumer.threads=5 https://github.com/treasure-data/kafka-fluentd-consumer/blob/master/config/fluentd-consumer.properties
  13. 13. With Fluentd example <source> @type forward </source> <source> @type exec command java - Dlog4j.configuration=file:///path/to/ log4j.properties -jar /path/to/kafka- fluentd-consumer-0.2.1-all.jar /path/ to/config/fluentd- consumer.properties tag dummy format json </source> https://github.com/treasure-data/kafka-fluentd-consumer#run-kafka-consumer-for-fluentd-via-in_exec
  14. 14. Conclusion • Kafka is now becomes important component
 on data platform • Fluentd can communicate with Kafka • Fluentd plugin and kafka consumer • Building reliable and flexible data pipeline with
 Fluentd and Kafka

×