SlideShare a Scribd company logo
1 of 14
Download to read offline
Fluentd and Kafka
Hadoop / Spark Conference Japan 2016

Feb 8, 2016
Who are you?
• Masahiro Nakagawa
• github: @repeatedly
• Treasure Data Inc.
• Fluentd / td-agent developer
• Fluentd Enterprise support
• I love OSS :)
• D Language, MessagePack, The organizer of several meetups, etc…
Fluentd
• Pluggable streaming event collector
• Lightweight, robust and flexible
• Lots of plugins on rubygems
• Used by AWS, GCP, MS and more companies
• Resources
• http://www.fluentd.org/
• Webinar: https://www.youtube.com/watch?v=6uPB_M7cbYk
Popular case
App
Push
Push
Forwarder Aggregator Destination
• Distributed messaging system
• Producer - Broker - Consumer pattern
• Pull model, replication, etc











Apache Kafka
App
Push
Pull
Producer Broker DestinationConsumer
Push vs Pull
• Push:
• Easy to transfer data to multiple destinations
• Hard to control stream ratio in multiple streams
• Pull:
• Easy to control stream flow / ratio
• Should manage consumers correctly
There are 2 ways
• fluent-plugin-kafka
• kafka-fluentd-consumer
fluent-plugin-kafka
• Input / Output plugin for kafka
• https://github.com/htgc/fluent-plugin-kafka
• in_kafka, in_kafka_group, out_kafka, out_kafka_buffered
• Pros
• Easy to use and output support
• Cons
• Performance is not primary
Configuration example
<source>
@type kafka
topics web,system
format json
add_prefix kafka.
# more options
</source>
<match kafka.**>
@type kafka_buffered
output_data_type msgpack
default_topic metrics
compression_codec gzip
required_acks 1
</match>
https://github.com/htgc/fluent-plugin-kafka#usage
kafka fluentd consumer
• Stand-alone kafka consumer for fluentd
• https://github.com/treasure-data/kafka-fluentd-consumer
• Send cosumed events to fluentd’s in_forward
• Pros
• High performance and Java API features
• Cons
• Need Java runtime
Run consumer
• Edit log4j and fluentd-consumer properties
• Run following command:

$ java 

-Dlog4j.configuration=file:///path/to/log4j.properties 

-jar path/to/kafka-fluentd-consumer-0.2.1-all.jar 

path/to/fluentd-consumer.properties
Properties example
fluentd.tag.prefix=kafka.event.
fluentd.record.format=regexp # default is json
fluentd.record.pattern=(?<text>.*) # for regexp format
fluentd.consumer.topics=app.* # can use Java Rege
fluentd.consumer.topics.pattern=blacklist # default is whitelist
fluentd.consumer.threads=5
https://github.com/treasure-data/kafka-fluentd-consumer/blob/master/config/fluentd-consumer.properties
With Fluentd example
<source>
@type forward
</source>
<source>
@type exec
command java -
Dlog4j.configuration=file:///path/to/
log4j.properties -jar /path/to/kafka-
fluentd-consumer-0.2.1-all.jar /path/
to/config/fluentd-
consumer.properties
tag dummy
format json
</source>
https://github.com/treasure-data/kafka-fluentd-consumer#run-kafka-consumer-for-fluentd-via-in_exec
Conclusion
• Kafka is now becomes important component

on data platform
• Fluentd can communicate with Kafka
• Fluentd plugin and kafka consumer
• Building reliable and flexible data pipeline with

Fluentd and Kafka

More Related Content

What's hot

Containers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red HatContainers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red HatAmazon Web Services
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for ExperimentationGleb Kanterov
 
Getting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesGetting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesDatabricks
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
 
Elastic Stack & Data pipeline
Elastic Stack & Data pipelineElastic Stack & Data pipeline
Elastic Stack & Data pipelineJongho Woo
 
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkSpark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkBo Yang
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 
Effectively-once semantics in Apache Pulsar
Effectively-once semantics in Apache PulsarEffectively-once semantics in Apache Pulsar
Effectively-once semantics in Apache PulsarMatteo Merli
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!Guido Schmutz
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
Building Microservices with Apache Kafka
Building Microservices with Apache KafkaBuilding Microservices with Apache Kafka
Building Microservices with Apache Kafkaconfluent
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeDatabricks
 
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin	Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin Vietnam Open Infrastructure User Group
 
Knative, Serverless on Kubernetes, and Openshift
Knative, Serverless on Kubernetes, and OpenshiftKnative, Serverless on Kubernetes, and Openshift
Knative, Serverless on Kubernetes, and OpenshiftChris Suszyński
 
Apache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and DevelopersApache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and Developersconfluent
 
ksqlDB로 실시간 데이터 변환 및 스트림 처리
ksqlDB로 실시간 데이터 변환 및 스트림 처리ksqlDB로 실시간 데이터 변환 및 스트림 처리
ksqlDB로 실시간 데이터 변환 및 스트림 처리confluent
 
KFServing - Serverless Model Inferencing
KFServing - Serverless Model InferencingKFServing - Serverless Model Inferencing
KFServing - Serverless Model InferencingAnimesh Singh
 
Reading The Source Code of Presto
Reading The Source Code of PrestoReading The Source Code of Presto
Reading The Source Code of PrestoTaro L. Saito
 

What's hot (20)

Containers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red HatContainers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red Hat
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Getting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on KubernetesGetting Started with Apache Spark on Kubernetes
Getting Started with Apache Spark on Kubernetes
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
Envoy and Kafka
Envoy and KafkaEnvoy and Kafka
Envoy and Kafka
 
Elastic Stack & Data pipeline
Elastic Stack & Data pipelineElastic Stack & Data pipeline
Elastic Stack & Data pipeline
 
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkSpark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Effectively-once semantics in Apache Pulsar
Effectively-once semantics in Apache PulsarEffectively-once semantics in Apache Pulsar
Effectively-once semantics in Apache Pulsar
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
The basics of fluentd
The basics of fluentdThe basics of fluentd
The basics of fluentd
 
Building Microservices with Apache Kafka
Building Microservices with Apache KafkaBuilding Microservices with Apache Kafka
Building Microservices with Apache Kafka
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
 
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin	Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
 
Knative, Serverless on Kubernetes, and Openshift
Knative, Serverless on Kubernetes, and OpenshiftKnative, Serverless on Kubernetes, and Openshift
Knative, Serverless on Kubernetes, and Openshift
 
Apache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and DevelopersApache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and Developers
 
ksqlDB로 실시간 데이터 변환 및 스트림 처리
ksqlDB로 실시간 데이터 변환 및 스트림 처리ksqlDB로 실시간 데이터 변환 및 스트림 처리
ksqlDB로 실시간 데이터 변환 및 스트림 처리
 
KFServing - Serverless Model Inferencing
KFServing - Serverless Model InferencingKFServing - Serverless Model Inferencing
KFServing - Serverless Model Inferencing
 
Reading The Source Code of Presto
Reading The Source Code of PrestoReading The Source Code of Presto
Reading The Source Code of Presto
 

Similar to Fluentd and Kafka

Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...
Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...
Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...Lucas Jellema
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
fluent-plugin-beats at Elasticsearch meetup #14
fluent-plugin-beats at Elasticsearch meetup #14fluent-plugin-beats at Elasticsearch meetup #14
fluent-plugin-beats at Elasticsearch meetup #14N Masahiro
 
What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December...
What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December...What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December...
What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December...Lucas Jellema
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017Monal Daxini
 
Current and Future of Apache Kafka
Current and Future of Apache KafkaCurrent and Future of Apache Kafka
Current and Future of Apache KafkaJoe Stein
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopEvans Ye
 
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...StreamNative
 
Fluentd at HKOScon
Fluentd at HKOSconFluentd at HKOScon
Fluentd at HKOSconN Masahiro
 
Technologies for Data Analytics Platform
Technologies for Data Analytics PlatformTechnologies for Data Analytics Platform
Technologies for Data Analytics PlatformN Masahiro
 
Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!Guido Schmutz
 
Automation + dev ops summit hail hydrate! from stream to lake
Automation + dev ops summit   hail hydrate! from stream to lakeAutomation + dev ops summit   hail hydrate! from stream to lake
Automation + dev ops summit hail hydrate! from stream to lakeTimothy Spann
 
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...ssuserd3a367
 
Hail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceHail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceTimothy Spann
 
AWS 2020 Year in Review reInvent ReCap
AWS 2020 Year in Review reInvent ReCapAWS 2020 Year in Review reInvent ReCap
AWS 2020 Year in Review reInvent ReCapCloudHesive
 
kafka-tutorial-cloudruable-v2.pdf
kafka-tutorial-cloudruable-v2.pdfkafka-tutorial-cloudruable-v2.pdf
kafka-tutorial-cloudruable-v2.pdfPriyamTomar1
 
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaSelf-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaGuido Schmutz
 
Building a company-wide data pipeline on Apache Kafka - engineering for 150 b...
Building a company-wide data pipeline on Apache Kafka - engineering for 150 b...Building a company-wide data pipeline on Apache Kafka - engineering for 150 b...
Building a company-wide data pipeline on Apache Kafka - engineering for 150 b...LINE Corporation
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeTimothy Spann
 

Similar to Fluentd and Kafka (20)

Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...
Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...
Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
fluent-plugin-beats at Elasticsearch meetup #14
fluent-plugin-beats at Elasticsearch meetup #14fluent-plugin-beats at Elasticsearch meetup #14
fluent-plugin-beats at Elasticsearch meetup #14
 
What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December...
What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December...What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December...
What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December...
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
 
Current and Future of Apache Kafka
Current and Future of Apache KafkaCurrent and Future of Apache Kafka
Current and Future of Apache Kafka
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...
 
Fluentd at HKOScon
Fluentd at HKOSconFluentd at HKOScon
Fluentd at HKOScon
 
Technologies for Data Analytics Platform
Technologies for Data Analytics PlatformTechnologies for Data Analytics Platform
Technologies for Data Analytics Platform
 
Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Automation + dev ops summit hail hydrate! from stream to lake
Automation + dev ops summit   hail hydrate! from stream to lakeAutomation + dev ops summit   hail hydrate! from stream to lake
Automation + dev ops summit hail hydrate! from stream to lake
 
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
 
Hail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceHail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open source
 
AWS 2020 Year in Review reInvent ReCap
AWS 2020 Year in Review reInvent ReCapAWS 2020 Year in Review reInvent ReCap
AWS 2020 Year in Review reInvent ReCap
 
kafka-tutorial-cloudruable-v2.pdf
kafka-tutorial-cloudruable-v2.pdfkafka-tutorial-cloudruable-v2.pdf
kafka-tutorial-cloudruable-v2.pdf
 
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaSelf-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
 
Building a company-wide data pipeline on Apache Kafka - engineering for 150 b...
Building a company-wide data pipeline on Apache Kafka - engineering for 150 b...Building a company-wide data pipeline on Apache Kafka - engineering for 150 b...
Building a company-wide data pipeline on Apache Kafka - engineering for 150 b...
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
 

More from N Masahiro

Fluentd Project Intro at Kubecon 2019 EU
Fluentd Project Intro at Kubecon 2019 EUFluentd Project Intro at Kubecon 2019 EU
Fluentd Project Intro at Kubecon 2019 EUN Masahiro
 
Fluentd v1 and future at techtalk
Fluentd v1 and future at techtalkFluentd v1 and future at techtalk
Fluentd v1 and future at techtalkN Masahiro
 
Fluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconFluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconN Masahiro
 
Fluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellFluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellN Masahiro
 
Fluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellFluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellN Masahiro
 
Presto changes
Presto changesPresto changes
Presto changesN Masahiro
 
Fluentd v0.14 Overview
Fluentd v0.14 OverviewFluentd v0.14 Overview
Fluentd v0.14 OverviewN Masahiro
 
Dive into Fluentd plugin v0.12
Dive into Fluentd plugin v0.12Dive into Fluentd plugin v0.12
Dive into Fluentd plugin v0.12N Masahiro
 
Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and FluentdN Masahiro
 
How to create Treasure Data #dotsbigdata
How to create Treasure Data #dotsbigdataHow to create Treasure Data #dotsbigdata
How to create Treasure Data #dotsbigdataN Masahiro
 
Fluentd v0.12 master guide
Fluentd v0.12 master guideFluentd v0.12 master guide
Fluentd v0.12 master guideN Masahiro
 
Fluentd and Embulk Game Server 4
Fluentd and Embulk Game Server 4Fluentd and Embulk Game Server 4
Fluentd and Embulk Game Server 4N Masahiro
 
Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015N Masahiro
 
Fluentd Unified Logging Layer At Fossasia
Fluentd Unified Logging Layer At FossasiaFluentd Unified Logging Layer At Fossasia
Fluentd Unified Logging Layer At FossasiaN Masahiro
 
Treasure Data and OSS
Treasure Data and OSSTreasure Data and OSS
Treasure Data and OSSN Masahiro
 
Fluentd - RubyKansai 65
Fluentd - RubyKansai 65Fluentd - RubyKansai 65
Fluentd - RubyKansai 65N Masahiro
 
Fluentd - road to v1 -
Fluentd - road to v1 -Fluentd - road to v1 -
Fluentd - road to v1 -N Masahiro
 
Fluentd: Unified Logging Layer at CWT2014
Fluentd: Unified Logging Layer at CWT2014Fluentd: Unified Logging Layer at CWT2014
Fluentd: Unified Logging Layer at CWT2014N Masahiro
 
SQL for Everything at CWT2014
SQL for Everything at CWT2014SQL for Everything at CWT2014
SQL for Everything at CWT2014N Masahiro
 
Can you say the same words even in oss
Can you say the same words even in ossCan you say the same words even in oss
Can you say the same words even in ossN Masahiro
 

More from N Masahiro (20)

Fluentd Project Intro at Kubecon 2019 EU
Fluentd Project Intro at Kubecon 2019 EUFluentd Project Intro at Kubecon 2019 EU
Fluentd Project Intro at Kubecon 2019 EU
 
Fluentd v1 and future at techtalk
Fluentd v1 and future at techtalkFluentd v1 and future at techtalk
Fluentd v1 and future at techtalk
 
Fluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconFluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at Kubecon
 
Fluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellFluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshell
 
Fluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellFluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshell
 
Presto changes
Presto changesPresto changes
Presto changes
 
Fluentd v0.14 Overview
Fluentd v0.14 OverviewFluentd v0.14 Overview
Fluentd v0.14 Overview
 
Dive into Fluentd plugin v0.12
Dive into Fluentd plugin v0.12Dive into Fluentd plugin v0.12
Dive into Fluentd plugin v0.12
 
Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and Fluentd
 
How to create Treasure Data #dotsbigdata
How to create Treasure Data #dotsbigdataHow to create Treasure Data #dotsbigdata
How to create Treasure Data #dotsbigdata
 
Fluentd v0.12 master guide
Fluentd v0.12 master guideFluentd v0.12 master guide
Fluentd v0.12 master guide
 
Fluentd and Embulk Game Server 4
Fluentd and Embulk Game Server 4Fluentd and Embulk Game Server 4
Fluentd and Embulk Game Server 4
 
Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015
 
Fluentd Unified Logging Layer At Fossasia
Fluentd Unified Logging Layer At FossasiaFluentd Unified Logging Layer At Fossasia
Fluentd Unified Logging Layer At Fossasia
 
Treasure Data and OSS
Treasure Data and OSSTreasure Data and OSS
Treasure Data and OSS
 
Fluentd - RubyKansai 65
Fluentd - RubyKansai 65Fluentd - RubyKansai 65
Fluentd - RubyKansai 65
 
Fluentd - road to v1 -
Fluentd - road to v1 -Fluentd - road to v1 -
Fluentd - road to v1 -
 
Fluentd: Unified Logging Layer at CWT2014
Fluentd: Unified Logging Layer at CWT2014Fluentd: Unified Logging Layer at CWT2014
Fluentd: Unified Logging Layer at CWT2014
 
SQL for Everything at CWT2014
SQL for Everything at CWT2014SQL for Everything at CWT2014
SQL for Everything at CWT2014
 
Can you say the same words even in oss
Can you say the same words even in ossCan you say the same words even in oss
Can you say the same words even in oss
 

Recently uploaded

Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your Queries
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your QueriesExploring ChatGPT Prompt Hacks To Maximally Optimise Your Queries
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your QueriesSanjay Willie
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 

Recently uploaded (20)

Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your Queries
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your QueriesExploring ChatGPT Prompt Hacks To Maximally Optimise Your Queries
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your Queries
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 

Fluentd and Kafka

  • 1. Fluentd and Kafka Hadoop / Spark Conference Japan 2016
 Feb 8, 2016
  • 2. Who are you? • Masahiro Nakagawa • github: @repeatedly • Treasure Data Inc. • Fluentd / td-agent developer • Fluentd Enterprise support • I love OSS :) • D Language, MessagePack, The organizer of several meetups, etc…
  • 3. Fluentd • Pluggable streaming event collector • Lightweight, robust and flexible • Lots of plugins on rubygems • Used by AWS, GCP, MS and more companies • Resources • http://www.fluentd.org/ • Webinar: https://www.youtube.com/watch?v=6uPB_M7cbYk
  • 5. • Distributed messaging system • Producer - Broker - Consumer pattern • Pull model, replication, etc
 
 
 
 
 
 Apache Kafka App Push Pull Producer Broker DestinationConsumer
  • 6. Push vs Pull • Push: • Easy to transfer data to multiple destinations • Hard to control stream ratio in multiple streams • Pull: • Easy to control stream flow / ratio • Should manage consumers correctly
  • 7. There are 2 ways • fluent-plugin-kafka • kafka-fluentd-consumer
  • 8. fluent-plugin-kafka • Input / Output plugin for kafka • https://github.com/htgc/fluent-plugin-kafka • in_kafka, in_kafka_group, out_kafka, out_kafka_buffered • Pros • Easy to use and output support • Cons • Performance is not primary
  • 9. Configuration example <source> @type kafka topics web,system format json add_prefix kafka. # more options </source> <match kafka.**> @type kafka_buffered output_data_type msgpack default_topic metrics compression_codec gzip required_acks 1 </match> https://github.com/htgc/fluent-plugin-kafka#usage
  • 10. kafka fluentd consumer • Stand-alone kafka consumer for fluentd • https://github.com/treasure-data/kafka-fluentd-consumer • Send cosumed events to fluentd’s in_forward • Pros • High performance and Java API features • Cons • Need Java runtime
  • 11. Run consumer • Edit log4j and fluentd-consumer properties • Run following command:
 $ java 
 -Dlog4j.configuration=file:///path/to/log4j.properties 
 -jar path/to/kafka-fluentd-consumer-0.2.1-all.jar 
 path/to/fluentd-consumer.properties
  • 12. Properties example fluentd.tag.prefix=kafka.event. fluentd.record.format=regexp # default is json fluentd.record.pattern=(?<text>.*) # for regexp format fluentd.consumer.topics=app.* # can use Java Rege fluentd.consumer.topics.pattern=blacklist # default is whitelist fluentd.consumer.threads=5 https://github.com/treasure-data/kafka-fluentd-consumer/blob/master/config/fluentd-consumer.properties
  • 13. With Fluentd example <source> @type forward </source> <source> @type exec command java - Dlog4j.configuration=file:///path/to/ log4j.properties -jar /path/to/kafka- fluentd-consumer-0.2.1-all.jar /path/ to/config/fluentd- consumer.properties tag dummy format json </source> https://github.com/treasure-data/kafka-fluentd-consumer#run-kafka-consumer-for-fluentd-via-in_exec
  • 14. Conclusion • Kafka is now becomes important component
 on data platform • Fluentd can communicate with Kafka • Fluentd plugin and kafka consumer • Building reliable and flexible data pipeline with
 Fluentd and Kafka