Fluentd ♥ MongoDB
                          Log Everything As JSON


                 Kazuki Ohta, CTO at Treasure Data, Inc.




MongoDB SV User Group, Tuesday, July 17, 2012
Self-Introduction
           •       Kazuki Ohta
                   >     twitter: @kzk_mover
                   >     github: kzk

           •       Treasure Data, Inc.
                   >     Chief Technology Officer; Founder
                   >     Original Fluentd Author @frsyuki is another co-founder.

           •       Open-Source Enthusiast
                   >     KDE, uim, Hadoop, memcached, Mozilla, Mongo, etc.
                   >     Fluentd rpm/deb package manager
Logging? Why?




Figure 1: Common Logging Purposes




                                                  Analytics

                                                  Error Notification

                                                  Recommendation


Figure 2: Types of Logs




                                           App Log

                                           Access Log
                                           (Apache, Rails, etc.)
                                           System Log
                                           (syslog etc.)
                                           Others
Fragile to format changes,
                         No type information,
                         No field names, etc.


                         From “Scaling Lessons learned at Dropbox”
About Fluentd




It's like syslogd, but uses JSON for log messages
Logs in JSON? Why?

                     1. Machine-Readable
                     > machines are going to be the main consumers of logs


                     2. Schema-Free
                     > you want to add/remove fields from logs at any time



    Write Logs for Machines, use JSON
    http://journal.paul.querna.org/articles/2011/12/26/log-for-machines-in-json/
Logs As TEXT
         "2011-04-01 host1 myapp: cmessage size=12MB user=me"


   Logs As JSON
                         2011-04-01 myapp.message {
                             "on_host": "host1",
                             "combined": true,
                             "size": 12000000,
                             "user": "me"
                         }

                         + Field Name
                         + No Custom Parser
                         + Type Information
                         + Schema Free

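To make the contrast concrete, here is a minimal Ruby sketch (not from the talk) that reads the same event both ways. The regular expression, the helper names `parse_text_line` and `parse_json_record`, and derived field names such as `app` and `kind` are assumptions made for illustration.

```ruby
require 'json'

# The text log line and JSON record from the slide.
TEXT_LINE   = '2011-04-01 host1 myapp: cmessage size=12MB user=me'
JSON_RECORD = '{"on_host": "host1", "combined": true, "size": 12000000, "user": "me"}'

# Text logs need a custom parser that is tied to one exact layout.
# (Hypothetical pattern, written only for this one line format.)
def parse_text_line(line)
  m = line.match(/\A(\S+) (\S+) (\S+): (\S+) size=(\d+)MB user=(\S+)\z/)
  raise ArgumentError, "unrecognized log line: #{line}" unless m

  {
    'date'    => m[1],
    'on_host' => m[2],
    'app'     => m[3],
    'kind'    => m[4],
    'size'    => m[5].to_i * 1_000_000, # the unit conversion is hard-coded
    'user'    => m[6]
  }
end

# JSON logs carry field names and types, so a generic parser is enough.
def parse_json_record(text)
  JSON.parse(text)
end

p parse_text_line(TEXT_LINE)
p parse_json_record(JSON_RECORD)
```

Appending a new field such as `ip=10.0.0.1` to the text line makes the hand-written pattern fail, while an extra key in the JSON record simply shows up in the parsed hash.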
           •       Website
                   >     http://fluentd.org/

           •       Community
                   >     http://github.com/fluent
                   >     16 committers across many organizations
                   >     web, game, enterprise

           •       Mailing list
                   >     Google groups

Fluentd Architecture




Fluentd: Log Format

                         Application -> Fluentd -> Storage

                         time:    2012-02-04 01:33:51
                         tag:     myapp.buylog
                         record:  {
                                      "user": "me",
                                      "path": "/buyItem",
                                      "price": 150,
                                      "referer": "/landing"
                                  }

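The three parts of an event (time, tag, record) can be modeled in a few lines of Ruby. This is only an illustrative sketch: the `Event` struct and its `to_line` method are hypothetical names, not part of Fluentd, and the timestamp is assumed to be UTC.

```ruby
require 'json'

# A Fluentd-style event: a tag used for routing, a timestamp, and a
# schema-free record. The Struct and to_line are illustrative only.
Event = Struct.new(:tag, :time, :record) do
  # Render the event the way the slide prints it: "time tag {json}".
  def to_line
    "#{time.getutc.strftime('%Y-%m-%d %H:%M:%S')} #{tag} #{JSON.generate(record)}"
  end
end

event = Event.new(
  'myapp.buylog',
  Time.utc(2012, 2, 4, 1, 33, 51),
  { 'user' => 'me', 'path' => '/buyItem', 'price' => 150, 'referer' => '/landing' }
)

puts event.to_line
```

Keeping the tag and time outside the record means routing and time-based handling never need to look inside the application's own fields.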
Fluentd: Plugins

            Inputs (each via a plug-in):    syslogd, Scribe, Application, File (tail)

            Fluentd:                        filter / buffer / routing

            Outputs (each via a plug-in):   SaaS, Storage, Fluentd

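Routing in Fluentd is driven by the event tag: each output is bound to a tag pattern such as `apache.access` or `**`. The sketch below is a simplified Ruby model of that idea, not Fluentd's own matcher. It assumes `*` matches exactly one tag part, `**` matches zero or more parts, and the first matching rule wins; the names `tag_match?`, `route`, and `RULES` and the destination labels are hypothetical.

```ruby
# Does a dot-separated tag match a pattern made of literal parts, "*" and "**"?
def tag_match?(pattern, tag)
  match_parts?(pattern.split('.'), tag.split('.'))
end

def match_parts?(pattern_parts, tag_parts)
  return tag_parts.empty? if pattern_parts.empty?

  head, *rest = pattern_parts
  case head
  when '**'
    # "**" may consume zero or more tag parts; try every possibility.
    (0..tag_parts.length).any? { |n| match_parts?(rest, tag_parts.drop(n)) }
  when '*'
    # "*" consumes exactly one tag part.
    !tag_parts.empty? && match_parts?(rest, tag_parts.drop(1))
  else
    # A literal part must be equal to the next tag part.
    tag_parts.first == head && match_parts?(rest, tag_parts.drop(1))
  end
end

# Ordered rules: the first pattern that matches decides the destination.
RULES = [
  ['apache.access', 'mongo'],
  ['**',            'forward']
].freeze

def route(rules, tag)
  rule = rules.find { |pattern, _destination| tag_match?(pattern, tag) }
  rule && rule.last
end

puts route(RULES, 'apache.access')
puts route(RULES, 'myapp.login')
```

Because rules are checked in order, the catch-all `**` rule has to come last, otherwise it would shadow the more specific one.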
           •       Client libraries
                   > Ruby
                   > Perl
                   > PHP
                   > Python
                   > Java
                   > ...

            Application (buffering) -> HTTP / TCP / UDS -> Fluentd

            Fluent.open("myapp")
            Fluent.event("login", {"user"=>38})
            #=> 2012-02-04 04:56:01 myapp.login    {"user":38}

Typical Log Collection by `rsync`

               App servers (Application -> File File File ...)  ->  Log server (File)

               Burst of traffic:  rsync consumes all bandwidth
               High latency:      must wait for a day
               Hard to analyze:   complex text parsers

Log Collection using Fluentd

               Fluentd, Fluentd, Fluentd  ->  Fluentd, Fluentd                   Realtime!

               ->  Hadoop / Hive,  MongoDB,  Amazon S3 / EMR                     Ready to Analyze!

Fluentd Case Study

               Ruby on Rails + Fluentd (on each server)  ->  Fluentd, Fluentd (routing)

                   PV logs             ->  Hadoop / Hive
                   User behavior logs  ->  MongoDB

      ✓    127 RoR servers
      ✓    100,000 msgs/sec
      ✓    120Mbps at peak
      ✓    1TB/day

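As a back-of-the-envelope check on these figures, the short Ruby calculation below relates them to each other. It assumes that the 100,000 msgs/sec and 120 Mbps figures describe the same peak moment, that 1 TB means 10^12 bytes, and that the daily volume is uncompressed; none of this is stated on the slide.

```ruby
MSGS_PER_SEC    = 100_000            # from the slide
PEAK_MBPS       = 120                # from the slide, megabits per second
BYTES_PER_DAY   = 1_000_000_000_000  # 1 TB/day, assuming 1 TB = 10^12 bytes
SECONDS_PER_DAY = 86_400

# Peak bandwidth in bytes per second, and the message size it implies.
PEAK_BYTES_PER_SEC    = PEAK_MBPS * 1_000_000 / 8
BYTES_PER_MSG_AT_PEAK = PEAK_BYTES_PER_SEC / MSGS_PER_SEC

# Average rate implied by the daily volume.
AVG_BYTES_PER_SEC = BYTES_PER_DAY.to_f / SECONDS_PER_DAY
AVG_MBPS          = AVG_BYTES_PER_SEC * 8 / 1_000_000

puts "peak: #{PEAK_BYTES_PER_SEC} bytes/sec, about #{BYTES_PER_MSG_AT_PEAK} bytes per message"
puts format('daily average: %.1f MB/sec (%.1f Mbps)', AVG_BYTES_PER_SEC / 1_000_000, AVG_MBPS)
```

Under these assumptions the peak is 15 MB/sec, or roughly 150 bytes per message, and 1 TB/day averages about 11.6 MB/sec (about 93 Mbps), which sits plausibly below the 120 Mbps peak.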
      # read logs from a file
      <source>
        type tail
        path /var/log/httpd.log
        format apache
        tag apache.access
      </source>

      # save access logs to MongoDB
      <match apache.access>
        type mongo
        host 127.0.0.1
      </match>

      # forward other logs to servers
      # (load-balancing + fail-over)
      <match **>
        type forward
        <server>
          host 192.168.0.11
          weight 20
        </server>
        <server>
          host 192.168.0.12
          weight 60
        </server>
      </match>

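The `weight` values in the forward section set the ratio in which events are spread across servers. The Ruby sketch below illustrates weighted selection with fail-over under simple assumptions; it is not Fluentd's actual implementation, and `traffic_share`, `pick_server`, and the `alive` callback are hypothetical names.

```ruby
# The two servers and weights from the configuration above.
SERVERS = [
  { host: '192.168.0.11', weight: 20 },
  { host: '192.168.0.12', weight: 60 }
].freeze

# Expected fraction of traffic per host when every server is healthy.
def traffic_share(servers)
  total = servers.sum { |s| s[:weight] }
  servers.map { |s| [s[:host], s[:weight].to_f / total] }.to_h
end

# Pick one host at random, proportionally to its weight, skipping any
# server that the (hypothetical) alive check reports as down.
def pick_server(servers, rng: Random.new, alive: ->(_host) { true })
  candidates = servers.select { |s| alive.call(s[:host]) }
  raise 'no server available' if candidates.empty?

  point = rng.rand(candidates.sum { |s| s[:weight] })
  candidates.each do |s|
    return s[:host] if point < s[:weight]

    point -= s[:weight]
  end
end

p traffic_share(SERVERS)
puts pick_server(SERVERS)
```

With weights 20 and 60 the first server receives a quarter of the traffic and the second three quarters; when one server is reported down, all traffic goes to the remaining one.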
Comparison




Scribe: log collector by Facebook

                         Frontend servers (scribe)  ->  Aggregator nodes (scribe)  ->  Hadoop HDFS

Scribe’s Pros & Cons
                • Pros.
                         • Fast (written in C++)
                • Cons.
                         • VERY HARD to install
                            • nightmare of boost, thrift, libhdfs, etc.
                         • Unstructured Logs
                            • logs must be parsed before analysis
                         • Hard to extend
                            • extending it requires recompiling C++ programs
                         • No longer maintained

Fluentd vs Scribe
                • Easy to install
                         • “gem install fluentd”
                         • Stable RPM and Deb packages
                           • http://packages.treasure-data.com/
                • Easy to write plugins
                         • you can use Ruby
                • Easy plugin distribution
                         • “gem search -rd fluent-plugin”


Flume: distributed log collector by Cloudera

           Physical Topology:    Flume Master
                                 Flume      Flume       Flume

           Logical Topology:     Hadoop HDFS

Flume’s Pros & Cons
                • Pros.
                         • Central master server manages all nodes
                • Cons.
                         • Difficult to understand
                            • logical topologies, physical servers, and the
                              configuration of the logical/physical mapping
                         • Difficult to configure
                            • replicated master servers, log servers and agents
                         • Big footprint
                            • 50,000 lines of Java

Fluentd vs Flume
                 • Easy to understand
                         • “syslogd that understands JSON”
                 • Easy to set up
                         • “sudo fluentd --setup && fluentd”
                 • Very small footprint
                         • small engine (3,000 lines) + plugins
                         • small, but battle-tested!
                 • Easy to configure

                                Fluentd               Scribe                Flume
          Installation          gem/rpm/deb           make                  jar/rpm/deb
          Footprint             3,000 lines of Ruby   8,000 lines of C++    50,000 lines of Java
          Plugin                Ruby                  N/A                   Java
          Plugin distribution   RubyGems.org          N/A                   N/A
          Master Server         No                    No                    Yes
          License               Apache License        Apache License        Apache License

Fluentd Plugin for MongoDB

fluent-plugin-mongo
                • Included within rpm/deb by default!
                         • http://github.com/fluent/fluent-plugin-mongo
                • #1 plugin among 50+ Fluentd plugins
                         • Logs As JSON. WHY NOT Put Them Into Mongo??
                         • http://fluentd.org/plugin/
                • Supports most of the MongoDB features
                         • Authentication
                         • ReplicaSet
                         • Capped Collection

• MongoDB Output Plugin
                     • Maintain JSON Structure
                     • Reliable Buffering
                     • Batch Insertion
                     • Handle Broken Records
                       • Ruby Driver #82

                     Application -> Fluentd (buffering) -> MongoDB, with authentication

                     MongoDB deployments shown:
                     • Single Instance (Capped or Not)
                     • ReplicaSet
                     • Sharding

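Buffering and batch insertion can be illustrated with a small in-memory model. The Ruby sketch below is not the plugin's code: `BatchBuffer` is a hypothetical class, and the block passed to it stands in for whatever call writes a batch of documents to MongoDB. Records are kept until a batch is full, written in one call, and retained for a later retry if the write raises an error.

```ruby
# Collects records and hands them to a sink block in batches.
class BatchBuffer
  attr_reader :pending

  def initialize(batch_size, &sink)
    @batch_size = batch_size
    @sink = sink       # stand-in for a batch insert into the database
    @pending = []
  end

  # Buffer one record and flush once a full batch has accumulated.
  def emit(record)
    @pending << record
    flush if @pending.size >= @batch_size
  end

  # Write all pending records in one call. Returns true on success and
  # false when the sink raised, in which case the records stay buffered.
  def flush
    return true if @pending.empty?

    @sink.call(@pending.dup)
    @pending.clear
    true
  rescue StandardError
    false
  end
end

inserted = []
buffer = BatchBuffer.new(2) { |batch| inserted << batch }
%w[a b c].each { |user| buffer.emit({ 'user' => user }) }
buffer.flush # write the final, partially filled batch
p inserted
```

A real buffer would also bound its size and persist to disk; this sketch only shows the batching and retry idea.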
• MongoDB Input Plugin
                     • Tailing Capped Collections

                     MongoDB (capped collection) -> Fluentd (buffering), with authentication

                     MongoDB deployments shown:
                     • Single Instance (Capped Collection)
                     • ReplicaSet (Capped Collection)

Realtime Analytics with Fluentd + MongoDB

               App + Fluentd (on each server)  ->  Fluentd, Fluentd (routing)

                   ->  Alert (Nagios, Zabbix, etc.)
                   ->  MongoDB  ->  query  ->  Charting

Realtime or Batch? No, BOTH!

               App + Fluentd (on each server)  ->  Fluentd, Fluentd (routing)

                   ->  Hadoop / Hive                      (batch)
                   ->  Amazon S3                          (archive)
                   ->  MongoDB  ->  query  ->  Charting   (realtime)

Intro of our company’s service: Treasure Data

               App + Fluentd (on each server)  ->  Fluentd, Fluentd (routing)

                   ->  Treasure Data: Hadoop-based Cloud Data Warehouse   (batch)
                   ->  MongoDB                                            (realtime)

Exercise: Apache Logs into MongoDB




Log File




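As a rough illustration of the first step of this exercise, the Ruby sketch below turns one Apache access log line into a tag, a time, and a JSON-ready record, which is the shape a document store such as MongoDB can accept directly. The regular expression, the sample line, and the field names are assumptions chosen for the example; Fluentd's own `apache` format may differ in detail.

```ruby
require 'json'
require 'time'

# Hypothetical pattern for the Apache common/combined log format.
APACHE_LINE = /\A(?<host>\S+) \S+ (?<user>\S+) \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+) \S+" (?<code>\d{3}) (?<size>\d+|-)(?: "(?<referer>[^"]*)" "(?<agent>[^"]*)")?\z/

# Made-up sample line, used only for this illustration.
SAMPLE_LINE = '127.0.0.1 - - [17/Jul/2012:10:15:30 -0700] "GET /buyItem HTTP/1.1" 200 1234 "/landing" "Mozilla/5.0"'

# Returns { tag:, time:, record: } or nil when the line does not match.
def parse_apache_line(line, tag = 'apache.access')
  m = APACHE_LINE.match(line)
  return nil unless m

  record = {
    'host'    => m[:host],
    'user'    => m[:user],
    'method'  => m[:method],
    'path'    => m[:path],
    'code'    => m[:code].to_i,                          # typed as a number
    'size'    => (m[:size] == '-' ? nil : m[:size].to_i),
    'referer' => m[:referer],
    'agent'   => m[:agent]
  }
  { tag: tag, time: Time.strptime(m[:time], '%d/%b/%Y:%H:%M:%S %z'), record: record }
end

event = parse_apache_line(SAMPLE_LINE)
puts "#{event[:time].getutc.iso8601} #{event[:tag]} #{JSON.generate(event[:record])}"
```

Once the line is a structured record, fields such as `code` and `path` can be queried directly instead of being re-parsed from text at analysis time.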
Conclusion
                • Log Everything as JSON
                         • Machine Readability
                         • Schema Freeness
                • MongoDB fits perfectly as a Fluentd backend
                         • Both use a JSON representation


More Related Content

What's hot

Cassandraとh baseの比較して入門するno sql
Cassandraとh baseの比較して入門するno sqlCassandraとh baseの比較して入門するno sql
Cassandraとh baseの比較して入門するno sqlYutuki r
 
PostgreSQL 15の新機能を徹底解説
PostgreSQL 15の新機能を徹底解説PostgreSQL 15の新機能を徹底解説
PostgreSQL 15の新機能を徹底解説Masahiko Sawada
 
[B23] PostgreSQLのインデックス・チューニング by Tomonari Katsumata
[B23] PostgreSQLのインデックス・チューニング by Tomonari Katsumata[B23] PostgreSQLのインデックス・チューニング by Tomonari Katsumata
[B23] PostgreSQLのインデックス・チューニング by Tomonari KatsumataInsight Technology, Inc.
 
[GKE & Spanner 勉強会] Cloud Spanner の技術概要
[GKE & Spanner 勉強会] Cloud Spanner の技術概要[GKE & Spanner 勉強会] Cloud Spanner の技術概要
[GKE & Spanner 勉強会] Cloud Spanner の技術概要Google Cloud Platform - Japan
 
PostgreSQL 12は ここがスゴイ! ~性能改善やpluggable storage engineなどの新機能を徹底解説~ (NTTデータ テクノ...
PostgreSQL 12は ここがスゴイ! ~性能改善やpluggable storage engineなどの新機能を徹底解説~ (NTTデータ テクノ...PostgreSQL 12は ここがスゴイ! ~性能改善やpluggable storage engineなどの新機能を徹底解説~ (NTTデータ テクノ...
PostgreSQL 12は ここがスゴイ! ~性能改善やpluggable storage engineなどの新機能を徹底解説~ (NTTデータ テクノ...NTT DATA Technology & Innovation
 
PostgreSQL Deep Internal
PostgreSQL Deep InternalPostgreSQL Deep Internal
PostgreSQL Deep InternalEXEM
 
PostgreSQL14の pg_stat_statements 改善(第23回PostgreSQLアンカンファレンス@オンライン 発表資料)
PostgreSQL14の pg_stat_statements 改善(第23回PostgreSQLアンカンファレンス@オンライン 発表資料)PostgreSQL14の pg_stat_statements 改善(第23回PostgreSQLアンカンファレンス@オンライン 発表資料)
PostgreSQL14の pg_stat_statements 改善(第23回PostgreSQLアンカンファレンス@オンライン 発表資料)NTT DATA Technology & Innovation
 
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)NTT DATA Technology & Innovation
 
Airflow를 이용한 데이터 Workflow 관리
Airflow를 이용한  데이터 Workflow 관리Airflow를 이용한  데이터 Workflow 관리
Airflow를 이용한 데이터 Workflow 관리YoungHeon (Roy) Kim
 
Improve PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGateImprove PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGateBobby Curtis
 
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -Yoshiyasu SAEKI
 
明日から使えるPostgre sql運用管理テクニック(監視編)
明日から使えるPostgre sql運用管理テクニック(監視編)明日から使えるPostgre sql運用管理テクニック(監視編)
明日から使えるPostgre sql運用管理テクニック(監視編)kasaharatt
 
Best Practices for Running PostgreSQL on AWS - DAT314 - re:Invent 2017
Best Practices for Running PostgreSQL on AWS - DAT314 - re:Invent 2017Best Practices for Running PostgreSQL on AWS - DAT314 - re:Invent 2017
Best Practices for Running PostgreSQL on AWS - DAT314 - re:Invent 2017Amazon Web Services
 
Apache Spark on Kubernetes入門(Open Source Conference 2021 Online Hiroshima 発表資料)
Apache Spark on Kubernetes入門(Open Source Conference 2021 Online Hiroshima 発表資料)Apache Spark on Kubernetes入門(Open Source Conference 2021 Online Hiroshima 発表資料)
Apache Spark on Kubernetes入門(Open Source Conference 2021 Online Hiroshima 発表資料)NTT DATA Technology & Innovation
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLWebinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLSeveralnines
 
これからLDAPを始めるなら 「389-ds」を使ってみよう
これからLDAPを始めるなら 「389-ds」を使ってみようこれからLDAPを始めるなら 「389-ds」を使ってみよう
これからLDAPを始めるなら 「389-ds」を使ってみようNobuyuki Sasaki
 
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~Miki Shimogai
 

What's hot (20)

Cassandraとh baseの比較して入門するno sql
Cassandraとh baseの比較して入門するno sqlCassandraとh baseの比較して入門するno sql
Cassandraとh baseの比較して入門するno sql
 
PostgreSQL 15の新機能を徹底解説
PostgreSQL 15の新機能を徹底解説PostgreSQL 15の新機能を徹底解説
PostgreSQL 15の新機能を徹底解説
 
[B23] PostgreSQLのインデックス・チューニング by Tomonari Katsumata
[B23] PostgreSQLのインデックス・チューニング by Tomonari Katsumata[B23] PostgreSQLのインデックス・チューニング by Tomonari Katsumata
[B23] PostgreSQLのインデックス・チューニング by Tomonari Katsumata
 
[GKE & Spanner 勉強会] Cloud Spanner の技術概要
[GKE & Spanner 勉強会] Cloud Spanner の技術概要[GKE & Spanner 勉強会] Cloud Spanner の技術概要
[GKE & Spanner 勉強会] Cloud Spanner の技術概要
 
PostgreSQL 12は ここがスゴイ! ~性能改善やpluggable storage engineなどの新機能を徹底解説~ (NTTデータ テクノ...
PostgreSQL 12は ここがスゴイ! ~性能改善やpluggable storage engineなどの新機能を徹底解説~ (NTTデータ テクノ...PostgreSQL 12は ここがスゴイ! ~性能改善やpluggable storage engineなどの新機能を徹底解説~ (NTTデータ テクノ...
PostgreSQL 12は ここがスゴイ! ~性能改善やpluggable storage engineなどの新機能を徹底解説~ (NTTデータ テクノ...
 
Fluentd vs. Logstash for OpenStack Log Management
Fluentd vs. Logstash for OpenStack Log ManagementFluentd vs. Logstash for OpenStack Log Management
Fluentd vs. Logstash for OpenStack Log Management
 
PostgreSQL Deep Internal
PostgreSQL Deep InternalPostgreSQL Deep Internal
PostgreSQL Deep Internal
 
PostgreSQL14の pg_stat_statements 改善(第23回PostgreSQLアンカンファレンス@オンライン 発表資料)
PostgreSQL14の pg_stat_statements 改善(第23回PostgreSQLアンカンファレンス@オンライン 発表資料)PostgreSQL14の pg_stat_statements 改善(第23回PostgreSQLアンカンファレンス@オンライン 発表資料)
PostgreSQL14の pg_stat_statements 改善(第23回PostgreSQLアンカンファレンス@オンライン 発表資料)
 
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
 
Airflow를 이용한 데이터 Workflow 관리
Airflow를 이용한  데이터 Workflow 관리Airflow를 이용한  데이터 Workflow 관리
Airflow를 이용한 데이터 Workflow 관리
 
Improve PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGateImprove PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGate
 
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
 
明日から使えるPostgre sql運用管理テクニック(監視編)
明日から使えるPostgre sql運用管理テクニック(監視編)明日から使えるPostgre sql運用管理テクニック(監視編)
明日から使えるPostgre sql運用管理テクニック(監視編)
 
Best Practices for Running PostgreSQL on AWS - DAT314 - re:Invent 2017
Best Practices for Running PostgreSQL on AWS - DAT314 - re:Invent 2017Best Practices for Running PostgreSQL on AWS - DAT314 - re:Invent 2017
Best Practices for Running PostgreSQL on AWS - DAT314 - re:Invent 2017
 
Apache Spark on Kubernetes入門(Open Source Conference 2021 Online Hiroshima 発表資料)
Apache Spark on Kubernetes入門(Open Source Conference 2021 Online Hiroshima 発表資料)Apache Spark on Kubernetes入門(Open Source Conference 2021 Online Hiroshima 発表資料)
Apache Spark on Kubernetes入門(Open Source Conference 2021 Online Hiroshima 発表資料)
 
Query logging with proxysql
Query logging with proxysqlQuery logging with proxysql
Query logging with proxysql
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLWebinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
 
これからLDAPを始めるなら 「389-ds」を使ってみよう
これからLDAPを始めるなら 「389-ds」を使ってみようこれからLDAPを始めるなら 「389-ds」を使ってみよう
これからLDAPを始めるなら 「389-ds」を使ってみよう
 
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
 
KeycloakでAPI認可に入門する
KeycloakでAPI認可に入門するKeycloakでAPI認可に入門する
KeycloakでAPI認可に入門する
 

Similar to Fluentd loves MongoDB, at MongoDB SV User Group, July 17, 2012

Fluentd: the missing log collector
Fluentd: the missing log collectorFluentd: the missing log collector
Fluentd: the missing log collectortd_kiyoto
 
Symfony2 and MongoDB
Symfony2 and MongoDBSymfony2 and MongoDB
Symfony2 and MongoDBPablo Godel
 
Symfony2 y MongoDB - deSymfony 2012





Fluentd loves MongoDB, at MongoDB SV User Group, July 17, 2012

  • 1. Fluentd ♥ MongoDB Log Everything As JSON Kazuki Ohta, CTO at Treasure Data, Inc. Tuesday, July 17, 2012
  • 2. Self-Introduction • Kazuki Ohta > twitter: @kzk_mover > github: kzk • Treasure Data, Inc. > Chief Technology Officer; Founder > Original Fluentd Author @frsyuki is another co-founder. • Open-Source Enthusiast > KDE, uim, Hadoop, memcached, Mozilla, Mongo, etc. > Fluentd rpm/deb package manager
  • 4. Figure 1: Common Logging Purposes: Analytics, Error Notification, Recommendation
  • 5. Figure 2: Types of Logs: App Log; Access Log (Apache, Rails, etc.); System Log (syslog etc.); Others
  • 6. From “Scaling Lessons learned at Dropbox”
  • 7. Fragile against format changes, no type information, no field names, etc. From “Scaling Lessons learned at Dropbox”
  • 9. It's like syslogd, but uses JSON for log messages
  • 10. Logs in JSON? Why? 1. Machine-Readable > machines are going to be the main consumers of logs 2. Schema-Free > you want to add/remove fields from logs at any time. Write Logs for Machines, use JSON: http://journal.paul.querna.org/articles/2011/12/26/log-for-machines-in-json/
  • 11. Logs As TEXT vs. Logs As JSON. JSON adds: + Field Name + No Custom Parser + Type Information + Schema Free
  • 12. Logs As TEXT: "2011-04-01 host1 myapp: cmessage size=12MB user=me". Logs As JSON: 2011-04-01 myapp.message { "on_host": "host1", "combined": true, "size": 12000000, "user": "me" }. JSON adds: + Field Name + No Custom Parser + Type Information + Schema Free
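
The contrast on slide 12 can be sketched in Ruby. This is a minimal illustration and not code from the talk: the regular expression and the method names `parse_text_log` and `parse_json_log` are assumptions made for this example. The text line needs a hand-written parser and yields only strings, while the JSON line parses with the standard library and keeps field names and types.

```ruby
require "json"

# Hypothetical pattern for the text line on slide 12.
# Any change to the log format silently breaks it.
TEXT_LOG_PATTERN = /\A(?<date>\S+) (?<host>\S+) (?<app>\w+): (?<event>\w+) size=(?<size>\S+) user=(?<user>\S+)\z/

# Parse the text form. Every captured value is a String ("12MB", not a number).
def parse_text_log(line)
  m = TEXT_LOG_PATTERN.match(line)
  return nil unless m
  { "on_host" => m[:host], "size" => m[:size], "user" => m[:user] }
end

# Parse the JSON form. No custom parser, and types survive.
def parse_json_log(line)
  JSON.parse(line)
end

text = parse_text_log("2011-04-01 host1 myapp: cmessage size=12MB user=me")
json = parse_json_log('{"on_host":"host1","combined":true,"size":12000000,"user":"me"}')

puts text["size"].class  # the text parser can only give back a String
puts json["size"].class  # the JSON parser gives back an Integer
```

Adding a new field to the JSON record needs no parser change, which is the "schema free" point on the slide.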
  • 13. http://fluentd.org/
  • 14. • Website > http://fluentd.org/ • Community > http://github.com/fluent > 16 committers across many organizations > web, game, enterprise • Mailing list > Google groups
  • 16. Fluentd: Log Format. Application → Fluentd → Storage
  • 17. Fluentd: Log Format. Application → Fluentd → Storage. Example event: 2012-02-04 01:33:51 myapp.buylog { "user": "me", "path": "/buyItem", "price": 150, "referer": "/landing" }
  • 18. Fluentd: Log Format. Application → Fluentd → Storage. Each event has a time (2012-02-04 01:33:51), a tag (myapp.buylog), and a record ({ "user": "me", "path": "/buyItem", "price": 150, "referer": "/landing" })
  • 19. Fluentd: Plugins. Application → Fluentd (filter / buffer / routing) → Storage
  • 20. Fluentd: Plugins. Application → Fluentd (filter / buffer / routing) → Plug-ins → SaaS, Storage, Fluentd
  • 22. Fluentd: Plugins. syslogd, Scribe, Application, File (tail) → Plug-ins → Fluentd (filter / buffer / routing) → Plug-ins → SaaS, Storage, Fluentd
  • 23. Client libraries > Ruby > Perl > PHP > Python > Java > ... Application → Buffering → HTTP / TCP / UDS → Fluentd
  • 24. Client libraries > Ruby > Perl > PHP > Python > Java > ... Application → Buffering → HTTP / TCP / UDS → Fluentd
        Fluent.open("myapp")
        Fluent.event("login", {"user"=>38})
        #=> 2012-02-04 04:56:01 myapp.login {"user":38}
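
How the two calls on slide 24 turn into one event line can be sketched as follows. `SketchLogger` is a hypothetical stand-in written for this note, not the API of any Fluentd client library. A real client would buffer the event and send it to Fluentd over HTTP, TCP, or a Unix domain socket, whereas this sketch only builds the printed form. The injectable `clock` is an assumption that keeps the example deterministic.

```ruby
require "json"

# Hypothetical logger: prefixes every label with the application tag
# and renders the event as "time tag record".
class SketchLogger
  def initialize(tag_prefix, clock: -> { Time.now.utc })
    @tag_prefix = tag_prefix
    @clock = clock
  end

  # Returns the event in the same textual form the slide shows.
  def event(label, record)
    time = @clock.call.strftime("%Y-%m-%d %H:%M:%S")
    "#{time} #{@tag_prefix}.#{label} #{JSON.generate(record)}"
  end
end

# Fixed clock so the line matches the timestamp used on the slide.
logger = SketchLogger.new("myapp", clock: -> { Time.utc(2012, 2, 4, 4, 56, 1) })
puts logger.event("login", { "user" => 38 })
# prints: 2012-02-04 04:56:01 myapp.login {"user":38}
```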
  • 25. Typical Log Collection by `rsync` • Burst of traffic > rsync consumes all bandwidth
  • 26. Typical Log Collection by `rsync`. App servers (Application → Files) → Log server • Burst of traffic > rsync consumes all bandwidth • High latency > must wait for a day • Hard to analyze > complex text parsers
  • 27. Log Collection using Fluentd. Fluentd, Fluentd, Fluentd → Fluentd, Fluentd. Realtime!
  • 28. Log Collection using Fluentd. Fluentd, Fluentd, Fluentd → Fluentd, Fluentd → Amazon S3 / EMR, Hadoop / Hive, MongoDB. Realtime! Ready to Analyze!
  • 29. Fluentd Case Study. Ruby on Rails servers, each with Fluentd → Fluentd, Fluentd (routing) → Hadoop / Hive, MongoDB. Logs collected: PV logs, user behavior logs. ✓ 127 RoR servers ✓ 100,000 msgs/sec ✓ 120Mbps at peak ✓ 1TB/day
  • 30.
        # read logs from a file
        <source>
          type tail
          path /var/log/httpd.log
          format apache
          tag apache.access
        </source>

        # save access logs to MongoDB
        <match apache.access>
          type mongo
          host 127.0.0.1
        </match>

        # forward other logs to servers
        # (load-balancing + fail-over)
        <match **>
          type forward
          <server>
            host 192.168.0.11
            weight 20
          </server>
          <server>
            host 192.168.0.12
            weight 60
          </server>
        </match>
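
The routing that the configuration on slide 30 expresses can be sketched in Ruby: match blocks are tried in order and the first pattern that matches the event's tag decides the output. This is a simplified illustration, not Fluentd's implementation. The names `tag_match?`, `route`, and `ROUTES` are assumptions for this example, and the matcher handles only exact tags, `*` for a single tag part, and a bare `**` for everything.

```ruby
# Ordered routing table mirroring the two match blocks on the slide.
ROUTES = [
  ["apache.access", :mongo],   # save access logs to MongoDB
  ["**",            :forward]  # forward all other logs
].freeze

# Simplified tag matcher: exact parts, "*" for one part, bare "**" for any tag.
def tag_match?(pattern, tag)
  return true if pattern == "**"
  source = pattern.split(".").map { |part|
    part == "*" ? "[^.]+" : Regexp.escape(part)
  }.join("\\.")
  Regexp.new("\\A#{source}\\z").match?(tag)
end

# First matching pattern wins; returns nil when nothing matches.
def route(tag)
  _pattern, output = ROUTES.find { |pattern, _output| tag_match?(pattern, tag) }
  output
end

puts route("apache.access")  # mongo
puts route("myapp.login")    # forward
```

Because order matters, placing the `**` entry first would send every event to the forward output and the MongoDB rule would never apply.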
  • 32. Scribe: log collector by Facebook. Frontend servers (scribe, scribe, scribe) → Aggregator nodes (scribe, scribe, scribe) → Hadoop HDFS
  • 33. Scribe’s Pros & Cons • Pros. > Fast (written in C++) • Cons. > VERY HARD to install: nightmare of boost, thrift, libhdfs, etc. > Unstructured logs: parsing is required before analysis > Hard to extend: recompiling C++ programs is required > No longer maintained
  • 34. Fluentd vs Scribe • Easy to install > “gem install fluentd” > Stable RPM and Deb packages > http://packages.treasure-data.com/ • Easy to write plugins > you can use Ruby • Easy plugin distribution > “gem search -rd fluent-plugin”
  • 35. Flume: distributed log collector by Cloudera. Physical Topology: Flume Master; Flume, Flume, Flume. Logical Topology: Hadoop HDFS
  • 36. Flume’s Pros & Cons • Pros. > Central master server manages all nodes • Cons. > Difficult to understand: logical topologies, physical servers, and the configuration of the logical/physical mapping > Difficult to configure: replicated master servers, log servers and agents > Big footprint: 50,000 lines of Java
  • 37. Fluentd vs Flume • Easy to understand > “syslogd that understands JSON” • Easy to set up > “sudo fluentd --setup && fluentd” • Very small footprint > small engine (3,000 lines) + plugins > small, but battle-tested! • Easy to configure
  • 38.
                             Fluentd              Scribe              Flume
        Installation         gem/rpm/deb          make                jar/rpm/deb
        Footprint            3,000 lines of Ruby  8,000 lines of C++  50,000 lines of Java
        Plugin               Ruby                 N/A                 Java
        Plugin distribution  RubyGems.org         N/A                 N/A
        Master Server        No                   No                  Yes
        License              Apache License       Apache License      Apache License
  • 40. fluent-plugin-mongo • Included within rpm/deb by default! > http://github.com/fluent/fluent-plugin-mongo • #1 plugin among 50+ Fluentd plugins > Logs As JSON. WHY NOT Put Them Into Mongo?? > http://fluentd.org/plugin/ • Supports most of the MongoDB features > Authentication > ReplicaSet > Capped Collection
  • 41. MongoDB Output Plugin • Maintain JSON Structure • Reliable Buffering • Batch Insertion • Handle Broken Records • Ruby Driver #82. Application → Fluentd (Buffering) → Authentication → MongoDB: Single Instance (Capped or Not), Sharding, ReplicaSet
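
The "Reliable Buffering" and "Batch Insertion" bullets of the output plugin slide can be illustrated with a small sketch. It is not the plugin's implementation: `BufferedOutput` is a hypothetical class, and the block passed to it stands in for whatever performs the bulk insert into MongoDB. Records accumulate until a batch is full, the whole batch is handed to the sink at once, and a failed write leaves the records in the buffer so a later flush can retry them.

```ruby
# Hypothetical buffered output: collects records and writes them in batches.
class BufferedOutput
  attr_reader :pending

  # The block is the sink; it receives an Array of records per flush.
  def initialize(batch_size:, &sink)
    @batch_size = batch_size
    @sink = sink
    @pending = []
  end

  # Buffer one record and flush once a full batch is available.
  def emit(record)
    @pending << record
    flush if @pending.size >= @batch_size
  end

  # Write everything buffered as one batch.
  # Returns false and keeps the records when the sink raises.
  def flush
    return true if @pending.empty?
    @sink.call(@pending.dup)
    @pending.clear
    true
  rescue StandardError
    false
  end
end

stored = []  # stands in for a MongoDB collection in this sketch
output = BufferedOutput.new(batch_size: 2) { |batch| stored << batch }
output.emit({ "user" => "me", "price" => 150 })
output.emit({ "user" => "you", "price" => 80 })
puts stored.length  # one batch holding both records
```

Writing in batches trades a little latency for far fewer round trips to the database, and keeping failed batches in the buffer is what lets short outages pass without losing logs.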
  • 43. MongoDB Input Plugin • Tailing Capped Collections. MongoDB: ReplicaSet (Capped Collection) or Single Instance (Capped Collection) → Authentication → Fluentd (Buffering)
  • 45. Realtime Analytics with Fluentd + MongoDB. App, App, App → Fluentd, Fluentd, Fluentd → Fluentd, Fluentd (routing) → MongoDB (query → Charting) and Alert (Nagios, Zabbix, etc.)
  • 46. Realtime or Batch? No, BOTH! App, App, App → Fluentd, Fluentd, Fluentd → Fluentd, Fluentd (routing) → Hadoop / Hive (batch), Amazon S3 (archive), MongoDB (realtime; query → Charting)
  • 47. Intro of our company’s service: Treasure Data. App, App, App → Fluentd, Fluentd, Fluentd → Fluentd, Fluentd (routing) → Treasure Data, a Hadoop-based Cloud Data Warehouse (batch), and MongoDB (realtime)
  • 48. Exercise: Apache Logs into MongoDB
  • 49. Log File
  • 52. Conclusion • Log Everything as JSON > Machine Readability > Schema Freeness • MongoDB fits perfectly as a Fluentd backend > Both use a JSON representation