Treasure Data
                      The architecture of data analytics PaaS on AWS



                                    Masahiro Nakagawa

                                   JAWS Days: 2013/03/16




Who are you?
          Masahiro Nakagawa
              • @repeatedly / masa@treasure-data.com


          Treasure Data, Inc.
              • Senior Software Engineer, since 2012/11

          Open Source projects
              •   D Programming Language
              •   MessagePack: D, Python, etc...
              •   Fluentd: Core, mongo, etc...
              •   etc...

                                                          2

Introduction to
          Treasure Data




Company Overview
          Silicon Valley-based Company
              • All Founders are Japanese
                      • Hironobu Yoshikawa
                      • Kazuki Ohta
                      • Sadayuki Furuhashi


          OSS Enthusiasts
              • MessagePack, Fluentd, etc.




                                             4

Investors
             Bill Tai
             Naren Gupta - Nexus Ventures, Director of Redhat, TIBCO
             Othman Laraki - Former VP Growth at Twitter
             James Lindenbaum, Adam Wiggins, Orion Henry - Heroku
              Founders
             Anand Babu Periasamy, Hitesh Chellani - Gluster Founders
             Yukihiro “Matz” Matsumoto - Creator of Ruby
             Dan Scheinman - Director of Arista Networks
             Jerry Yang - Founder of Yahoo!
             ... and 10+ more people
                                                                         5

Treasure Data = Cloud + Big Data
      Chart: database offerings positioned by deployment (On-Premise vs. Cloud)
      and data volume (up to vs. beyond roughly 1 billion entries / 10 TB):
          • Lightweight RDBMS and Enterprise RDBMS (e.g. DB2): on-premise, smaller data ($34B market)
          • Traditional Data Warehouse: on-premise, big data ($10B market)
          • Database-as-a-Service: cloud, smaller data
          • Big Data-as-a-Service: cloud + big data (Treasure Data's segment)

          © 2012 Forrester Research, Inc. Reproduction Prohibited                                              6

Why Cloud? ‘Time’ is Money
      Chart: customer value over time.
          • Ideal expectation: value grows steadily from the moment of sign-up (or PO).
          • Reality (on-premise): value arrives only after HW/SW selection, PoC and deployment,
            becomes obsolete over time, and recovers only at each upgrade.

                                                                         7

Big Data Adoption Stages
                       Stages, by intelligence sophistication (bottom to top):

                       Reporting (Treasure Data's FOCUS: 80% of needs)
                           • Standard Reports      What happened?
                           • Ad-hoc Reports        Where?
                           • Drill Down Query      Where exactly?
                           • Alerts                Error?
                       Analytics
                           • Statistical Analysis  Why?
                           • Predictive Analysis   What's a trend?
                           • Optimization          What's the best?
                                                                                8

Full Stack Support for Big Data Reporting

        Our best-in-class architecture and operations team ensure the
        integrity and availability of your data.

        Data from almost any source can be securely and reliably uploaded
        using td-agent in streaming or batch mode.

        Our SQL, REST, JDBC, ODBC and command-line interfaces support all
        major query tools and approaches.

        You can store gigabytes to petabytes of data efficiently and securely
        in our cloud-based columnar datastore.




                                                                       9

Vision: Single Analytics Platform for the World
                                                                   10

         Our Customers – Fortune Global 500 leaders and
         start-ups including:
         (customer logos)
                                                                   11




Treasure Data’s
          Service Architecture




Treasure Data = Collect + Store + Query
                                                                13

Example in AdTech: MobFox




           1. Europe’s largest independent mobile ad exchange.
           2. 20 billion ad impressions/month (circa Jan. 2013)
           3. Serving ads for 15,000+ mobile apps (circa Jan. 2013)
           4. Needed Big Data Analytics infrastructure ASAP.

                                                                  14

Two Weeks From Start to Finish!




                                                        15

Used AWS Products (1)
          RDS
              • Store user information, job status, etc...
              • Store metadata of our columnar database
              • Worker job queue and scheduler (perfectqueue / perfectsched)


          EC2
              • API servers
              • Hadoop clusters
              • Job workers
                      • Using Chef to deploy


                                                                16

Used AWS Products (2)
          ELB
              • Load balancing of API servers
              • Load balancing of td-agents


          S3
              • Columnar storage built on top of S3
                      • MessagePack columnar format
                      • realtime / archive storage
              • Our Result feature supports S3 output.

                  No EMR, SQS, or other AWS products!
                                                         17

Architecture Breakdown



      Data Collection
      • Increasing variety of data sources
      • No single data schema
      • Lack of streaming data collection method
      • 60% of Big Data project resource consumed here

      Data Store/Analytics
      • Remaining complexity in both traditional DWH and Hadoop (very slow time to market)
      • Challenges in scaling data volume and expanding cost

      Connectivity
      • Required to ensure connectivity with existing BI/visualization apps by JDBC, REST and ODBC
      • Output to other services, e.g. S3, RDBMS, etc.
                                                                                         18

1) Data Collection
          60% of BI project resource is consumed here
          Most ‘underestimated’ and ‘unsexy’ but MOST important
          Fluentd: OSS lightweight but robust Log Collector
              • http://fluentd.org/




                                                               19

Fluentd
                      the missing log collector



                               fluentd.org

                                                  20

In short
             Open sourced log collector written in Ruby
             Using rubygems ecosystem for plugins



                  It’s like syslogd, but
              uses JSON for log messages

                                                           21

        Example: Apache writes its access log; Fluentd tails the file, turns each
        line into an event, buffers events, and inserts them into Mongo.

        Raw log lines:
            127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / ...
            127.0.0.1 - - [11/Dec/2012:07:26:30] "GET / ...
            127.0.0.1 - - [11/Dec/2012:07:26:32] "GET / ...
            127.0.0.1 - - [11/Dec/2012:07:26:40] "GET / ...
            127.0.0.1 - - [11/Dec/2012:07:27:01] "GET / ...
            ...

        Resulting event (one per log line):
            Time    2012-02-04 01:33:51
            Tag     apache.log
            Record  { "host": "127.0.0.1", "method": "GET", "path": "/", ... }
                                                                                         22

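The pipeline above maps directly onto a short td-agent configuration. The sketch below is illustrative only: it assumes the fluent-plugin-mongo gem (one of the 3rd-party gems bundled with td-agent) and uses placeholder paths, hosts, and database names.

    # Minimal sketch: tail an Apache access log into MongoDB (placeholder values)
    <source>
      type tail                    # input plugin: follow the access log
      format apache                # parse each line into a structured record
      path /var/log/apache2/access_log
      pos_file /var/log/td-agent/apache_access.pos
      tag apache.log               # events are routed by this tag
    </source>

    <match apache.log>
      type mongo                   # output plugin from fluent-plugin-mongo
      host 127.0.0.1
      port 27017
      database apache
      collection access
      flush_interval 10s           # buffered events are flushed every 10 seconds
    </match>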
Architecture
              Pluggable Input:   Forward, HTTP, File tail, dstat, ...
              Pluggable Buffer:  Memory, File
              Pluggable Output:  Forward, File, Amazon S3, MongoDB, ...
                                                     23

Before Fluentd
              Server1, Server2, Server3, ...: each Application writes its logs locally,
              and the files are shipped in batches to a central Fluentd log server.

                                                High Latency!
                                                must wait for a day...
                                                                  24

After Fluentd
              Server1, Server2, Server3, ...: each Application sends its logs to a local
              Fluentd, which forwards them to aggregator Fluentd nodes.

                                                     In streaming!
                                                                       25

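A configuration sketch of that streaming topology, using Fluentd's built-in forward input and output plugins; the tags, host names, and paths are placeholders. Each application server runs a local Fluentd that spools events to disk and forwards them to two aggregator nodes.

    # On each application server: forward local events to the aggregators
    <match app.**>
      type forward
      buffer_type file                          # spool to disk so events survive restarts
      buffer_path /var/log/td-agent/buffer/forward
      flush_interval 5s
      <server>
        host aggregator1.example.com
        port 24224
      </server>
      <server>
        host aggregator2.example.com
        port 24224
        standby                                 # used only when the primary is unreachable
      </server>
    </match>

    # On each aggregator node: accept forwarded events
    <source>
      type forward
      port 24224
    </source>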
     Inputs:
       • Access logs: Apache
       • App logs: Frontend, Backend
       • System logs: syslogd
       • Databases
     Fluentd: filter / buffer / routing
     Outputs:
       • Alerting: Nagios
       • Analysis: MongoDB, MySQL, Hadoop
       • Archiving: Amazon S3
                                                             26

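This kind of routing is expressed with tag patterns in <match> sections and, where one stream feeds several destinations, the built-in copy output. A sketch with illustrative tags, hosts, and paths:

    # Copy one event stream to an analysis store and a local archive
    <match app.frontend.**>
      type copy
      <store>
        type mongo                  # analysis store (fluent-plugin-mongo)
        host 127.0.0.1
        database logs
        collection frontend
      </store>
      <store>
        type file                   # local archive; an S3 output is shown later in this deck
        path /var/log/td-agent/archive/frontend
      </store>
    </match>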
td-agent
             Open-sourced distribution package of Fluentd
             The ETL part of Treasure Data
             Includes useful components
                 • ruby, jemalloc, fluentd
                 • 3rd-party gems: td, mongo, webhdfs, etc...
                      •   the td plugin is for Treasure Data (see the sketch below)

             http://packages.treasure-data.com/



                                                                27

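A sketch of a typical td output configuration, similar to the default td-agent.conf; the API key is a placeholder, and the td.<database>.<table> tag convention is the one documented for the td plugin.

    # Ship events tagged td.<database>.<table> to Treasure Data
    <match td.*.*>
      type tdlog                               # output plugin bundled with td-agent
      apikey YOUR_TREASURE_DATA_API_KEY        # placeholder
      auto_create_table                        # create the table on first write
      buffer_type file
      buffer_path /var/log/td-agent/buffer/td
    </match>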
Treasure Data Service Architecture
                                                                 This!

                  Data sources (Apache, Apps, RDBMS, other data sources)
                      → td-agent → Treasure Data columnar data warehouse

                  Users and BI apps
                      → td-command / JDBC, REST → Query API
                      → Query Processing Cluster: HIVE (PIG to be supported),
                        run as MapReduce jobs over the columnar data warehouse
                                                                                                    28

AWS plugins
             S3
             SNS
             SQS
             DynamoDB
             forward-aws
             RDS
             Redshift
             CloudWatch
             Yet Another Cloud Watch
             CloudWatch Lite

             See http://fluentd.org/plugin/ for the full list.
                                                                29

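As one example, the S3 output plugin (fluent-plugin-s3) uploads buffered chunks to a bucket. A minimal sketch; the credentials, bucket, and paths are placeholders:

    # Archive events to Amazon S3 in hourly chunks (placeholder values)
    <match archive.**>
      type s3
      aws_key_id YOUR_AWS_KEY_ID
      aws_sec_key YOUR_AWS_SECRET_KEY
      s3_bucket my-log-archive
      path logs/                        # key prefix inside the bucket
      buffer_path /var/log/td-agent/buffer/s3
      time_slice_format %Y%m%d%H        # one chunk per hour
      time_slice_wait 10m               # wait for late events before uploading
    </match>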
2) Data Store / Analytics - Columnar Storage




                                                    30

Treasure Data Service Processing Flow
             Frontend → Job Queue → Worker → Hadoop clusters

             Applications push metrics to Fluentd (via local Fluentd), which feeds:
               • Treasure Data, for historical analysis
               • Librato Metrics, for realtime analysis
                 (sums up data by the minute; partial aggregation)
                                                                                        31

Structure of Columnar Storages

               import → Import Storage           bulk import → Bulk Import Storage

               Data lands in Realtime Storage (chunks keyed by hash):
                   23c82b0ba3405d4c15aa85d2190e
                   6d7b1482412ab14f0332b8aee119
                   8a7bc848b2791b8fd603c719e54f
                   0e3d402b17638477c9a7977e7dab
                   ...
               and is merged every 1 hour into Archive Storage (chunks keyed by time):
                   2013-03-15 00:23:00 912ec80
                   2013-03-16 00:01:00 277a259
                   ...
               SELECT ... queries read from both storages.
                                                                                        33

                      Layers (top to bottom):

                      Query Language
                      Query Execution
                      Columnar Data
                      Object Storage

                                 34

1/4: Compile SQL into MapReduce

                         SQL Statement
                                  SELECT COUNT(DISTINCT ip) FROM tbl;



                              Hive
                      SQL - to - MapReduce




                                                                   35

2/4: MapReduce is executed in parallel

                                                           SELECT COUNT(DISTINCT ip) FROM tbl;




                      cc2.8xlarge cluster compute instances (up to 100 nodes * 32 threads)



                                                                                                 36

3/4: Columnar Data Access

                                                              SELECT COUNT(DISTINCT ip) FROM tbl;




                      10Gbps Network




                                       Read ONLY the Required Part of Data


                                                                                                    37

4/4: Object-based Storage




                                     38

Data first, Schema later


            Raw data (JSON):   {"user":54, "name":"test", "value":"120", "host":"local"}

            Schema:            user:int      name:string      value:int      host:int

            SELECT returns:    54 (int)      "test" (string)  120 (int)      NULL

            Values are cast to the schema at query time: the string "120" becomes the
            integer 120, while "local" cannot be cast to int, so host comes back as NULL.
                                                                                           39

3) Connectivity

       Query:
           • td-command → REST API → Query API → Query Processing Cluster
           • BI apps → JDBC, ODBC Driver → Query API → Query Processing Cluster

       Result: query results can be delivered from Treasure Data Columnar Storage
               to a Web App, MySQL, S3, ...
                                                                              40

Multi-Tenancy
    All customers share the Hadoop clusters (Multi Data Centers)
    Resource Sharing (Burst Cores), Rapid Improvement, Ease of Upgrade

                      Datacenters A, B, C and D each run a Local FairScheduler.
                      A Global Scheduler handles job submission and plan changes,
                      allocating resources on demand across the datacenters.
                                                                                  41

Conclusion
          Treasure Data
              • Cloud-based Big Data analytics platform
              • Provides a "machete" for Big Data reporting

          Big Data processing
              • Collect / Store / Analytics / Visualization
                       (our focus!)

          AWS products we use
              • EC2, S3, RDS, ELB
              • We build Treasure Data-specific systems on top of AWS

                                                                 42

Big Data for the Rest of Us

                      www.treasure-data.com | @TreasureData




