Treasure Data
                      Hadoop meets Cloud with Multi-Tenancy


                                     Kazuki Ohta
                         Founder and CTO at Treasure Data, Inc.
                                  Hadoop User Group Japan
                                     k@treasure-data.com

                                         @kzk_mover




Friday, April 5, 13
Who are you?
          Kazuki Ohta (太田一樹)
              • @kzk_mover, k@treasure-data.com


          Treasure Data, Inc.
              • Chief Technology Officer, Founded July 2011

          Hadoop User Group Japan
              • One of the founders
              • “Hadoop徹底入門” (“A Thorough Introduction to Hadoop”)


          Open-Source Enthusiast
              • Hadoop, memcached, jemalloc, MongoDB, uim, etc.

Treasure Data = Cloud + Big Data
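     [Figure: market map by deployment model (Cloud vs. On-Premise) and data volume,
     with a dividing line at 1 billion entries or 10 TB.
     On-Premise: Lightweight RDBMS and Enterprise RDBMS (e.g. DB2), labeled as a $34B market;
     Traditional Data Warehouse, labeled as a $10B market.
     Cloud: Database-as-a-service at lower data volumes; Big Data-as-a-Service at higher data volumes.]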


          © 2012 Forrester Research, Inc. Reproduction Prohibited

What is the Problem?




Big Data? NoSQL?




Too Many Solutions




Hadoop Versions




                      Too Many Variations (+Eco System)




                           from http://marblejenka.blogspot.jp/2013/01/hadoop.html

Current Big Data Solutions: ‘Feature Creep’




                      http://en.wikipedia.org/wiki/Feature_creep

We need a Machete :)

                      EVERYTHING
                          with
                      ONE interface


                                         Simple & Discoverable
                                      Machete Design by James Lindenbaum
                                      Heroku Co-Founder
                                      http://www.youtube.com/watch?v=3BhDLm9jo5Y

‘Simplicity’ itself is a feature :)
                      by Anand Babu Periasamy
                       GlusterFS Co-Founder



Next Topic: Cloud?




http://www.saasblogs.com/saas/demystifying-the-cloud-where-do-saas-paas-and-other-acronyms-fit-in/


Battle Field of IaaS Vendors: SCM

          In the near future, most hardware buyers will not be individual
          companies but cloud vendors.

          [Figure: HW performance / price over time. Prices decrease with Moore’s Law;
          the curve for IaaS vendors is compared with the curve for on-premise buyers.
          Battle Field: Supply Chain Management.]



PaaS, SaaS:
                       IT is all about Operation


                      More Sleep, More Value

          With PaaS, you offload your development operations function and
          have the PaaS provider handle the tools and components required to
          deploy and manage applications reliably. - EngineYard



PaaS/SaaS Battle Field: ‘Time’ is Money
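    [Figure: customer value over time, starting at sign-up or PO. The ideal
    expectation is value right away. The on-premise reality delivers value only
    after HW/SW selection, PoC and deployment, becomes obsolete over time, and
    then needs an upgrade.]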




Introduction to Treasure Data


Company Overview




                        US team as of July 2012

Company Overview
          Silicon Valley-based Company
              • All Founders are Japanese
                      • Hironobu Yoshikawa
                      • Kazuki Ohta
                      • Sadayuki Furuhashi


          OSS Enthusiasts
              • MessagePack, Fluentd, etc.
              • Cloud native



         Our 50+ Customers – Fortune Global 500 leaders
         and start-ups including:




                      250 billion records / month
                               in Feb 2013

                        2 million jobs executed




Vision: Single Analytics Platform for the World
Investors
             Bill Tai
             Naren Gupta - Nexus Ventures, Director of Red Hat, TIBCO
             Othman Laraki - Former VP Growth at Twitter
             James Lindenbaum, Adam Wiggins, Orion Henry - Heroku Founders
             Anand Babu Periasamy, Hitesh Chellani - Gluster Founders
             Yukihiro “Matz” Matsumoto - Creator of Ruby
             Dan Scheinman - Director of Arista Networks
             + 10 more people
              • and... Jerry Yang, Founder of Yahoo!, where Hadoop was invented :)
                    Check out this morning’s (2013/01/21) Nikkei newspaper (日経新聞)!
Treasure Data’s
                      Philosophy and Architecture




Big Data Adoption Stages
                      Stage                    Question           Category
                      Optimization             What’s the best?   Analytics
                      Predictive Analysis      What’s a trend?    Analytics
                      Statistical Analysis     Why?               Analytics
                      Alerts                   Error?             Reporting
                      Drill Down Query         Where exactly?     Reporting
                      Ad-hoc Reports           Where?             Reporting
                      Standard Reports         What happened?     Reporting

                      Treasure Data’s FOCUS: Reporting (80% of needs)
                      (Stages are ordered by intelligence sophistication, highest first.)
Full Stack Support for Big Data Reporting

        Our best-in-class architecture and operations team ensure the
        integrity and availability of your data.

        Data from almost any source can be securely and reliably uploaded
        using td-agent in streaming or batch mode.

        Our SQL, REST, JDBC, ODBC and command-line interfaces support all
        major query tools and approaches.

        You can store gigabytes to petabytes of data efficiently and
        securely in our cloud-based columnar datastore.




Treasure Data = Collect + Store + Query
Example in AdTech: MobFox




           1. Europe’s largest independent mobile ad exchange.
           2. 20 billion imps/month (circa Jan. 2013)
           3. Serving ads for 15,000+ mobile apps (circa Jan. 2013)
           4. Needed Big Data Analytics infrastructure ASAP.

Two Weeks From Start to Finish!




Our Value was Proven :)
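    Our Value: Save Time!

    [Figure: customer value over time, starting at sign-up or PO. With a simple
    interface, value arrives right after sign-up. The on-premise reality delivers
    value only after HW/SW selection, PoC and deployment, becomes obsolete over
    time, and then needs an upgrade.]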




Architecture Breakdown



      Data Collection
      • Increasing variety of data sources
      • No single data schema
      • Lack of a streaming data collection method
      • 60% of Big Data project resources consumed

      Data Store/Analytics
      • Remaining complexity in both traditional DWH and Hadoop (very slow time to market)
      • Challenges in scaling data volume and expanding cost

      Connectivity
      • Required to ensure connectivity with existing BI/visualization/apps by JDBC, REST and ODBC




1) Data Collection
          60% of BI project resources are consumed here
          Most ‘underestimated’ and ‘unsexy’ but MOST important
          Fluentd: OSS lightweight but robust Log Collector
              • http://fluentd.org/
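
As a rough illustration of what a log collector does, here is a minimal sketch in Python. It is not Fluentd's implementation or API: the class `BufferedCollector`, its `flush` callback, and `chunk_size` are hypothetical names chosen for this example. The only detail taken from Fluentd is the event shape (a tag, a time, and a record). Events are buffered locally and shipped in chunks, and a chunk is only dropped from the buffer after the flush call returns.

```python
import json
import time
from typing import Callable, Dict, List, Optional


class BufferedCollector:
    """Hypothetical collector: buffers events and ships them in chunks."""

    def __init__(self, flush: Callable[[List[str]], None], chunk_size: int = 1000):
        self.flush = flush            # called with one chunk of JSON lines
        self.chunk_size = chunk_size  # number of events per chunk
        self.buffer: List[str] = []

    def emit(self, tag: str, record: Dict, ts: Optional[int] = None) -> None:
        # An event is a tag, a timestamp, and a schema-less record.
        event = {
            "tag": tag,
            "time": int(time.time()) if ts is None else ts,
            "record": record,
        }
        self.buffer.append(json.dumps(event, sort_keys=True))
        if len(self.buffer) >= self.chunk_size:
            self.drain()

    def drain(self) -> None:
        if not self.buffer:
            return
        chunk = list(self.buffer)
        self.flush(chunk)              # may raise; the buffer is kept for a retry
        del self.buffer[:len(chunk)]   # drop events only after a successful flush


# Usage: collect chunks in memory instead of uploading them.
uploaded: List[List[str]] = []
collector = BufferedCollector(uploaded.append, chunk_size=2)
collector.emit("web.access", {"path": "/", "status": 200}, ts=1)
collector.emit("web.access", {"path": "/login", "status": 302}, ts=2)
collector.drain()  # flush whatever is left, e.g. on shutdown
```

A small `chunk_size` behaves like streaming upload, a large one like batch upload; the record itself needs no fixed schema.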

                                           These talks will cover Fluentd :)
                                      15:40~ Log analysis system with Hadoop in livedoor 2013
                                        by Satoshi Tagomori @ NHN Japan

                                      16:30~ How to Collect Data into Hadoop (いかにしてHadoopにデータを集めるか)
                                        by Sadayuki Furuhashi @ Treasure Data, Inc.



2) Data Store / Analytics - Columnar Storage
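
The general idea behind columnar storage can be sketched in a few lines of Python. This is only an illustration of the layout, not Treasure Data's storage format; `to_columns` and the sample rows are made up for the example. Values are grouped per column instead of per row, so an aggregate over one field only has to read that field.

```python
from typing import Any, Dict, List

# Hypothetical schema-less log records (row-oriented).
rows: List[Dict[str, Any]] = [
    {"time": 1, "user": "a", "bytes": 120},
    {"time": 2, "user": "b", "bytes": 300},
    {"time": 3, "user": "a", "bytes": 80, "referer": "example.com"},
]


def to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot rows into one list per column; missing fields become None."""
    keys = sorted({key for row in rows for key in row})
    return {key: [row.get(key) for row in rows] for key in keys}


columns = to_columns(rows)

# An aggregate over "bytes" touches a single column, not every full row.
total_bytes = sum(columns["bytes"])
```

Values in one column tend to share a type and a value range, which is one common reason columnar layouts also compress well.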




3) Connectivity

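      [Figure: queries are submitted through td-command, the REST API, or the JDBC/ODBC
      driver (for BI apps) to the Query API, which runs them on the Query Processing
      Cluster over Treasure Data Columnar Storage. Results are delivered to a web app,
      MySQL, or Postgres.]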




Most Difficult Challenge: Multi-Tenancy
    All customers share the Hadoop clusters (4 Data Centers)
    Resource Sharing (Burst Cores), Rapid Improvement, Ease of Upgrade

    [Figure: job submissions and plan changes go to a Global Scheduler, which performs
    on-demand resource allocation across datacenters A, B, C and D. Each datacenter
    runs its own local FairScheduler.]
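
The two scheduling levels on this slide can be illustrated with a small Python sketch. It is an assumption-laden toy, not Treasure Data's scheduler and not Hadoop's FairScheduler code: `pick_datacenter`, `fair_share`, the tenant names, and the plan weights are all hypothetical. The global step picks the datacenter with the most free slots; the local step hands out slots one at a time to the tenant furthest below its weighted share, so idle capacity can be used by busy tenants (burst) without starving anyone.

```python
from typing import Dict


def pick_datacenter(free_slots: Dict[str, int]) -> str:
    """Global step: choose the datacenter with the most free slots.

    Ties go to the alphabetically first datacenter.
    """
    return max(sorted(free_slots), key=lambda dc: free_slots[dc])


def fair_share(capacity: int, demands: Dict[str, int],
               weights: Dict[str, float]) -> Dict[str, int]:
    """Local step: weighted fair split of `capacity` slots among tenants."""
    alloc = {tenant: 0 for tenant in demands}
    for _ in range(capacity):
        hungry = [t for t in demands if alloc[t] < demands[t]]
        if not hungry:
            break  # every demand is met; leftover slots stay free
        # The next slot goes to the tenant furthest below its weighted share.
        tenant = min(hungry, key=lambda t: (alloc[t] / weights[t], t))
        alloc[tenant] += 1
    return alloc


# Usage: tenant "c" is on a larger (hypothetical) plan with weight 2.
datacenter = pick_datacenter({"A": 3, "B": 7, "C": 7, "D": 0})
slots = fair_share(
    capacity=10,
    demands={"a": 2, "b": 10, "c": 10},
    weights={"a": 1.0, "b": 1.0, "c": 2.0},
)
```

In this toy model, a plan change is just a new weight for the tenant, which takes effect the next time slots are handed out.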


Conclusion
          Big Data is too complex
              • Needs Simplicity
              • Machete vs. Swiss Army Knife (Feature Creep)

          IT is changing
              • The value of Software itself is decreasing
              • Operation is the key

          Treasure Data = Cloud + Big Data
              • Currently Focusing on Big Data Reporting
              • Instant Value with Simple Interface

We’re hiring top talent. Please contact me :)
Appendix




Big Data Market Growth
         (average of IDC, Gartner and Wikibon stats)               Big Data Revenue Breakdown




                      CAGR 38%




     “In 2012…BI and Analytics are rated #1 priorities.”
              - Ravi Kalakota, Gartner

     “Big Data is the new definitive source of competitive advantage across all
      industries.”
              - Jeff Kelly, Wikibon

     “More than half a billion dollars in venture capital has been invested in
      new big data technology.”
              - Dan Vesset, IDC

Big Data Situation


  [Figure: customer value over time, starting at sign-up or PO. Curves are shown
  for Treasure Data, for AWS (RedShift, EMR), and for on-premise solutions
  (Software A, Software B), with obsolescence over time.]


Treasure Data Service Architecture
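    [Figure: data from users, Apache, apps, RDBMS and other data sources flows into
    the Treasure Data columnar data warehouse. Queries arrive through td-command
    (Hive; Pig to be supported) and through JDBC and REST from BI apps, pass through
    the Query API, and run as MapReduce jobs on the Query Processing Cluster.]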



Our Own Open Source technologies
   We are open source natives and proud of our heritage.
   We’ve contributed to Hibernate, Hadoop, Cassandra,
   Memcached, KDE, MongoDB among others.
   Our product reflects our deep commitment to the open-source
   community and is built on top of open source software we’ve
   authored and open sourced.
   •   Fluentd - a popular data collector daemon written in Ruby
       www.fluentd.org (leading users: SlideShare/LinkedIn, One Kings Lane)
   •   MessagePack - a fast, compact serializer
       www.msgpack.org (leading users: Pinterest, Redis)



                                 Substantial commitment
                             (Code, Packaging, Documentation,
                                       Sponsorship)




                            Tech marketing, Possible lead gen




Example in Web Industry




Example Use Case – MySQL to TD




Example Use Case – MySQL to TD




Big Data for the Rest of Us

                      www.treasure-data.com | @TreasureData




Friday, April 5, 13

More Related Content

Viewers also liked

Denodo Data Virtualization Platform architecture: Data Discovery and Data Gov...
Denodo Data Virtualization Platform architecture: Data Discovery and Data Gov...Denodo Data Virtualization Platform architecture: Data Discovery and Data Gov...
Denodo Data Virtualization Platform architecture: Data Discovery and Data Gov...
Denodo
 
STAC Summit 2014 - Building a multitenant Big Data infrastructure
STAC Summit 2014 - Building a multitenant Big Data infrastructureSTAC Summit 2014 - Building a multitenant Big Data infrastructure
STAC Summit 2014 - Building a multitenant Big Data infrastructure
Gord Sissons
 
Data-Ed Online: Data Architecture Requirements
Data-Ed Online: Data Architecture RequirementsData-Ed Online: Data Architecture Requirements
Data-Ed Online: Data Architecture Requirements
DATAVERSITY
 
Building a Global-Scale Multi-Tenant Cloud Platform on AWS and Docker: Lesson...
Building a Global-Scale Multi-Tenant Cloud Platform on AWS and Docker: Lesson...Building a Global-Scale Multi-Tenant Cloud Platform on AWS and Docker: Lesson...
Building a Global-Scale Multi-Tenant Cloud Platform on AWS and Docker: Lesson...
Felix Gessert
 
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
StampedeCon
 
Data Architecture Process in a BI environment
Data Architecture Process in a BI environmentData Architecture Process in a BI environment
Data Architecture Process in a BI environmentSasha Citino
 
Toward Better Multi-Tenancy Support from HDFS
Toward Better Multi-Tenancy Support from HDFSToward Better Multi-Tenancy Support from HDFS
Toward Better Multi-Tenancy Support from HDFS
DataWorks Summit/Hadoop Summit
 
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStreamIoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
gogo6
 
The big data value chain r1-31 oct13
The big data value chain r1-31 oct13The big data value chain r1-31 oct13
The big data value chain r1-31 oct13
Rei Lynn Hayashi
 
Filling the Data Lake
Filling the Data LakeFilling the Data Lake
Filling the Data Lake
DataWorks Summit/Hadoop Summit
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop Implementation
Hortonworks
 
Multi-tenant, Multi-cluster and Multi-container Apache HBase Deployments
Multi-tenant, Multi-cluster and Multi-container Apache HBase DeploymentsMulti-tenant, Multi-cluster and Multi-container Apache HBase Deployments
Multi-tenant, Multi-cluster and Multi-container Apache HBase Deployments
DataWorks Summit
 
Data Governance - Atlas 7.12.2015
Data Governance - Atlas 7.12.2015Data Governance - Atlas 7.12.2015
Data Governance - Atlas 7.12.2015
Hortonworks
 
Building the enterprise data architecture
Building the enterprise data architectureBuilding the enterprise data architecture
Building the enterprise data architectureCosta Pissaris
 
Enterprise Master Data Architecture
Enterprise Master Data ArchitectureEnterprise Master Data Architecture
Enterprise Master Data Architecture
Boris Otto
 
How to build a successful Data Lake
How to build a successful Data LakeHow to build a successful Data Lake
How to build a successful Data Lake
DataWorks Summit/Hadoop Summit
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data Governance
Hortonworks
 
Data Architecture for Data Governance
Data Architecture for Data GovernanceData Architecture for Data Governance
Data Architecture for Data GovernanceDATAVERSITY
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Hortonworks
 
Modern Data Architecture
Modern Data ArchitectureModern Data Architecture
Modern Data Architecture
Alexey Grishchenko
 

Viewers also liked (20)

Denodo Data Virtualization Platform architecture: Data Discovery and Data Gov...
Denodo Data Virtualization Platform architecture: Data Discovery and Data Gov...Denodo Data Virtualization Platform architecture: Data Discovery and Data Gov...
Denodo Data Virtualization Platform architecture: Data Discovery and Data Gov...
 
STAC Summit 2014 - Building a multitenant Big Data infrastructure
STAC Summit 2014 - Building a multitenant Big Data infrastructureSTAC Summit 2014 - Building a multitenant Big Data infrastructure
STAC Summit 2014 - Building a multitenant Big Data infrastructure
 
Data-Ed Online: Data Architecture Requirements
Data-Ed Online: Data Architecture RequirementsData-Ed Online: Data Architecture Requirements
Data-Ed Online: Data Architecture Requirements
 
Building a Global-Scale Multi-Tenant Cloud Platform on AWS and Docker: Lesson...
Building a Global-Scale Multi-Tenant Cloud Platform on AWS and Docker: Lesson...Building a Global-Scale Multi-Tenant Cloud Platform on AWS and Docker: Lesson...
Building a Global-Scale Multi-Tenant Cloud Platform on AWS and Docker: Lesson...
 
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
 
Data Architecture Process in a BI environment
Data Architecture Process in a BI environmentData Architecture Process in a BI environment
Data Architecture Process in a BI environment
 
Toward Better Multi-Tenancy Support from HDFS
Toward Better Multi-Tenancy Support from HDFSToward Better Multi-Tenancy Support from HDFS
Toward Better Multi-Tenancy Support from HDFS
 
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStreamIoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
 
The big data value chain r1-31 oct13
The big data value chain r1-31 oct13The big data value chain r1-31 oct13
The big data value chain r1-31 oct13
 
Filling the Data Lake
Filling the Data LakeFilling the Data Lake
Filling the Data Lake
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop Implementation
 
Multi-tenant, Multi-cluster and Multi-container Apache HBase Deployments
Multi-tenant, Multi-cluster and Multi-container Apache HBase DeploymentsMulti-tenant, Multi-cluster and Multi-container Apache HBase Deployments
Multi-tenant, Multi-cluster and Multi-container Apache HBase Deployments
 
Data Governance - Atlas 7.12.2015
Data Governance - Atlas 7.12.2015Data Governance - Atlas 7.12.2015
Data Governance - Atlas 7.12.2015
 
Building the enterprise data architecture
Building the enterprise data architectureBuilding the enterprise data architecture
Building the enterprise data architecture
 
Enterprise Master Data Architecture
Enterprise Master Data ArchitectureEnterprise Master Data Architecture
Enterprise Master Data Architecture
 
How to build a successful Data Lake
How to build a successful Data LakeHow to build a successful Data Lake
How to build a successful Data Lake
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data Governance
 
Data Architecture for Data Governance
Data Architecture for Data GovernanceData Architecture for Data Governance
Data Architecture for Data Governance
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
 
Modern Data Architecture
Modern Data ArchitectureModern Data Architecture
Modern Data Architecture
 

Similar to Hadoop meets Cloud with Multi-Tenancy

Treasure Data and Heroku
Treasure Data and HerokuTreasure Data and Heroku
Treasure Data and Heroku
Treasure Data, Inc.
 
Cloud Foundry the Open PaaS - OpenTour Austin Keynote
Cloud Foundry the Open PaaS - OpenTour Austin KeynoteCloud Foundry the Open PaaS - OpenTour Austin Keynote
Cloud Foundry the Open PaaS - OpenTour Austin KeynotePatrick Chanezon
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for Dummies
Rodney Joyce
 
Prince Building Tech Talk 12102012
Prince Building Tech Talk 12102012Prince Building Tech Talk 12102012
Prince Building Tech Talk 12102012Andy Parsons
 
IMCSummit 2015 - Day 1 Developer Track - Open-Source In-Memory Platforms: Ben...
IMCSummit 2015 - Day 1 Developer Track - Open-Source In-Memory Platforms: Ben...IMCSummit 2015 - Day 1 Developer Track - Open-Source In-Memory Platforms: Ben...
IMCSummit 2015 - Day 1 Developer Track - Open-Source In-Memory Platforms: Ben...
In-Memory Computing Summit
 
Talk about Hivemall at Data Scientist Organization on 2015/09/17
Talk about Hivemall at Data Scientist Organization on 2015/09/17Talk about Hivemall at Data Scientist Organization on 2015/09/17
Talk about Hivemall at Data Scientist Organization on 2015/09/17
Makoto Yui
 
CloudFoundry and MongoDb, a marriage made in heaven
CloudFoundry and MongoDb, a marriage made in heavenCloudFoundry and MongoDb, a marriage made in heaven
CloudFoundry and MongoDb, a marriage made in heaven
Patrick Chanezon
 
10 concepts the enterprise decision maker needs to understand about Hadoop
10 concepts the enterprise decision maker needs to understand about Hadoop10 concepts the enterprise decision maker needs to understand about Hadoop
10 concepts the enterprise decision maker needs to understand about Hadoop
Donald Miner
 
Rhat OSS - Cloudera - Mike Olson - Hadoop Data Analytics In The Cloud
Rhat OSS - Cloudera - Mike Olson - Hadoop Data Analytics In The CloudRhat OSS - Cloudera - Mike Olson - Hadoop Data Analytics In The Cloud
Rhat OSS - Cloudera - Mike Olson - Hadoop Data Analytics In The Cloud
Cloudera, Inc.
 
Comparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse PlatformsComparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse Platforms
David Portnoy
 
Hadoop, SQL & NoSQL: No Longer an Either-or Question
Hadoop, SQL & NoSQL: No Longer an Either-or QuestionHadoop, SQL & NoSQL: No Longer an Either-or Question
Hadoop, SQL & NoSQL: No Longer an Either-or Question
Tony Baer
 
Hadoop, SQL and NoSQL, No longer an either/or question
Hadoop, SQL and NoSQL, No longer an either/or questionHadoop, SQL and NoSQL, No longer an either/or question
Hadoop, SQL and NoSQL, No longer an either/or questionDataWorks Summit
 
8 douetteau - dataiku - data tuesday open source 26 fev 2013
8   douetteau - dataiku - data tuesday open source 26 fev 2013 8   douetteau - dataiku - data tuesday open source 26 fev 2013
8 douetteau - dataiku - data tuesday open source 26 fev 2013 Data Tuesday
 
Db tech show - hivemall
Db tech show - hivemallDb tech show - hivemall
Db tech show - hivemallMakoto Yui
 
Lightning Fast Dataframes with Polars
Lightning Fast Dataframes with PolarsLightning Fast Dataframes with Polars
Lightning Fast Dataframes with Polars
Alberto Danese
 
Hadoop Vs Spark — Choosing the Right Big Data Framework
Hadoop Vs Spark — Choosing the Right Big Data FrameworkHadoop Vs Spark — Choosing the Right Big Data Framework
Hadoop Vs Spark — Choosing the Right Big Data Framework
Alaina Carter
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013
Jonathan Seidman
 
Where The Cloud Things Are
Where The Cloud Things AreWhere The Cloud Things Are
Where The Cloud Things Are
Shunsaku Kudo
 
Backendasaservice apiomat
Backendasaservice apiomatBackendasaservice apiomat
Backendasaservice apiomatHeinrich Seeger
 

Hadoop meets Cloud with Multi-Tenancy

  • 1. Treasure Data: Hadoop meets Cloud with Multi-Tenancy. Kazuki Ohta, Founder and CTO at Treasure Data, Inc. Hadoop User Group Japan (Hadoopユーザー会). k@treasure-data.com, @kzk_mover. Friday, April 5, 13
  • 2. Who are you? Kazuki Ohta (太田一樹): @kzk_mover, k@treasure-data.com. Treasure Data, Inc.: Chief Technology Officer, founded July 2011. Hadoop User Group Japan: one of the founders; “Hadoop徹底入門”. Open-source enthusiast: Hadoop, memcached, jemalloc, MongoDB, uim, etc.
  • 3. Treasure Data = Cloud + Big Data. [Chart: axes On-Premise to Cloud and Data Volume (threshold: 1 billion entries or 10 TB). Labels: Database-as-a-service, Big Data-as-a-Service, Lightweight RDBMS, Enterprise RDBMS (DB2), Traditional Data Warehouse, $34B market, $10B market.] © 2012 Forrester Research, Inc. Reproduction Prohibited
  • 4. What is the Problem?
  • 5. Big Data? NoSQL?
  • 6. Too Many Solutions
  • 7. Hadoop Versions: Too Many Variations (+ Ecosystem), from http://marblejenka.blogspot.jp/2013/01/hadoop.html
  • 8. Current Big Data Solutions: ‘Feature Creep’. http://en.wikipedia.org/wiki/Feature_creep
  • 9. We need Machete :) EVERYTHING with ONE interface. Simple & Discoverable. “Machete Design” by James Lindenbaum, Heroku Co-Founder: http://www.youtube.com/watch?v=3BhDLm9jo5Y
  • 10. ‘Simplicity’ itself is a feature :) by Anand Babu Periasamy, GlusterFS Co-Founder
  • 11. Next Topic: Cloud?
  • 13. Battle Field of IaaS Vendors: SCM (Supply Chain Management). In the near future, most hardware buyers will be cloud providers, not individual companies. [Chart: HW Performance / Price over Time, for IaaS Vendors and On-Premise. Labels: “Decrease with Moore’s Law”, “Battle Field: Supply Chain Management”.]
  • 14. PaaS, SaaS: IT is all about Operation. More Sleep, More Value. “With PaaS, you offload your development operations function and have the PaaS provider handle the tools and components required to deploy and manage applications reliably.” - EngineYard
  • 15. PaaS/SaaS Battle Field: ‘Time’ is Money. [Chart: Customer Value over Time, starting at Sign-up or PO. Ideal Expectation versus Reality (On-Premise): HW/SW Selection, PoC, Deploy..., then Upgrade; value becomes Obsolete over time.]
  • 16. Introduction to Treasure Data
  • 17. Company Overview: US team as of July 2012
  • 18. Company Overview. Silicon Valley-based company: all founders are Japanese (Hironobu Yoshikawa, Kazuki Ohta, Sadayuki Furuhashi). OSS enthusiasts: MessagePack, Fluentd, etc.; cloud native.
  • 19. Our 50+ customers: Fortune Global 500 leaders and start-ups (customer logos). 250 billion records / month in Feb 2013. 2 million jobs executed.
  • 20. Vision: Single Analytics Platform for the World
  • 21. Investors
      • Bill Tai
      • Naren Gupta - Nexus Ventures, Director of Red Hat, TIBCO
      • Othman Laraki - Former VP Growth at Twitter
      • James Lindenbaum, Adam Wiggins, Orion Henry - Heroku Founders
      • Anand Babu Periasamy, Hitesh Chellani - Gluster Founders
      • Yukihiro “Matz” Matsumoto - Creator of Ruby
      • Dan Scheinman - Director of Arista Networks
      • + 10 more people
      • and.... Jerry Yang, Founder of Yahoo!, where Hadoop was invented :) Check out this morning’s (2013/01/21) 日経新聞 (Nikkei newspaper)!
  • 22. Treasure Data’s Philosophy and Architecture
  • 23. Big Data Adoption Stages (axes: Intelligence, Sophistication)
      • Analytics: Optimization (What’s the best?); Predictive Analysis (What’s a trend?); Statistical Analysis (Why?)
      • Reporting, Treasure Data’s FOCUS (80% of needs): Alerts (Error?); Drill Down Query (Where exactly?); Ad-hoc Reports (Where?); Standard Reports (What happened?)
  • 24. Full Stack Support for Big Data Reporting
      • Our best-in-class architecture and operations team ensure the integrity and availability of your data.
      • Data from almost any source can be securely and reliably uploaded using td-agent in streaming or batch mode.
      • Our SQL, REST, JDBC, ODBC and command-line interfaces support all major query tools and approaches.
      • You can store gigabytes to petabytes of data efficiently and securely in our cloud-based columnar datastore.
  • 25. Treasure Data = Collect + Store + Query
  • 26. Example in AdTech: MobFox. 1. Europe’s largest independent mobile ad exchange. 2. 20 billion impressions/month (circa Jan. 2013). 3. Serving ads for 15,000+ mobile apps (circa Jan. 2013). 4. Needed Big Data Analytics infrastructure ASAP.
  • 27. Two Weeks From Start to Finish!
  • 28. Our Value was Proven :) Our Value: Save Time! [Chart: Customer Value over Time, starting at Sign-up or PO. Simple Interface versus Reality (On-Premise): HW/SW Selection, PoC, Deploy..., then Upgrade; value becomes Obsolete over time.]
  • 29. Architecture Breakdown
      • Data Collection: increasing variety of data sources; no single data schema; lack of a streaming data collection method; 60% of Big Data project resources consumed here.
      • Data Store/Analytics: remaining complexity in both traditional DWH and Hadoop (very slow time to market); challenges in scaling data volume and expanding cost.
      • Connectivity: required to ensure connectivity with existing BI/visualization/apps by JDBC, REST and ODBC.
  • 30. 1) Data Collection
      • 60% of BI project resources are consumed here
      • Most ‘underestimated’ and ‘unsexy’ but MOST important
      • Fluentd: OSS lightweight but robust Log Collector (http://fluentd.org/)
      • These talks will cover Fluentd :) 15:40~ “Log analysis system with Hadoop in livedoor 2013” by Satoshi Tagomori @ NHN Japan; 16:30~ “いかにしてHadoopにデータを集めるか” (How to collect data into Hadoop) by Sadayuki Furuhashi @ Treasure Data, Inc.
  • 31. 2) Data Store / Analytics - Columnar Storage
  • 32. 3) Connectivity. [Diagram: td-command, REST API, JDBC/ODBC Driver, BI apps and Web App send queries to the Query API; the Query Processing Cluster runs them against Treasure Data Columnar Storage; results can be written to MySQL or Postgres.]
  • 33. Most Difficult Challenge: Multi-Tenancy
      • All customers share the Hadoop clusters (4 data centers)
      • Resource Sharing (Burst Cores), Rapid Improvement, Ease of Upgrade
      • [Diagram: Job Submission + Plan Change go to a Global Scheduler, which performs On-Demand Resource Allocation across a Local FairScheduler in each of datacenters A, B, C and D.]
  • 34. Conclusion
      • Big Data is too complex: needs simplicity; Machete vs. Swiss Army Knife (Feature Creep)
      • IT is changing: the value of software itself is decreasing; operation is the key
      • Treasure Data = Cloud + Big Data: currently focusing on Big Data Reporting; instant value with a simple interface
  • 35. We’re Hiring Top Talents, please contact me :)
  • 36. Appendix 18
  • 37. Big Data Market Growth (average of IDC, Gartner and Wikibon stats). Big Data Revenue Breakdown: CAGR 38%
      • “In 2012…BI and Analytics are rated #1 priorities.” - Ravi Kalakota, Gartner
      • “Big Data is the new definitive source of competitive advantage across all industries.” - Jeff Kelly, Wikibon
      • “More than half a billion dollars in venture capital has been invested in new big data technology.” - Dan Vessett, IDC
  • 38. Big Data Situation. [Chart: Customer Value over Time, starting at Sign-up or PO. Labels: Treasure Data; AWS (RedShift, EMR); On-premise solutions (Software A, Software B); Obsolescence over time.]
  • 39. Treasure Data Service Architecture. [Diagram: data sources (Apache, App, RDBMS, other data sources) feed the Treasure Data columnar data warehouse; MapReduce jobs: Hive, Pig (to be supported); users reach the Query API and Query Processing Cluster via td-command, JDBC, REST and BI apps.]
  • 40. Our Own Open Source technologies. We are open source natives and proud of our heritage. We’ve contributed to Hibernate, Hadoop, Cassandra, Memcached, KDE, MongoDB among others. Our product reflects our deep commitment to the open-source community and is built on top of open source software we’ve authored and open sourced.
      • Fluentd - a popular data collector daemon written in Ruby, www.fluentd.org (leading users: SlideShare/LinkedIn, One Kings Lane)
      • MessagePack - a fast, compact serializer, www.msgpack.org (leading users: Pinterest, Redis)
      • Substantial commitment (Code, Packaging, Documentation, Sponsorship). Tech marketing, possible lead gen.
  • 41. Example in Web Industry
  • 42. Example Use Case – MySQL to TD
  • 43. Example Use Case – MySQL to TD
  • 44. Big Data for the Rest of Us. www.treasure-data.com | @TreasureData
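
Slide 33 describes a Global Scheduler that places jobs onto per-datacenter Local FairSchedulers. The slides do not show how placement is decided, so the following is only a minimal sketch under stated assumptions: the class names, the "most free cores wins" placement rule, and the equal-split fair share are all hypothetical illustrations, not Treasure Data's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Datacenter:
    """One shared Hadoop cluster (hypothetical model, not the real system)."""
    name: str
    total_cores: int
    running: dict = field(default_factory=dict)  # tenant -> cores in use

    def free_cores(self) -> int:
        return self.total_cores - sum(self.running.values())

    def fair_share(self, tenants: list) -> dict:
        """Assumed local policy: split all cores evenly among active tenants."""
        if not tenants:
            return {}
        share = self.total_cores // len(tenants)
        return {tenant: share for tenant in tenants}


class GlobalScheduler:
    """Assumed global policy: send each job to the datacenter with the most free cores."""

    def __init__(self, datacenters):
        self.datacenters = list(datacenters)

    def submit(self, tenant: str, cores: int) -> str:
        target = max(self.datacenters, key=lambda dc: dc.free_cores())
        if target.free_cores() < cores:
            raise RuntimeError("no datacenter has enough free cores")
        target.running[tenant] = target.running.get(tenant, 0) + cores
        return target.name


# Four datacenters, as on the slide; the core counts are made up.
scheduler = GlobalScheduler(
    [Datacenter("A", 100), Datacenter("B", 80), Datacenter("C", 60), Datacenter("D", 40)]
)
print(scheduler.submit("tenant-1", 50))  # A
print(scheduler.submit("tenant-2", 30))  # B
```

The point of the two-level split is that tenants never choose a cluster themselves: the global layer can shift load between datacenters, while each local scheduler only has to divide its own cores among the tenants it currently hosts.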

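
Slides 24 and 31 mention a columnar datastore without showing what "columnar" means in practice. As a rough illustration only, the sketch below pivots a few row-oriented records into per-field columns; the event fields and the `to_columns` helper are hypothetical and say nothing about Treasure Data's actual storage format.

```python
# Hypothetical page-view events; field names are illustrative only.
rows = [
    {"time": 1, "path": "/", "bytes": 512},
    {"time": 2, "path": "/about", "bytes": 2048},
    {"time": 3, "path": "/", "bytes": 1024},
]


def to_columns(records):
    """Pivot row dicts into one list per field (assumes every row has the same keys)."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# A reporting query such as SUM(bytes) only has to read the "bytes" column,
# instead of scanning every field of every row.
total_bytes = sum(columns["bytes"])
print(total_bytes)  # 3584
```

Reporting workloads usually aggregate a few fields over many records, which is why a layout that stores each field contiguously tends to suit them.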
Editor's Notes

  1. ContextLogic – Company behind Wish.com – Rapidly growing social rewards platform. MobFox – Europe’s largest mobile advertising network. Cookpad – Japan’s largest recipe discovery service. Getjar – World’s largest free app store. Viki.com – Global video streaming & sharing site. Splurgy – Socially enabled universal promotions platform.