WELCOME
Conference Highlights

    • Four exciting keynotes
    • Lots of networking opportunities
    • Sixty educational sessions




2                  ©2011 Cloudera, Inc. All Rights Reserved. Confidential.
                  Reproduction or redistribution without written permission is
                                          prohibited.
Thank You Sponsors
      PLATINUM SPONSORS                                                             GOLD SPONSORS




       SILVER SPONSORS                                                           BRONZE SPONSORS




Housekeeping Items

    • Connecting to the internet
      – Wireless network = Sheraton Meeting
      – Code = Vertica
    • Hashtag = #hw2011
    • Take the surveys
      – Breakout sessions
      – Overall survey



Mike Olson
Chief Executive Officer
Cloudera
Three Years Ago…

    We said: Hadoop is going to be huge.


This year’s conference:
    • 1,400 people from 580 companies in
      27 countries and 40 of the United States
    • 75.7% attending Hadoop World for the
      first time
    • 71.9% using Hadoop
    • 66.5% engineers, developers and
      architects, 33.5% non-technical
      business roles
    • Just over 50 of you are “data scientists”




Three Years Ago…

    We said: Hadoop is going to be huge.


Your Hadoop usage:
    • Less than one year: 36.8%
    • One to two years: 32.3%
    • Two to three years: 16.8%
    • More than three years: 12%
    • Average usage is 17.4 months this year,
      versus 8.76 months at last year’s
      Hadoop World




Three Years Ago…

    We said: Hadoop is going to be huge.


Your clusters:
    • Average size is 120 nodes, up from
      66 last year
    • 44% between 10 and 100 nodes, 52%
      between 100 and 1,000 nodes
    • Total of 202 petabytes under management
      (60 last year)
    • Largest cluster bigger than 20PB
    • 13.1% bigger than 100TB
    • 12.8% bigger than 1PB


Two Years Ago…
We said: Hadoop is at the center of a new platform for big data.

    • Hadoop
    • HBase
    • Pig
    • Zookeeper
    • Mahout
    • Hive
    • Avro
    • Whirr
    • Sqoop
    • Hcatalog
    • MRUnit
    • Bigtop
    • Oozie


Two Years Ago…

  We said: Hadoop is at the center of a new platform for big data.
Core Hadoop as % of new contributions:
    • 2006: 100%
    • 2007: 100%
    • 2008: 37%
    • 2009: 58%
    • 2010: 37%
    • 2011: 31%

Relevant projects:
    • 2006: Core Hadoop
    • 2007: Core Hadoop
    • 2008: Core Hadoop, HBase, Zookeeper, Mahout
    • 2009: Core Hadoop, HBase, Pig, Zookeeper, Mahout, Hive
    • 2010: Core Hadoop, HBase, Pig, Zookeeper, Mahout, Hive, Avro, Whirr, Sqoop
    • 2011: Core Hadoop, HBase, Pig, Zookeeper, Mahout, Hive, Avro, Whirr, Sqoop, Bigtop, …



Last Year…
We said: Hadoop must integrate with data center infrastructure and tools.

    • Enterprises need software and support that de-risk and simplify the
      operation of Hadoop in production
    • Must build on the open source platform to deliver all the innovation
      and value created by the global Apache Hadoop ecosystem

Hadoop Operations




Last Year…
We said: Hadoop must integrate with data center infrastructure and tools.

    • Operators: Management Tools
    • Engineers: IDEs
    • Analysts: BI / Analytics
    • Business Users: Enterprise Reporting
    • Customers: Web Application
    • Enterprise Data Warehouse
    • Data sources: Logs, Files, Web Data, Relational Databases




This Year…

We’re talking about the future.




Building Applications
    Develop personalized applications on Hadoop and HBase
    Get it at: http://fonedoktor.com

    • Battery Analysis: Available Today…
    • Mapping Features: Coming Soon!

    Learn more about
    Today, 3:30PM, Architecture Track
    Aaron Kimball and Garrett Wu

www.wibidata.com, @wibidata
Data Analysis and Visualization                                     INSTANT INTELLIGENCE




  Demand for Online
    App Analytics
• Real-time, interactive &
  visual analytics
• Auto-discover data trends
• User behavior analytics with
  data clustering
• Investigative and root cause
  analytics
• Simplify data modeling &
  custom functions for Hadoop
  data

Empower business users and data scientists with out-of-the-box analytics


 www.cetas.net, @CetasAnalytics
Powerful Statistical Tools

• Why Hadoop and R?
  •   Need to do more than simple statistics
  •   Analyze all of the data

• Integration
  •   Make it easy to write MapReduce programs in R
  •   Keep the statisticians focused on the analysis

• Usage
  •   Fraud and Risk Analysis
  •   Portfolio Optimization
  •   Anything you can model in R!




 www.revolutionanalytics.com, @RevolutionR
Complex Data Exploration

Automatic extraction of facts, connections, associations, etc.

    • Diagram labels: Unstructured Data; Who, Relationship, Association,
      Connections, Aliases, Alias, Where, When, Location, What, Time,
      What did..; Synthesys Knowledge Base
    • Entity: AIG
    • Connection discovered from AIG to Metlife Equity in Wikipedia:
      AIG sells Allco to Metlife Equity for $6.8B

Synthesys automatically surfaces critical facts in unstructured data


www.digitalreasoning.com, @dreasoning
Business Analytics
• Metrics Management and Reporting
• Strategic, Financial, and Operational Planning, Budgeting, and Forecasting
• Profitability Modeling



           USABLE


           UNIFIED


        ACTIONABLE

 Enterprise Performance Management
             for the Cloud


www.tidemark.net, @TidemarkEPM
An Exploding, Diverse Ecosystem




| BE FIRST



Big Data Fund
Hadoop World — November 2011
Big Data Fund
• $100MM dedicated to funding entrepreneurs globally in building disruptive
  Big Data companies
• Funding innovation across every layer of the “Big Data Stack”:
                 Infrastructure                                     Applications
                 •   Automation                                     •   Business Intelligence
                 •   Data Management                                •   Collaboration
                 •   Identity & Access                              •   Data Analysis/Visualization
                 •   Security                                       •   Mobile
                 •   Storage                                        •   Vertical Applications
                 •   …                                              •   …


• Partnering with thought leaders to foster community and drive innovation:




  Doug Cutting       Gil Elbaz      Jeff Hammerbacher   Jeff Heer          Hilary Mason        Jay Parikh   Kenny Van Zant
      Hadoop          Factual            Cloudera       Stanford               Bit.ly          Facebook       Solarwinds


Accel Partners                                                                                                             28
Who We Are

 Three decades of technology investing with over $6B of capital in US, Europe,
 China and India
           • Partner with category-defining entrepreneurs
           • Invest at every stage of technology lifecycle – seed, venture and growth capital
           • Focus deeply on technology innovations in software, infrastructure and internet

     Big Data consistently drives innovation across our portfolio companies today


                   Data Generators                                 Data Solutions




Time is Now!

[Chart: The Big Data Wave. Data Growth, 1980 to 2010, Traditional Data vs. Big Data]

    • Data is exploding
    • “New” data types are breaking legacy data platforms
    • Big Data platforms such as Hadoop are becoming mainstream
    • “Native” Big Data applications and services will quickly emerge



     Big Data continues to revolutionize data centers across all industries, opening
                  up a massive market for entrepreneurial activity.
Funding the Big Data Ecosystem
Big Data will drive the next generation of multi-billion dollar software companies

Applications
    • 2010 and beyond: Analytics, Security, Business Intelligence, Collaboration,
      Mobile, CRM, Vertical Apps (Fin Tech, Healthcare)

Data
    • 1980 - 2010: Traditional Data Platforms (Relational Database Management Systems)
    • 2010 and beyond: Big Data Platforms

Infrastructure
    • 1980 - 2010: Traditional Infrastructure Platforms (Mainframe, Client-Server, Web)
    • 2010 and beyond: Private & Public Cloud Platform and Services


Big Data Fund Contact Info
Contact Us
    accel.com/bigdata
    bigdatafund@accel.com
    @bigdatafund

Big Data Conference - Spring 2012
    Want to attend or speak?
    BigData2012@accel.com

Stay on top of the latest big data news from Accel Partners by finding us on
    facebook.com/Accel
    @Accel_Partners
The Next-Generation Data Center

    • Web Servers
    • Systems Logs
    • Real-time Feeds
    • Trading Systems
    • Sensors
    • Enterprise Data Warehouse
    • Sales Systems People
    • Document Repository
    • ERP System
    • CRM




The Future

Tackling Critical Business Issues

    • Financial Services: Better and deeper understanding of risk to avoid
      credit crisis.
    • Life Sciences: Better targeted medicines with fewer complications and
      side effects.
    • Telecommunications: More reliable networks where we can predict and
      prevent failure.
    • Retail: A personal experience with products and offers that are just
      what you need.
    • Media: More content that is lined up with your personal preferences.
    • Government: Government services that are based on hard data, not just gut.




Thank You

Hadoop World 2011: Mike Olson Keynote Presentation

  • 1.
  • 2.
    Conference Highlights • Four exciting keynotes • Lots networking opportunities • Sixty educational sessions 2 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 3.
    Thank You Sponsors PLATINUM SPONSORS GOLD SPONSORS SILVER SPONSORS BRONZE SPONSORS 3 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 4.
    Housekeeping Items • Connecting to the internet – Wireless network = Sheraton Meeting – Code = Vertica • Hashtag = #hw2011 • Take the surveys – Breakout sessions – Overall survey 4 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 5.
  • 6.
    Three Years Ago… We said: Hadoop is going to be huge. This year’s conference: • 1,400 people from 580 companies in 27 countries and 40 of the United States • 75.7% attending Hadoop World for the first time • 71.9% using Hadoop • 66.5% engineers, developers and architects, 33.5% non-technical business roles • Just over 50 of you are “data scientists” 6 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 7.
    Three Years Ago… We said: Hadoop is going to be huge. Your Hadoop usage: • Less than one year: 36.8% • One to two years: 32.3% • Two to three years: 16.8% • More than three years: 12% • Average usage is 17.4 months this year, versus 8.76 months at last year’s Hadoop World 7 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 8.
    Three Years Ago… We said: Hadoop is going to be huge. Your clusters: • Average size is 120 nodes, up from 66 last year • 44% between 10 and 100 nodes, 52% between 100 and 1,000 nodes • Total of 202 petabytes under management (60 last year) • Largest cluster bigger than 20PB • 13.1% bigger than 100TB • 12.8% bigger than 1PB 2010 2011 8 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 9.
    Two Years Ago… Wesaid: Hadoop is at the center of a new platform for big data. • Hadoop • HBase • Pig • Zookeeper • Mahout • Hive • Avro • Whirr • Sqoop • Hcatalog • MRUnit • Bigtop • Oozie 9 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 10.
    Two Years Ago… We said: Hadoop is at the center of a new platform for big data. 100% 100% Core 58% Hadoop 37% 37% 31% as % of New Contribs 2006 2007 2008 2009 2010 2011 • Core Hadoop • Core Hadoop • Core Hadoop • Core Hadoop • Core Hadoop • Core Hadoop • HBase • HBase • HBase • HBase • Zookeeper • Pig • Pig • Pig • Mahout • Zookeeper • Zookeeper • Zookeeper Relevant • Mahout • Mahout • Mahout Projects • Hive • Hive • Hive • Avro • Avro • Whirr • Whirr • Sqoop • Sqoop • Bigtop • … 10 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 11.
    Last Year… We said : Hadoop must integrate with data center infrastructure and tools. • Enterprises need software and support that de-risk and simplify the operation of Hadoop in production • Must build on the open source platform to deliver all the innovation Hadoop and value created by the global Apache Operations Hadoop ecosystem 11 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 12.
    12 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 13.
    13 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 14.
    14 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 15.
    15 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 16.
    16 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 17.
    17 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 18.
    18 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 19.
    Last Year… We said : Hadoop must integrate with data center infrastructure and tools. OPERATORS ENGINEERS ANALYSTS BUSINESS USERS Management Enterprise IDE’s BI / Analytics Tools Reporting CUSTOMERS Enterprise Data Warehouse Web Application Relational Logs Files Web Data Databases 19 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 20.
    This Year… We’re talkingabout the future. 20 ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 21.
    Building Applications Develop personalized applications on Hadoop and HBase Get it at: http://fonedoktor.com Learn more about Today, 3:30PM, Architecture Track Battery Analysis Mapping Features Aaron Kimball and Available Today… Coming Soon! Garrett Wu www.wibidata.com, @wibidata
  • 22.
    Data Analysis and Visualization INSTANT INTELLIGENCE Demand for Online App Analytics • Real-time, interactive & visual analytics • Auto-discover data trends • User behavior analytics with data clustering • Investigative and root cause analytics • Simplify data modeling & custom functions for Hadoop data Empower business users and data scientists with out-of-the-box analytics www.cetas.net, @CetasAnalytics
  • 23.
    Powerful Statistical Tools • Why Hadoop and R? • Need to do more than simple statistics • Analyze all of the data • Integration • Make it easy to write MapReduce programs in R • Keep the statisticians focused on the analysis • Usage • Fraud and Risk Analysis • Portfolio Optimization • Anything you can model in R! www.revolutionanalytics.com, @RevolutionR
  • 24.
    Complex Data Exploration Automatic extraction of facts, connections, associations, etc. [Diagram labels: Who (Entity : Alias, Aliases), Where (Location), When (Time), What, What did.., Relationship, Association, Connections; Synthesys Knowledge Base] Example from unstructured data: connection discovered from AIG to Metlife Equity in Wikipedia: “AIG sells Allco to Metlife Equity for $6.8B”. Synthesys automatically surfaces critical facts in unstructured data www.digitalreasoning.com, @dreasoning
  • 25.
    Business Analytics • Metrics Management and Reporting • Strategic, Financial, and Operational Planning, Budgeting, and Forecasting • Profitability Modeling USABLE UNIFIED ACTIONABLE Enterprise Performance Management for the Cloud www.tidemark.net, @TidemarkEPM
  • 26.
    An Exploding, Diverse Ecosystem
  • 27.
    BE FIRST | Big Data Fund | Hadoop World, November 2011
  • 28.
    Big Data Fund • $100MM dedicated to fund entrepreneurs globally in building disruptive Big Data companies • Funding innovation across every layer of the “Big Data Stack” (Infrastructure, Applications): Business Intelligence • Automation • Collaboration • Data Management • Data Analysis/Visualization • Identity & Access • Mobile • Security • Vertical Applications • Storage • … • … • Partnering with thought leaders to foster community and drive innovation: Doug Cutting (Hadoop), Gil Elbaz (Factual), Jeff Hammerbacher (Cloudera), Jeff Heer (Stanford), Hilary Mason (Bit.ly), Jay Parikh (Facebook), Kenny Van Zant (Solarwinds) Accel Partners
  • 29.
    Who We Are Three decades of technology investing with over $6B of capital in US, Europe, China and India • Partner with category-defining entrepreneurs • Invest at every stage of technology lifecycle: seed, venture and growth capital • Focus deeply on technology innovations in software, infrastructure and internet Big Data consistently drives innovation across our portfolio companies today (Data Generators, Data Solutions)
  • 30.
    Time is Now! The Big Data Wave • Data is exploding • “New” data types are breaking legacy data platforms • Big Data platforms such as Hadoop are becoming mainstream • “Native” Big Data applications and services will quickly emerge [Chart: Data Growth, 1980 to 2010, Traditional Data vs. Big Data] Big Data continues to revolutionize data centers across all industries, opening up a massive market for entrepreneurial activity.
  • 31.
    Funding the Big Data Ecosystem Big Data will drive the next generation of multi-billion dollar software companies. 1980 - 2010 vs. 2010 and beyond • Applications: Analytics, Security, Business Intelligence, Collaboration, Mobile, CRM, Vertical Apps (Fin Tech, Healthcare) • Data: Traditional Data Platforms (Relational Database Management Systems) → Big Data Platforms • Infrastructure: Traditional Infrastructure Platforms (Mainframe, Client-Server, Web) → Private & Public Cloud Platform and Services
  • 32.
    Big Data Fund Contact Info • Contact Us: accel.com/bigdata, bigdatafund@accel.com, @bigdatafund • Big Data Conference - Spring 2012: Want to attend or speak? BigData2012@accel.com • Stay on top of the latest big data news from Accel Partners by finding us on facebook.com/Accel, @Accel_Partners
  • 33.
    The Next-Generation Data Center [Diagram labels: Systems Logs, Web Servers, Real-time Feeds, Trading Systems, Sensors, Enterprise Data Warehouse, Sales Systems, People, Document Repository, ERP System, CRM]
  • 34.
    The Future Tackling Critical Business Issues • Financial Services: Better and deeper understanding of risk to avoid credit crisis. • Life Sciences: Better targeted medicines with fewer complications and side effects. • Retail: A personal experience with products and offers that are just what you need. • Telecommunications: More reliable networks where we can predict and prevent failure. • Media: More content that is lined up with your personal preferences. • Government: Government services that are based on hard data, not just gut.
  • 35.
    Thank You

Editor's Notes

  • #7 We ran a survey. 1,400 people, 580 companies, 27 countries and 40 of the United States. More than 3/4 are first-timers at Hadoop World. Welcome! Nearly 3/4 are using Hadoop today. 2/3 technical, 1/3 business. And the new profession of data science is here in force!
  • #8 One third each: less than one year, 1-2 years, more than two years. The average user here is more experienced than the average user at Hadoop World 2010 – 9 months.
  • #9 Average cluster size has doubled in a year. More than half of you have pretty big clusters: more than 100 nodes. 202 PB represented on our survey. One company was 10% of that. More of you (12%) are above a petabyte than I would have guessed. But important: about 3/4 of you have less than 100TB in Hadoop.
  • #10 Hadoop needed more: Load and share data. Query tools and ways to schedule and manage jobs. Fast record storage and retrieval. All of that is available from the Apache ecosystem.
  • #11 In 2006 and 2007, all the work was on core Hadoop. In 2008, the ecosystem began to diversify. Today, nearly 70% of all new contribs are to surrounding projects; only 31% to Hadoop itself. What you would expect as the platform has matured.
  • #12 Hadoop in production is just one part of your data center. You need to monitor and manage it like other critical platforms.
  • #13 What’s happening right now? Who’s doing what?
  • #14 How are the services I depend on doing?
  • #15 I need a high-level service view. Take storage. How is it performing? Latency? Throughput? What’s happening?
  • #16 Who’s consuming storage? Am I close to capacity? How do I make sure users get what they need? How do I track their use?
  • #17 Infrastructure is long-lived. I need to add, remove, retire hardware. I can’t shut down the system.
  • #18 Move between high-level view and detail. HDFS is a service, but it runs on lots of servers. I need to see both.
  • #19 That’s just storage. Lots of other services: query tools, analytics and more. Complex, multi-tenant, mission-critical infrastructure. Integrate with data center operations.
  • #20 Hadoop is not an island. It is part of your enterprise IT platform. We were right.
  • #21 Pick your graph: Big data is a big deal. The platform is here today. The next 12 months will be about use cases. About tooling and apps. Let me show you some cool ones. These companies are all here today.
  • #22 WibiData is Odiago’s core product: a platform for developing personalized applications with Hadoop and HBase. WibiData provides both programmatic APIs for application development and an ODBC interface for easy integration with existing BI / reporting / analysis technology, plus libraries that make personalization quick and easy. FoneDoktor is one such application, powered by WibiData. FoneDoktor is free for consumers: learn from your data; share with the community to get more value from your data; available at fonedoktor.com. FoneDoktor is available to partners (carriers and OEMs): lower device return rates; lower support volume; measure device / network performance. WibiData + FoneDoktor deep dive in Aaron and Garrett’s talk. Check it out!
  • #23 Need self-service tools for behavioral analytics. Interactive, visual tools for business users to explore data themselves. Cetas provides real-time, interactive analytics. Automatically discover and highlight clusters and trends in data. Mask complexity, deliver big data analysis to business users.
  • #24 R is a statistical language for developing advanced analytics. With Hadoop, R can explore all the data: no sampling, no subsetting. R language runs under MapReduce. Statistician focuses on analysis, not Hadoop. Fraud and risk analysis. Portfolio optimization. Anything you can model in R.
  • #25 Validated by customers in the US Army and intelligence space. Operates on key enterprise information (financial intelligence, risk, and patents). Combines enterprise data with public sources. Structured, semi-structured and complex. Discovers and shows connections, relationships among entities.
  • #26 Enterprise Performance Management. Key metrics, trends, analyses: plan, budget, forecast. Hadoop for trending, diverse data sources, external and internal. With drill-down. Aimed at busy execs who need clear insight and overview. iPad, iPhone applications.
  • #27 It’s getting crowded in here! Companies contributing to Hadoop, integrating with it or building on top. Sign of a big, robust market. But these aren’t the only people who have spotted the opportunity in big data. I’d like to bring up Ping Li from Accel Partners with an exciting announcement.
  • #34 Hadoop as the hub. Catch, process, summarize the firehose. Integrate with new and existing platforms for special-purpose workloads. Already happening.
  • #35 Three years talking speeds and feeds. The story for the future is value: business problems and solutions built on big data.