TiE SV Big Data Panel
               Oct 13, 2011
What did Google do?

[Diagram]
Left column: Dremel, Evenflow, MySQL Gateway
Center column: Evenflow, Sawzall, MapReduce / GFS, Chubby
Right column: Dremel, Bigtable

©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
What did Google do?

Store files

What did Google do?

Process data

What did Google do?

Ingest data

What did Google do?

Store records & tables

What did Google do?

High-level domain-specific language

What did Google do?

Chain together complex workloads

What did Google do?

Schedule them

What did Google do?

Columnar format + metadata

What did Google do?

End user queries

What did Google do?

Coordinate within system

The pattern repeated

[Diagram]
Left column: HiPal, Databee, Scribe
Center column: Databee, Hive, Zookeeper
Right column: Hive, HBase

The pattern repeated

[Diagram]
Left column: Oozie, Data Highway
Center column: Oozie, Pig & Hive, Zookeeper
Right column: Hive, HBase

The pattern repeated

[Diagram]
Left column: Azkaban, Sqoop, Kafka
Center column: Azkaban, Pig, Zookeeper
Right column: Voldemort

The pattern repeated

[Diagram: Cloudera’s Distribution Including Apache Hadoop]
Left column: Hue, Oozie, Sqoop, Flume
Center column: Oozie, Hive / Pig, Zookeeper
Right column: Hue, Hive, HBase

Project summary
Topic                      Project(s)
File storage               HDFS
Record storage             HBase, Hypertable, Accumulo
Metadata storage           Hive, HCatalog
Batch data processing      MapReduce
Streaming data processing  S4, Storm
Graph processing           Giraph, X-Rime
Query language             Hive
Dataflow language          Pig
Database integration       Sqoop
Event data collection      Flume, Scribe
Test & assembly            Bigtop
Distributed lock           Zookeeper
Web access                 Hue
Workflow                   Oozie, Azkaban
File format                Avro, RCFile, Protocol Buffers, SequenceFile
Anything is POSSIBLE with BIG DATA

Celebrate Next Saturday
