TiE Big Data panel

Transcript

  • 1. TiE SV Big Data Panel Oct 13, 2011
  • 2. What did Google do? Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 3. What did Google do? Store files Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby
  • 4. What did Google do? Process data Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby
  • 5. What did Google do? Ingest data Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby
  • 6. What did Google do? Store records & tables Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby
  • 7. What did Google do? High level domain specific language Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby
  • 8. What did Google do? Chain together complex workloads Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby
  • 9. What did Google do? Schedule them Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby
  • 10. What did Google do? Columnar format + metadata Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby
  • 11. What did Google do? End user queries Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby
  • 12. What did Google do? Coordinate within system Dremel Evenflow Evenflow Dremel MySQL Sawzall Bigtable Gateway MapReduce / GFS Chubby
  • 13. The pattern repeated HiPal Databee Databee Hive Hive HBase Scribe Zookeeper
  • 14. The pattern repeated Oozie Oozie Hive Pig & Hive Data Highway HBase Zookeeper
  • 15. The pattern repeated Azkaban Azkaban Sqoop Pig Voldemort Kafka Zookeeper
  • 16. The pattern repeated Cloudera’s Distribution Including Apache Hadoop Hue Hue Oozie Oozie Hive Sqoop Hive / Pig HBase Flume Zookeeper
  • 17. Project summary
      Topic                       Project(s)
      File storage                HDFS
      Record storage              HBase, Hypertable, Accumulo
      Metadata storage            Hive, HCatalog
      Batch data processing       MapReduce
      Streaming data processing   S4, Storm
      Graph processing            Giraph, X-RIME
      Query language              Hive
      Dataflow language           Pig
      Database integration        Sqoop
      Event data collection       Flume, Scribe
      Test & assembly             Bigtop
      Distributed lock            ZooKeeper
      Web access                  Hue
      Workflow                    Oozie, Azkaban
      File format                 Avro, RCFile, Protocol Buffers, SequenceFile
  • 18. With BIG DATA anything is POSSIBLE
  • 19. Celebrate Next Saturday