The evolution of web and big data

  1. The Evolution of Web and Big Data, Edward J. Yoon
  2. Who Am I • Edward J. Yoon – @eddieyoon • Founder of Apache Hama • PMC member of Apache Bigtop • Oracle employee
  3. Early era of Web • Google – 2003: GFS – 2004: MapReduce – 2005: Sawzall – 2006: Bigtable • OSS – 2005: Hadoop (HDFS, MapReduce) – 2006: Pig – 2007: Hive, HBase
  4. Google? • World's best “full-text search engine” • In 2003: – 10,000+ servers – 4+ billion documents – 300+ million images
  5. Hadoop 1.0 • HDFS + MapReduce – And Pig, Hive, HBase, Mahout
  6. The New era of Web • Google – 2010: Pregel, Dremel – 2012: Spanner • OSS – 2010: Hama, Twitter Storm – 2011: YARN, Giraph – 2012: Drill
  7. MR vs. Alternatives
  8. YARN? • Job scheduling and cluster resource management
  9. Future of CDH4 and Hadoop • CDH4 will be based on 0.23.x or later • 0.23.0 doesn’t include Map/Reduce 1.0 – Storm, Giraph, Hama, Spark, MPI, GraphLab
