The evolution of web and big data

  1. The Evolution of Web and Big Data, by Edward J. Yoon
  2. Who Am I
     • Edward J. Yoon (@eddieyoon)
     • Founder of Apache Hama
     • PMC member of Apache Bigtop
     • Oracle employee
  3. Early era of Web
     • Google
       – 2003: GFS
       – 2004: MapReduce
       – 2005: Sawzall
       – 2006: BigTable
     • OSS
       – 2005: Hadoop (HDFS, MapReduce)
       – 2006: Pig
       – 2007: Hive, HBase
  4. Google?
     • World's best "full-text search engine"
     • In 2003:
       – 10,000+ servers
       – 4+ billion documents
       – 300+ million images
  5. Hadoop 1.0
     • HDFS + MapReduce
       – And Pig, Hive, HBase, Mahout
  6. The New era of Web
     • Google
       – 2010: Pregel, Dremel
       – 2012: Spanner
     • OSS
       – 2010: Hama
       – 2011: YARN, Giraph
       – 2012: Drill
     • Twitter
       – Storm
  7. MR vs. Alternatives
  8. YARN?
     • Job scheduling and cluster resource management
  9. Future of CDH4 and Hadoop
     • CDH4 will be based on 0.23.x or later
     • 0.23.0 doesn’t include Map/Reduce 1.0
       – Storm, Giraph, Hama, Spark, MPI, GraphLab

  • jazzwang (Nov. 1, 2012)