Hadoop Futures
What to watch

Tom White, Cloudera
Hadoop User Group UK, Bristol
10 August 2009
About me
▪   Apache Hadoop Committer, PMC Member, Apache Member
▪   Employed by Cloudera
▪   Author of “Hadoop: The Definitive Guide”
    ▪   http://hadoopbook.com
Goals
▪   Modular
    ▪   E.g. pluggable block placement algorithm
▪   Multiple languages
    ▪   E.g. not just Java for MapReduce
▪   Integration with other systems
    ▪   E.g. JMX monitoring hooks
The Project Split
▪   Core -> Common, HDFS, MapReduce
▪   New repositories
▪   New mailing lists
    ▪   {common,hdfs,mapreduce}-{user,dev,issues}@hadoop.apache.org
▪   New directory layouts
▪   New configuration
    ▪   hadoop-site.xml -> {core,hdfs,mapreduce}-site.xml
▪   More information at
    ▪   http://www.cloudera.com/blog/2009/07/17/the-project-split/
    ▪   general@hadoop.apache.org
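As a rough illustration of the configuration change on this slide, the sketch below splits the properties of a single hadoop-site.xml into three per-project files. The output file names are taken from the slide; check the names your release actually ships. The prefix rule (`dfs.` goes to HDFS, `mapred.` goes to MapReduce, everything else goes to core) is an assumption made for this example, and `split_site_config` and `target_file` are hypothetical helpers, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: routing by property-name prefix is a heuristic for this sketch.
# File names follow the slide, not any particular release.
PREFIX_TO_FILE = {
    "dfs.": "hdfs-site.xml",
    "mapred.": "mapreduce-site.xml",
}
DEFAULT_FILE = "core-site.xml"


def target_file(property_name):
    """Return the per-project file a property would land in (hypothetical rule)."""
    for prefix, filename in PREFIX_TO_FILE.items():
        if property_name.startswith(prefix):
            return filename
    return DEFAULT_FILE


def split_site_config(xml_text):
    """Split one <configuration> document into {filename: xml string}."""
    root = ET.fromstring(xml_text)
    grouped = {}
    for prop in root.findall("property"):
        name = prop.findtext("name", default="").strip()
        grouped.setdefault(target_file(name), []).append(prop)

    result = {}
    for filename, props in grouped.items():
        conf = ET.Element("configuration")
        conf.extend(props)
        result[filename] = ET.tostring(conf, encoding="unicode")
    return result
```

Calling `split_site_config` on the text of an existing hadoop-site.xml returns a dictionary keyed by output file name. Review the result by hand before using it, since a property may belong to a different project than its prefix suggests.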
Releases
▪   0.18.3 - 29 Jan 2009
    ▪   Official “stable” release
    ▪   Probably the most commonly used
    ▪   Basis for first Cloudera distribution
▪   0.19.2 - 23 July 2009
    ▪   0.19 series is not widely used
▪   0.20.0 - 22 April 2009
    ▪   Expect large adoption with 0.20.1 release in coming weeks
    ▪   Basis for second Cloudera distribution, first Yahoo! distribution
▪   0.21 series - feature freeze end of August 2009
Hadoop 1.0
▪   After 0.21 release
▪   Need to establish rules about version evolution
    ▪   Hadoop 1.0 Interface Classification - HADOOP-5073
    ▪   API, Data, wire protocol compatibility - HADOOP-5071
Interesting Projects/JIRAs
▪   Common
    ▪   Avro for Hadoop RPC - HADOOP-6170
    ▪   Service lifecycle - HDFS-326
    ▪   Distributed configuration - HADOOP-5670
    ▪   10 minute patch builds - HADOOP-5628, HDFS-458, MAPREDUCE-670
    ▪   Ivy/Maven integration - HADOOP-5107
    ▪   Eclipse plugin
Interesting Projects/JIRAs (continued)
▪   MapReduce
    ▪   Metadata in Serialization - HADOOP-6165
    ▪   Compute splits on the cluster - MAPREDUCE-207
    ▪   Context Objects - ongoing migration of libraries/examples
    ▪   Security - HADOOP-4487
    ▪   Schedulers
        ▪   Fair share scheduler - global scheduling, FIFO - MAPREDUCE-548, MAPREDUCE-706
        ▪   Capacity - high RAM jobs - HADOOP-5884
    ▪   Speed: new shuffle
        ▪   See http://sortbenchmark.org/Yahoo2009.pdf
Popular JIRAs
▪   http://community.cloudera.com/
Questions?
▪   tom@cloudera.com
▪   Cloudera’s Distribution for Hadoop
    ▪   http://www.cloudera.com/hadoop
(c) 2009 Cloudera, Inc. or its licensors. "Cloudera" is a registered trademark of Cloudera, Inc. All rights reserved. 1.0