Apache Hadoop 0.23
 

The Apache Hadoop community is gearing up for the upcoming release of Apache Hadoop 0.23 - the first major release since 0.20 in 2009. This release has major enhancements to Hadoop such as HDFS Federation for hyper-scale and a Next Generation MapReduce framework. Arun, the Apache Hadoop Release Master for 0.23, will cover the highlights of the release and talk about efforts undertaken to test, stabilize and release Hadoop.next. The talk covers some of the timelines for the release, and our plans for compatibility and upgrade paths for existing users of Hadoop.

Presented at Bay Area Hadoop User Group at Yahoo on 8/25/2011.

Apache Hadoop 0.23 Presentation Transcript

  • Apache Hadoop 0.23
    Arun C. Murthy
    Hortonworks Founder and Architect
    @acmurthy
    (@hortonworks)
    © Hortonworks Inc. 2011
    August 25, 2011
  • Hello! I’m Arun…
    Architect & Lead, Apache Hadoop MapReduce Development Team at Hortonworks (formerly at Yahoo!)
    Apache Hadoop Committer and Member of PMC
    Full-time contributor to Apache Hadoop since early 2006
    Apache Hadoop Release Manager for hadoop-0.23
  • hadoop-0.23
    On track to be first stable, and widely deployed, release since hadoop-0.20 in 2009
    All stable releases of Hadoop today are based on hadoop-0.20
    Multiple folks and entities collaborating: Hortonworks, Yahoo, Cloudera, eBay, etc.
    hadoop-0.23 branch in Apache hours away!
  • Highlights
    HDFS Federation
    http://www.hortonworks.com/an-introduction-to-hdfs-federation/
    Next Generation Hadoop MapReduce
    http://www.slideshare.net/hortonworks/nextgen-apache-hadoop-mapreduce
    Coming soon – HDFS High Availability
    https://issues.apache.org/jira/browse/HDFS-1623
    WIP: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-1623/
  • More…
    Build - Full Mavenization
    EditLogs re-write
    https://issues.apache.org/jira/browse/HDFS-1073
    HDFS write pipeline improvements for HBase
    Append/flush etc.
    Re-implementation of MapReduce Shuffle
    30% performance gain
    Improved stability by using Netty rather than Jetty
    Small jobs optimizations

  • Deployment goals
    Clusters of 6,000 machines
    Each machine with 16 cores, 48G/96G RAM, 24TB/36TB disks
    100,000+ concurrent tasks
    10,000 concurrent jobs
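
To put the deployment goals above in aggregate terms, here is a minimal back-of-envelope sketch. The input figures are copied from the slide; the derived totals, the variable names, and the choice of decimal units are assumptions made for illustration and are not stated in the talk.

```python
# Inputs copied from the "Deployment goals" slide. Everything derived from
# them is illustrative arithmetic, not a target stated by the speaker.
MACHINES = 6_000
CORES_PER_MACHINE = 16
RAM_GB_OPTIONS = (48, 96)     # "48G/96G RAM"
DISK_TB_OPTIONS = (24, 36)    # "24TB/36TB disks"
CONCURRENT_TASKS = 100_000    # slide says "100,000+", so treat as a lower bound

total_cores = MACHINES * CORES_PER_MACHINE
tasks_per_machine = CONCURRENT_TASKS / MACHINES
tasks_per_core = CONCURRENT_TASKS / total_cores

# Assumption: decimal units (1 TB = 1,000 GB and 1 PB = 1,000 TB).
total_ram_tb = [MACHINES * gb / 1_000 for gb in RAM_GB_OPTIONS]
total_disk_pb = [MACHINES * tb / 1_000 for tb in DISK_TB_OPTIONS]

print(f"total cores:        {total_cores:,}")
print(f"tasks per machine:  {tasks_per_machine:.1f}")
print(f"tasks per core:     {tasks_per_core:.2f}")
print(f"aggregate RAM (TB): {total_ram_tb}")
print(f"raw disk (PB):      {total_disk_pb}")
```

Under these assumptions, the 100,000+ task goal works out to roughly one concurrent task per core across the cluster.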
  • Testing
    Currently tested at reasonable scale (~500 nodes), including GridMix v3
    Continue to improve on performance benchmarks
    GridMix v3
    Sort
    Shuffle
    HDFS
    Scan
    HDFS throughput

  • Timelines
    branch-0.23 – August 2011
    Alpha (hadoop-0.23.0) - ~October 2011
    Production – late Q1 2012
    YMMV! 
  • Thank You. @acmurthy