Hadoop 0.23 introduces major architectural changes to both HDFS and the MapReduce framework. The fundamental changes are HDFS (Hadoop Distributed File System) Federation and YARN (Yet Another Resource Negotiator), which address the current scalability limitations of HDFS and the JobTracker, respectively. Despite these architectural changes, the impact on user applications and the programming model has been kept to a minimum, so that existing user applications written for Hadoop 0.20 will continue to function with minimal changes. In this talk we will discuss the architectural changes that Hadoop 0.23 introduces and compare them with Hadoop 0.20. Since this is the biggest release of Hadoop adopted at Yahoo! in 3 years (after Hadoop 0.20), we will also talk about the customer impact and potential deployment issues of Hadoop 0.23 and its ecosystem. The deployment of Hadoop 0.23 at Yahoo! is an ongoing process and is being conducted in a phased manner on our clusters.
Presenter: Viraj Bhat, Principal Engineer, Yahoo!