Anyone who has used Hadoop knows that jobs sometimes get stuck. Hadoop is powerful, and it is experiencing a tremendous rate of innovation, but it also has many rough edges. As Hadoop practitioners, we all spend a lot of effort dealing with these rough edges to keep Hadoop and Hadoop jobs running well for our customers and organizations.
In this session, we will look at typical problems Hadoop users encounter when running Hive, and discuss their implications for the future of Hadoop development. We will also walk through the solution to one such problem step by step, using the specific code and logs we relied on to identify the issue.
As a community, we need to work together to improve this experience for our industry. Now that Hadoop 2 has shipped, we believe the Hadoop community will be able to focus its energy on rounding off rough edges like these. This session should give advanced users tools and strategies for identifying problems with their jobs and keeping those jobs running smoothly.