We now have all the tools we need to spin up and tear down clusters with hundreds of nodes in minutes, which puts more pressure on the tools we use to configure and monitor our applications. The challenge is even more interesting when we have to deal with long-running distributed data storage and processing systems like Hadoop. In this talk we will look at some of the challenges of creating and managing Hadoop clusters in AWS, discuss improvement opportunities in monitoring (e.g. detecting and dealing with instance failure, resource contention and noisy neighbors), and say a bit about the future and how we should go about decoupling workload dispatch from cluster lifecycle.