Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.
Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.
Published on
A large fraction of big data projects fail to deliver return of investment, or take years before they do so. The reasons are typically a combination of project management, leadership, organisation, available competence, and technical failures. In this presentation, I will focus on the technical aspects, and present the most common or costly data engineering mistakes that I have experienced when building scalable data processing technology over the last five years, as well as advice for how to avoid them. The presentation includes war stories from large scale production environments, some that lead to reprocessing of petabytes of data, or DDoSing critical services with a Hadoop cluster, and what we learnt from the incidents.
Login to see the comments