Hadoop is everywhere these days, but it can seem like a complex, intimidating ecosystem to those who have yet to jump in.
In this hands-on workshop, Alex Dean, co-founder of Snowplow Analytics, will take you "from zero to Hadoop", showing you how to run a variety of simple (but powerful) Hadoop jobs on Elastic MapReduce, Amazon's hosted Hadoop service. Alex will start with a no-nonsense overview of what Hadoop is, explaining its strengths and weaknesses and why it's such a powerful platform for data warehouse practitioners. Then Alex will help get you setup with EMR and Amazon S3, before leading you through a very simple job in Pig, a simple language for writing Hadoop jobs. After this we will move onto writing a more advanced job in Scalding, Twitter's Scala API for writing Hadoop jobs. For our final job, we will consolidate everything we have learnt by building a more sophisticated job in Scalding.