"Architecting for Failure in AWS" by Jos Boumans, VP of Operations, Krux Digital.
Presentation Overview: Krux is an infrastructure provider for many of the websites you use online today, like NYTimes.com, WSJ.com, Wikia and NBCU. For every request on those properties, Krux receives one or more requests as well. We grew from zero traffic to several billion requests per day in the span of 2 years, and we did so exclusively in AWS. As anyone using AWS will be able to tell you, there are good parts and there are bad ones. This is the story of all the pitfalls we encountered and how, through architecture, convention and common sense, we managed to build an infrastructure that is "Always Up" from the end user's perspective and incredibly economical to build, scale & operate.
Speaker Bio: Jos is the VP of Operations at Krux, where he supports a platform handling over 4 billion requests per day with a tiny Ops team. Every bit of the AWS stack is automated, monitored & graphed, with maximized resilience and minimized cost. In a previous life he ran the Ubuntu Server group at Canonical and the Database group at RIPE, which is responsible for all the authoritative IP address data in Europe, the Middle East & Asia. Jos is a regular speaker at conferences like OSCON, Devoxx, PuppetConf, etc., where he mostly speaks on dealing with AWS Operations from all angles.