In this lecture, we explore how to construct resilient distributed systems on top of unreliable components. Starting, almost two decades ago, with Leslie Lamport’s work on organising parliament for a Greek island. We will take a journey to today’s datacenter and the systems powering companies like Google, Amazon and Microsoft. Along the way, we will face interesting impossibility results, machines acting maliciously and the complexity to today’s networks. We will discover how to reach agreement between many parties and from this, how we construct the fault-tolerance systems that we depend upon everyday.
This lecture was given on October 13th 2015 at the University of Cambridge, as part of the Research Students Lecture Series.