This document provides an overview of cloud computing and distributed systems. It covers large-scale distributed systems, cloud computing paradigms and models, MapReduce, and Hadoop. MapReduce is introduced as a programming model for distributed computing problems in which the framework handles parallelization, load balancing, and fault tolerance. Hadoop is presented as an open-source implementation of MapReduce whose core components are HDFS for storage and the MapReduce framework for processing. Example use cases and the steps for running a word count job on Hadoop are also outlined.
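
To make the word count example concrete, the following is a minimal single-process sketch of the MapReduce model in Python. It is not Hadoop code and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration. In a real cluster the framework would run the map and reduce steps in parallel on many machines and perform the grouping step itself.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by key (done by the framework in practice)."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word: str, counts: list[int]) -> tuple[str, int]:
    """Reduce: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    # Illustrative input only; a Hadoop job would read its input from HDFS.
    print(word_count(["the quick brown fox", "the lazy dog", "the fox"]))
```

The programmer supplies only the map and reduce functions. Splitting the input, scheduling tasks, grouping by key, and recovering from failures are the parts the framework is described as handling.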