This document provides an overview of Hadoop, an open-source software framework for distributed storage and processing of large datasets across commodity hardware. It discusses Hadoop's history and goals, describes its core architectural components including HDFS, MapReduce and their roles, and gives examples of how Hadoop is used at large companies to handle big data.