Hadoop is an open-source framework for distributed storage and processing of large datasets across clusters of computers. It was developed to support distributed processing of large datasets. The document provides an overview of Hadoop architecture including HDFS, MapReduce and key components like NameNode, DataNode, JobTracker and TaskTracker. It also discusses Hadoop history, features, use cases and configuration.