The document provides an overview of Hadoop and its core components. It discusses:
- Hadoop is an open-source framework for distributed storage and processing of large datasets across clusters of computers.
- The two core components of Hadoop are HDFS (the Hadoop Distributed File System) for distributed storage and MapReduce for distributed processing. HDFS stores data reliably by replicating blocks across machines, while MapReduce processes large datasets in parallel (see the word-count sketch after this list).
- Hadoop can operate in three modes: standalone, pseudo-distributed, and fully distributed. The document focuses on setting up Hadoop in standalone mode, where everything runs in a single JVM against the local filesystem, making it suitable for development and testing on a single machine (see the configuration sketch below).
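
To make the MapReduce bullet concrete, here is a minimal sketch of the classic word-count job in Java, the language Hadoop itself is written in. It assumes the Hadoop MapReduce client libraries are on the classpath; the class names (`WordCount`, `TokenizerMapper`, `IntSumReducer`) are illustrative, not taken from the document.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: each input line is split into words, and each word is
    // emitted with a count of 1. Mappers run in parallel over
    // independent splits of the input.
    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(value.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: the framework groups all counts for the same word,
    // and the reducer sums them into the final total.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not exist yet
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Packaged into a jar, the same code runs unchanged in any of the three modes; in standalone mode, `hadoop jar` executes it in a single local JVM against the local filesystem.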
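
Since the document covers standalone-mode setup specifically, the following sketch shows the two configuration properties that distinguish that mode. The values shown are Hadoop's out-of-the-box defaults (`fs.defaultFS` = `file:///`, `mapreduce.framework.name` = `local`); the class name is hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Hypothetical sanity check that the client is in standalone (local) mode.
public class LocalModeCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // In standalone mode these are the defaults: no HDFS daemons,
        // and jobs run inside a single local JVM rather than on a cluster.
        conf.set("fs.defaultFS", "file:///");
        conf.set("mapreduce.framework.name", "local");
        FileSystem fs = FileSystem.get(conf);
        System.out.println("Working filesystem: " + fs.getUri()); // expect file:///
        System.out.println("Home directory:     " + fs.getHomeDirectory());
    }
}
```

If either property had been overridden elsewhere (for example to an `hdfs://` URI and `yarn`), the printed URI would reveal it, making this a quick check before running jobs locally.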