Hadoop is an open-source framework for the distributed storage and processing of large datasets across clusters of commodity hardware. It splits both the data and the computation across many nodes and processes the pieces in parallel, which lets it handle large volumes of structured, semi-structured, and unstructured data from a variety of sources. It is designed to provide fault tolerance, flexibility, and scalability.
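The idea of splitting a dataset across nodes and processing the pieces in parallel can be illustrated with a small, single-machine sketch. This is not Hadoop's API: the function names (`split_into_blocks`, `count_words`, `merge_counts`, `distributed_word_count`) are hypothetical, the sample records are made up for illustration, and worker threads merely stand in for cluster nodes. It only shows the general pattern of partitioning data, working on each partition independently, and combining the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, num_blocks):
    """Partition records into blocks, one per hypothetical node."""
    return [records[i::num_blocks] for i in range(num_blocks)]


def count_words(block):
    """Count words in a single block, independently of the other blocks."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def merge_counts(partials):
    """Combine the per-block counts into one overall result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def distributed_word_count(records, num_nodes=3):
    """Run the per-block work in parallel; threads stand in for nodes (assumption)."""
    blocks = split_into_blocks(records, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(count_words, blocks))
    return merge_counts(partials)


# Illustrative input only; a real cluster would read data from distributed storage.
records = [
    "Hadoop stores data",
    "Hadoop processes data in parallel",
    "data lives on many nodes",
]
result = distributed_word_count(records)
```

Because each block is counted without reference to the others, adding more workers or more blocks does not change the final result, which is the property that lets this style of computation scale out across a cluster.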