Hadoop is an open-source software framework for distributed storage and processing of large datasets across clusters of commodity hardware. It allows for the reliable, scalable, and distributed processing of large data sets across clusters of commodity servers. Hadoop supports the processing of structured and unstructured data in parallel and handles hardware failures without losing data. Major companies like Google, Yahoo, Facebook and others use Hadoop for applications such as log analysis, data mining, and scientific applications.