Hadoop is an open source software framework that allows for the distributed storage and processing of extremely large datasets across clusters of commodity hardware. It uses a scalable distributed file system called HDFS to store data reliably, and its MapReduce programming model enables parallel processing of huge datasets across large clusters of servers. The Hadoop ecosystem includes additional popular tools like Pig, Hive, HBase, and Zookeeper that provide SQL-like querying, real-time database access, and coordination services to make the Hadoop platform more full-featured and user-friendly.