Hadoop is an open-source software framework for distributed storage and processing of large datasets across clusters of commodity computers. The core of Hadoop includes HDFS for distributed storage and MapReduce for distributed processing. Related projects in the Hadoop ecosystem include Pig for expressing data flows and ZooKeeper for distributed coordination; YARN, part of Hadoop itself since version 2, handles cluster resource management and job scheduling. Key Hadoop daemons include the NameNode, Secondary NameNode, and DataNodes on the HDFS side, and the JobTracker and TaskTrackers on the MapReduce v1 side (in YARN-based clusters, the JobTracker and TaskTrackers are replaced by the ResourceManager and NodeManagers).
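
To make the MapReduce model concrete, here is a minimal word-count job written against the org.apache.hadoop.mapreduce API, a sketch rather than anything referenced above: the class name and the input/output paths passed on the command line are illustrative. The mapper emits (word, 1) pairs and the reducer sums the counts for each word.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for each token in a line of input.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: configures and submits the job; args[0] is the HDFS input path,
  // args[1] is the (not yet existing) HDFS output path.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Packaged into a jar, such a job would typically be submitted with a command along the lines of "hadoop jar wordcount.jar WordCount /input /output", where the input and output paths live in HDFS.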