Hadoop is an open-source software framework for the distributed processing of large data sets across clusters of computers. It was created by Doug Cutting and Mike Cafarella around 2005, growing out of the Nutch web search project, and much of its early development took place at Yahoo!, which Cutting joined in 2006. Cutting named it after his son's toy elephant. Hadoop has two core components: the Hadoop Distributed File System (HDFS), which stores data reliably by replicating blocks across the machines in a cluster, and MapReduce, a programming model that processes large data sets in parallel by splitting the work into a map step and a reduce step that run across the cluster.
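To make the MapReduce model concrete, here is a minimal single-process sketch of a word count in Python. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. In a real cluster the framework would run the map and reduce steps on many machines and perform the grouping step itself.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key (done by the framework on a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over an in-memory list."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["the elephant stores data", "the cluster processes data"]
    print(word_count(docs))
```

Because each call to `map_phase` depends only on its own input record, and each call to `reduce_phase` depends only on the values for one key, both steps can in principle be spread across many machines, which is the property the MapReduce model relies on.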