Apache Hadoop is a framework for distributed computation and storage of very large data sets on computer clusters. Hadoop began as a project to implement Google’s MapReduce programming model and has become synonymous with a rich ecosystem of related technologies, not limited to Apache Pig, Apache Hive, Apache Spark, Apache HBase, and others