This document provides an overview of a data processing project built on the Hadoop ecosystem, using HDFS, Java, MapReduce, HBase, and MongoDB. The project loads large input files into HDFS, processes the data with MapReduce jobs (including the sort-and-shuffle phase between map and reduce), and stores the output in a MongoDB database. The development environment consists of Linux, the Eclipse IDE, a Cloudera VM running the Hadoop daemons, and the MongoDB tools. Screenshots of the input files, the MongoDB database, the Cloudera dashboard, and the output files are included.
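To make the map → shuffle → reduce flow described above concrete, here is a minimal stand-alone Java sketch that simulates a word-count job in plain code. It is not the project's actual source (which runs on the Hadoop API against HDFS and writes to MongoDB); the class and method names are hypothetical, chosen only to mirror the three MapReduce phases.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical stand-alone simulation of the MapReduce word-count flow:
// map -> shuffle (group and sort by key) -> reduce. A real Hadoop job
// would implement Mapper/Reducer classes and read its input from HDFS.
public class MapReduceSketch {

    // Map phase: emit a (word, 1) pair for every word in the input lines.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1))
                .collect(Collectors.toList());
    }

    // Shuffle phase: group the emitted values by key, sorted by key,
    // which is what Hadoop does between the map and reduce phases.
    static SortedMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the counts for each word; in the project this is
    // where the result would be written out (ultimately to MongoDB).
    static Map<String, Integer> reduce(SortedMap<String, List<Integer>> grouped) {
        Map<String, Integer> out = new LinkedHashMap<>();
        grouped.forEach((word, counts) ->
                out.put(word, counts.stream().mapToInt(Integer::intValue).sum()));
        return out;
    }

    public static void main(String[] args) {
        List<String> input = List.of("hadoop stores big data", "hadoop processes big data");
        Map<String, Integer> counts = reduce(shuffle(map(input)));
        System.out.println(counts); // prints {big=2, data=2, hadoop=2, processes=1, stores=1}
    }
}
```

The sketch keeps the three phases as separate functions so each stage of the pipeline can be inspected on its own; in a real Hadoop job the framework, not user code, performs the shuffle and sort between the mapper and reducer.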