Hadoop is a distributed processing framework for large datasets. It utilizes HDFS for storage and MapReduce as its programming paradigm. The Hadoop ecosystem has expanded to include many other tools. YARN was developed to address limitations in the original Hadoop architecture. It provides a framework for multiple data processing engines and improves cluster utilization, scalability, and agility. YARN introduces a generalized resource management model and separates generic services from application logic. This allows different applications like batch, interactive, and streaming to run concurrently on the same Hadoop cluster.