6. Introducing YARN
● YARN - Yet Another Resource Negotiator
● Framework that facilitates writing arbitrary distributed processing
frameworks and applications.
● YARN applications/frameworks:
e.g. MapReduce 2, Apache Spark, Apache Giraph, Apache Apex, etc.
Image Source: http://tm.durusau.net/?cat=1525
7. Hadoop beyond Batch
● YARN for better resource utilization
● Support for more kinds of applications than just MapReduce
Image Source: http://tm.durusau.net/?cat=1525
8. Comparing MapReduce with YARN
MapReduce                   YARN
Job Tracker              ≈  Resource Manager, Application Master
Task Tracker             ≈  Node Manager
Map Slot, Reduce Slot    ≈  Container
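To make the slot-versus-container row concrete, here is a minimal sketch of why a shared pool of containers can use a node better than fixed map and reduce slots. It is a toy model, not the YARN API: the class names `SlotNode` and `ContainerNode`, the slot counts, and the assumption that every task needs one equally sized container are all hypothetical. Real YARN containers are sized by resources such as memory and CPU.

```python
from dataclasses import dataclass


@dataclass
class SlotNode:
    """MapReduce 1 style node: capacity is pre-split into typed slots."""
    map_slots: int
    reduce_slots: int

    def runnable(self, map_tasks: int, reduce_tasks: int) -> int:
        # A map task can only use a map slot, a reduce task only a reduce slot.
        return min(map_tasks, self.map_slots) + min(reduce_tasks, self.reduce_slots)


@dataclass
class ContainerNode:
    """YARN style node: capacity is one pool handed out as generic containers."""
    containers: int

    def runnable(self, map_tasks: int, reduce_tasks: int) -> int:
        # Any free container can host any kind of task.
        return min(map_tasks + reduce_tasks, self.containers)


# Hypothetical node with room for 8 concurrent tasks in both models.
slot_node = SlotNode(map_slots=4, reduce_slots=4)
container_node = ContainerNode(containers=8)

# Map-heavy phase of a job: 8 map tasks are waiting, no reduce tasks yet.
slot_running = slot_node.runnable(map_tasks=8, reduce_tasks=0)
container_running = container_node.runnable(map_tasks=8, reduce_tasks=0)
print(slot_running, container_running)  # 4 8
```

In the slot model the four reduce slots sit idle during the map-heavy phase, while the container model can give all eight units of capacity to map tasks.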
Backward Compatibility Maintained!
● Existing MapReduce jobs run as is on the YARN framework.
● There are no Job Tracker or Task Tracker processes anymore.
11. Application Masters - One for each Application Type
● MapReduce Application: MapReduce Application Master (already provided by Hadoop as a backward compatibility option for MapReduce)
● Apex Application: Apex Application Master (StrAM), provided by Apache Apex
● Flink Application: Flink Application Master
● Giraph Application: Giraph Application Master
12. Key Takeaways
● YARN enables non-MapReduce applications to run in a distributed fashion.
● Each application first asks for a container for its Application Master.
○ The Application Master then talks to YARN to get the resources needed by the application.
○ Once YARN allocates the requested containers to the Application Master, the Application Master starts the application components in those containers.
● Hadoop is no longer just batch processing!
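The request flow in the takeaways can be sketched as a small simulation. This is a toy model under stated assumptions, not the Hadoop client API: `ToyResourceManager`, `ToyApplicationMaster`, `submit`, and the component names are hypothetical, and each container is treated as one interchangeable unit of capacity.

```python
class ToyResourceManager:
    """Hypothetical stand-in for YARN: tracks free containers in the cluster."""

    def __init__(self, total_containers):
        self.free = total_containers
        self.next_id = 0

    def allocate(self, count):
        # Grant as many containers as are free, up to the requested count.
        granted = min(count, self.free)
        self.free -= granted
        ids = list(range(self.next_id, self.next_id + granted))
        self.next_id += granted
        return ids


class ToyApplicationMaster:
    """Hypothetical per-application master that places components in containers."""

    def __init__(self, rm, components):
        self.rm = rm
        self.components = components
        self.placement = {}

    def run(self):
        # Step 2: ask the resource manager for one container per component.
        containers = self.rm.allocate(len(self.components))
        # Step 3: start each component in one of the allocated containers.
        for container_id, component in zip(containers, self.components):
            self.placement[container_id] = component
        return self.placement


def submit(rm, components):
    # Step 1: the application first needs a container for its Application Master.
    am_containers = rm.allocate(1)
    if not am_containers:
        raise RuntimeError("no container available for the Application Master")
    master = ToyApplicationMaster(rm, components)
    return am_containers[0], master.run()


rm = ToyResourceManager(total_containers=4)
am_id, placement = submit(rm, ["operator-1", "operator-2"])
print(am_id, placement, rm.free)  # 0 {1: 'operator-1', 2: 'operator-2'} 1
```

The Application Master occupies container 0, the two components land in containers 1 and 2, and one container stays free for other applications.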