YARN.pptx
1. YARN
Yet Another Resource Negotiator
Apache YARN is Hadoop’s cluster resource management system.
It was introduced in Hadoop 2 to improve the MapReduce
implementation.
It is general enough to support other distributed computing paradigms as well.
YARN provides APIs for requesting and working with cluster resources,
but these APIs are not typically used directly by user code.
Users write to higher-level APIs provided by distributed computing
frameworks, which themselves are built on YARN and hide the resource
management details from the user.
3. Anatomy of a YARN Application Run
YARN provides its core services via two types of long-running
daemon:
1. A resource manager (one per cluster), which manages the use of
resources across the cluster
2. Node managers, which run on all the nodes in the cluster to
launch and monitor containers
A container executes an application-specific process with a
constrained set of resources (memory, CPU, and so on).
Depending on how YARN is configured, a container may be a Unix
process or a Linux cgroup.
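The division of labour between the two daemons can be sketched with a toy model. This is not the YARN API: the class names, the first-fit placement rule, and the node names and sizes are all assumptions made for illustration, and real YARN schedulers are considerably more involved.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Container:
    """A process slot with a constrained set of resources."""
    container_id: int
    node: str
    memory_mb: int
    vcores: int


@dataclass
class NodeManager:
    """Toy per-node daemon: tracks free resources and launched containers."""
    hostname: str
    memory_mb: int
    vcores: int
    containers: List[Container] = field(default_factory=list)

    def can_fit(self, memory_mb: int, vcores: int) -> bool:
        return memory_mb <= self.memory_mb and vcores <= self.vcores

    def launch(self, container: Container) -> None:
        # Reserve the resources, then record the running container.
        self.memory_mb -= container.memory_mb
        self.vcores -= container.vcores
        self.containers.append(container)


class ResourceManager:
    """Toy per-cluster daemon: decides which node hosts each container."""

    def __init__(self, nodes: List[NodeManager]) -> None:
        self.nodes = nodes
        self._next_id = 1

    def allocate(self, memory_mb: int, vcores: int) -> Optional[Container]:
        # First-fit placement (an assumption; not how YARN schedulers work).
        for node in self.nodes:
            if node.can_fit(memory_mb, vcores):
                container = Container(self._next_id, node.hostname,
                                      memory_mb, vcores)
                self._next_id += 1
                node.launch(container)
                return container
        return None  # no node has enough free resources


# Hypothetical two-node cluster.
rm = ResourceManager([NodeManager("node1", 2048, 2),
                      NodeManager("node2", 4096, 4)])
first = rm.allocate(1024, 1)    # fits on node1
second = rm.allocate(2048, 1)   # node1 has only 1024 MB left, so node2
third = rm.allocate(8192, 1)    # larger than any node, so None
```

The point of the sketch is only the split of responsibilities: one cluster-wide component chooses where a container goes, and a per-node component accounts for what is running there.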
7. To run YARN, one needs to designate one machine as the resource
manager. The simplest way to do this is to set the property
yarn.resourcemanager.hostname to the hostname or IP address of
the machine running the resource manager.
Many of the resource manager’s server addresses are derived from
this property. For example, yarn.resourcemanager.address takes
the form of a host-port pair, and the host defaults to
yarn.resourcemanager.hostname.
In a MapReduce client configuration, this property is used to
connect to the resource manager over RPC.
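The way the server addresses fall back to the hostname property can be sketched as follows. This is a minimal illustration rather than Hadoop's own configuration code: the function name resolve_rm_addresses and the host rm.example.com are hypothetical, and the port numbers are the defaults listed in yarn-default.xml as far as I am aware, so check them against your Hadoop version.

```python
from typing import Dict

HOSTNAME_KEY = "yarn.resourcemanager.hostname"

# Default ports, assumed from yarn-default.xml.
DEFAULT_PORTS = {
    "yarn.resourcemanager.address": 8032,
    "yarn.resourcemanager.scheduler.address": 8030,
    "yarn.resourcemanager.resource-tracker.address": 8031,
    "yarn.resourcemanager.admin.address": 8033,
    "yarn.resourcemanager.webapp.address": 8088,
}


def resolve_rm_addresses(conf: Dict[str, str]) -> Dict[str, str]:
    """Return each address property, falling back to hostname:default-port."""
    host = conf.get(HOSTNAME_KEY, "0.0.0.0")
    return {
        key: conf.get(key, f"{host}:{port}")  # explicit setting wins
        for key, port in DEFAULT_PORTS.items()
    }


# Only the hostname is set; every address is derived from it.
derived = resolve_rm_addresses({HOSTNAME_KEY: "rm.example.com"})

# An explicitly configured address overrides the derived default.
overridden = resolve_rm_addresses({
    HOSTNAME_KEY: "rm.example.com",
    "yarn.resourcemanager.address": "10.0.0.5:9000",
})
```

In this sketch, derived["yarn.resourcemanager.address"] is the host-port pair a client would use for RPC, built from the single hostname setting.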