YARN- way to
Developer at Hortonworks Inc.
Contributor to Apache YARN
Worked on resource localization (distributed cache) and security
Currently working on the resource manager
Resource localization (distributed cache)
How to write a custom application on YARN
How to contribute to open source
Limited to ~4,000 cluster nodes
Maximum ~40,000 concurrent tasks
Synchronization in the Job Tracker becomes a bottleneck
If the Job Tracker fails, then everything fails.
Users have to resubmit all the jobs.
Very poor cluster utilization due to fixed
map and reduce slots
No support to run and share cluster
resources with non-MapReduce applications
Lacks support for wire compatibility:
clients need to have the same version.
So what do we need?
nodes, 10K+ jobs
Better resource utilization
Support for multiple application frameworks
Support for aggregating logs
Easy to upgrade the cluster
Separate the logic of managing
cluster resources from the logic of managing applications
All applications, including MapReduce,
run in user land.
Isolation in a secure cluster
More fault tolerant.
One per application, submitted by the user
Think of it like a per-application Job Tracker
For MapReduce it manages all the map
and reduce tasks: progress, restarts, etc.
Container: the unit of allocation (a simple process)
Replaces fixed map and reduce slots
E.g. Container 1 = 2 GB, 4 CPUs
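The idea above — containers described by a resource vector instead of fixed map/reduce slots — can be sketched in a few lines. This is a toy illustration, not the YARN API; the class and field names are made up for clarity.

```python
from dataclasses import dataclass, field

# Illustrative sketch (not real YARN code): a container request is a
# resource vector, and a node tracks how much of its capacity is used.
@dataclass
class Resource:
    memory_mb: int
    vcores: int

@dataclass
class Node:
    total: Resource
    used: Resource = field(default_factory=lambda: Resource(0, 0))

    def can_fit(self, req: Resource) -> bool:
        return (self.used.memory_mb + req.memory_mb <= self.total.memory_mb
                and self.used.vcores + req.vcores <= self.total.vcores)

    def allocate(self, req: Resource) -> bool:
        if not self.can_fit(req):
            return False
        self.used.memory_mb += req.memory_mb
        self.used.vcores += req.vcores
        return True

node = Node(total=Resource(memory_mb=8192, vcores=8))
print(node.allocate(Resource(2048, 4)))  # "Container 1 = 2 GB, 4 CPUs" fits
print(node.allocate(Resource(8192, 1)))  # does not fit: memory exhausted
```

Because any mix of container sizes can fill a node, no capacity sits idle in an empty "reduce slot" while map work is waiting — the poor-utilization problem described earlier.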
Pluggable resource scheduler
Stores application state (no need to resubmit the
application if the RM restarts)
One per machine; think of it like a Task Tracker
Manages the container life cycle
Aggregates application logs
MapReduce task status
Resource requests and application status
How does a job get executed?
1. The client submits an application (e.g. MapReduce).
2. The RM asks an NM to start the Application Master (AM).
3. The node manager starts the application master inside a container (a process).
4. The Application Master first registers with the RM and then keeps requesting
new resources. On the same AMRM protocol it also reports the
application status to the RM.
5. When the RM allocates a new container to the AM, the AM goes to the
specified NM and requests it to launch the container (e.g. a map task).
6. The newly started container then follows the application logic and
keeps reporting its progress to the AM.
7. Once done, the AM informs the RM that the application succeeded.
8. The RM then informs the NM about the finished application and asks it to start
aggregating logs and cleaning up container-specific files.
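The submission flow above can be simulated end to end. The following is a toy model only — real RM/NM/AM interaction happens over RPC protocols, and every class and method name here is a stand-in invented for this sketch.

```python
# Toy simulation (not real YARN code) of the submission flow described above.

class ResourceManager:
    def __init__(self):
        self.apps = {}

    def submit_application(self, app_id, nm):
        # Steps 1-3: accept the app and ask an NM to start the AM in a container.
        self.apps[app_id] = "RUNNING"
        nm.launch_container(app_id, "ApplicationMaster")

    def allocate(self, app_id, n):
        # Steps 4-5: on a heartbeat the AM asks for n containers; in this toy
        # model the RM grants all of them immediately.
        return [f"{app_id}-container-{i}" for i in range(n)]

    def finish_application(self, app_id):
        # Step 7: the AM reports success.
        self.apps[app_id] = "FINISHED"

class NodeManager:
    def __init__(self):
        self.containers = []

    def launch_container(self, app_id, what):
        # Step 6: run the requested work (e.g. a map task) in a container.
        self.containers.append((app_id, what))

rm, nm = ResourceManager(), NodeManager()
rm.submit_application("app_1", nm)
for c in rm.allocate("app_1", 2):
    nm.launch_container("app_1", c)
rm.finish_application("app_1")
print(rm.apps["app_1"], len(nm.containers))  # the AM container plus 2 task containers
```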
(The default is the Capacity Scheduler)
Think of it as queues per organization
Capacity limits (the range of resources a queue can use)
Black/white listing of resources
Supports resource priorities
Security: queue-level ACLs
Find out more about the Capacity Scheduler
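The queue idea can be sketched numerically: each organization's queue is guaranteed a share of the cluster and may stretch up to a maximum share when other queues are idle. The queue names and percentages below are made-up examples, not defaults.

```python
# Hedged sketch of Capacity Scheduler queue limits: guaranteed capacity plus
# an elastic maximum. All numbers here are hypothetical.

CLUSTER_MEMORY_MB = 100_000

queues = {
    # queue name: (guaranteed %, maximum %)
    "engineering": (60, 80),
    "marketing":   (40, 60),
}

def limits(queue):
    guaranteed, maximum = queues[queue]
    return (CLUSTER_MEMORY_MB * guaranteed // 100,
            CLUSTER_MEMORY_MB * maximum // 100)

for q in queues:
    lo, hi = limits(q)
    print(f"{q}: guaranteed {lo} MB, may stretch to {hi} MB when capacity is idle")
```

The elastic maximum is what improves utilization: a busy queue can borrow idle capacity, but its guaranteed share protects it from starvation.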
Before the node manager launches a container, it
needs the executables (files) to run
Resources (files) to be downloaded should be
specified as part of the container launch context
PUBLIC :- accessible to all users
PRIVATE :- accessible to all containers of a single user
APPLICATION :- accessible only to containers of a single application
Resource Localization contd..
The public localizer downloads public
resources (owned by the NM).
The private localizer downloads private and
application resources (owned by the user).
Per-user quotas are not supported yet.
LRU cache with a configurable size.
Once a resource is localized, it loses any
connection with its remote location.
The public localizer supports parallel downloads,
whereas a private localizer supports only limited parallelism.
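The cache behavior described above can be sketched as a size-bounded LRU map from remote URL to localized file. This is an illustration of the idea only; the class, sizes, and paths are hypothetical, not the NM's actual data structures.

```python
from collections import OrderedDict

# Illustrative sketch of the NM's localized-resource cache: an LRU cache
# with a configurable size bound. Entries lose any tie to the remote copy
# once downloaded; eviction is purely local.

class LocalResourceCache:
    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.used_mb = 0
        self.entries = OrderedDict()  # remote URL -> size in MB

    def localize(self, url, size_mb):
        if url in self.entries:
            self.entries.move_to_end(url)  # cache hit: mark most recently used
            return "hit"
        while self.used_mb + size_mb > self.capacity_mb and self.entries:
            _, evicted_size = self.entries.popitem(last=False)  # evict LRU
            self.used_mb -= evicted_size
        self.entries[url] = size_mb
        self.used_mb += size_mb
        return "downloaded"

cache = LocalResourceCache(capacity_mb=100)
print(cache.localize("hdfs:///libs/a.jar", 60))  # downloaded
print(cache.localize("hdfs:///libs/a.jar", 60))  # hit
print(cache.localize("hdfs:///libs/b.jar", 60))  # downloaded; a.jar evicted
```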
Resource Localization contd..
Example: the AM requests 2
resources while starting a container:
R1 – public
R2 – application
In a secure cluster, the users cannot be trusted. Confidential
data and application data need to be protected.
The Resource Manager and Node Managers are
started as the "yarn" (super) user.
All applications and containers run as the user
who submitted the job.
Use the LinuxContainerExecutor to launch user
processes (see container-executor.c).
Private localizers, too, run as the application user.
Tokens are set up while submitting the job:
AMRMToken :- for the AM to talk to the RM.
NMToken :- for the AM to talk to the NM for launching containers.
ContainerToken :- the way for the RM to pass
container information from the RM to the NM via the AM.
Contains resource and user information.
A token used by the private localizer
during resource localization.
RMDelegationToken :- useful when Kerberos
(a TGT) is not available.
Resource manager restart
Support for ZooKeeper- and HDFS-based state stores
Can recover applications from the saved
state; no need to resubmit the application.
Today only non-work-preserving restart is supported.
Lays the foundation for RM HA
The YARN paper received the best
paper award!
Non-work-preserving restart mode… almost done
Work-preserving mode… needs more effort
HA… just started
Task / container preemption
Support for long-running services
Different applications already
running on YARN:
Spark (real-time processing)
Apache HBase (HOYA)
Apache Helix (incubator project)
Apache Samza (incubator project)
Writing an application on YARN
Take a look at the Distributed Shell example.
Write an Application Master which, once started, will:
First register itself with the RM on the AMRM protocol.
Keep heartbeating and requesting resources via
the same protocol.
Use the container management protocol to launch
containers on the NM.
Once done, notify the RM via finishApplicationMaster.
Always use AMRMClient and NMClient when
talking to the RM / NM.
Use the distributed cache wisely.
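The AM lifecycle listed above can be simulated in a few lines. Real code would use YARN's Java `AMRMClient`/`NMClient`; the Python classes below are stand-ins invented for this sketch, and the method names only mirror the shape of that API.

```python
# Toy simulation of the AM control flow: register, heartbeat/allocate,
# launch containers, then unregister. Not the real YARN client API.

class FakeAMRMClient:
    def __init__(self):
        self.registered = False
        self.pending = 0

    def register_application_master(self):      # step 1: register with the RM
        self.registered = True

    def add_container_request(self, n=1):       # step 2: ask for resources
        self.pending += n

    def allocate(self):
        # One heartbeat: in this toy model the RM grants everything pending.
        granted, self.pending = self.pending, 0
        return [f"container_{i}" for i in range(granted)]

    def finish_application_master(self, status):  # step 4: unregister
        self.registered = False
        return status

class FakeNMClient:
    def __init__(self):
        self.launched = []

    def start_container(self, container, command):  # step 3: launch on the NM
        self.launched.append((container, command))

amrm, nm = FakeAMRMClient(), FakeNMClient()
amrm.register_application_master()
amrm.add_container_request(3)
for c in amrm.allocate():
    nm.start_container(c, "./worker.sh")  # hypothetical command
status = amrm.finish_application_master("SUCCEEDED")
print(status, len(nm.launched))
```

In a real AM the allocate loop runs repeatedly, since the RM typically grants containers incrementally across heartbeats rather than all at once.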
Want to contribute to open source?
Subscribe to the Apache user and yarn-dev / yarn-issues
mailing lists.
Post your questions on the user mailing list.
Try to be specific and add more information to
get better and quicker replies.
Try to be patient.
Start with simple tickets to get an idea of
the underlying component.