Building Applications on YARN




Chris Riccomini
10/11/2012
Staff Software Engineer at LinkedIn
http://riccomini.name
@criccomini
What I want to Talk About
Anatomy of a YARN Application

Things to consider when building your application
  Architecture
  Operations
Anatomy of a YARN App
Client

Application Master

Container Code

Resource Manager

Node Manager
Anatomy of a YARN App
[Diagram, simplified: the components above, labeled Client, RM (Resource Manager), NM (Node Manager), AM (Application Master), and CC (Container Code)]
A lot to consider
Deployment            Logging

Metrics               Fault Tolerance

Configuration         Isolation

Security              Dashboard

Language              State
Deployment
HDFS

HTTP

File (NFS)

DDOS’ing your servers

What we do: Tarball over HTTP. Life is easier with HDFS,
but operational overhead is too high.
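A minimal sketch of the "tarball over HTTP" approach, as it might look on the container side. The function name, URL, and destination directory are hypothetical; this is not the deployment code described on the slide. It assumes the archive comes from a trusted internal server, since unpacking an untrusted tarball is unsafe.

```python
import io
import tarfile
from urllib.request import urlopen


def fetch_and_unpack(url, dest_dir):
    """Download an application tarball and unpack it into dest_dir.

    Returns the member names found in the archive.
    """
    # Read the whole archive into memory; fine for small packages,
    # a real deployment would likely stream to disk instead.
    with urlopen(url) as resp:
        data = resp.read()
    # mode "r:*" lets tarfile detect the compression (gzip, bz2, none).
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
        # Assumption: the archive is trusted, so no member filtering is done.
        tar.extractall(dest_dir)
        return tar.getnames()
```

If every container fetches the package at the same moment, the HTTP server takes the full load at once, which is the DDoS concern listed above; staggering or caching the downloads is one way to soften that.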
Metrics
Application-level metrics

YARN-level metrics

metrics2

Containers are transient

What we do: Both app-level and framework-level metrics use
same metrics framework. Pipe to in-house metrics
dashboard. We don’t use metrics2 since we don’t want a
dependency on Hadoop in our core jar.
Metrics
Configuration
YARN config (yarn-site.xml, core-site.xml, etc.)

Application Configuration

Transporting Configuration

What we do: Config is fully resolved at client execution time.
No admin-override/locked config protection yet. Config is
passed from client to AM to containers via environment
variables.
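One way to carry a fully resolved config through environment variables can be sketched as follows. The variable name `APP_CONFIG_JSON` and both helper functions are assumptions made for illustration; the slide does not say how the config is encoded.

```python
import json
import os

# Hypothetical name of the environment variable that carries the config.
CONFIG_ENV_VAR = "APP_CONFIG_JSON"


def config_to_env(config):
    """Client side: serialize the resolved config into an env-var mapping."""
    return {CONFIG_ENV_VAR: json.dumps(config, sort_keys=True)}


def config_from_env(environ=None):
    """AM or container side: rebuild the config from the environment."""
    environ = os.environ if environ is None else environ
    # A missing variable yields an empty config rather than an error.
    return json.loads(environ.get(CONFIG_ENV_VAR, "{}"))
```

The client would merge the mapping returned by `config_to_env` into the environment of the AM's launch context, and the AM would do the same for each container it starts. Packing everything into one JSON value avoids restrictions on environment variable names, at the cost of being less readable in a process listing.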
Security
Kerberos?

Firewalls are your friend

Gateway machine

Dashboard

What we do: Firewall all YARN machines so they can only
talk to each other. All users go through an LDAP-controlled
dashboard.
Language
Favor complexity in Application Master, and make
container-logic thin

Talk to RM via REST

Potential to talk to RM via Protobuf RPC

What we do: The Application Master is written in Java. The task
side of the application has Python and Java implementations.
Logging
Local storage (application is running)

HDFS storage (application has stopped for a while)

Be careful with STDOUT/STDERR (rollover)

What we do: No HDFS. Logs sit for 7 days, then disappear.
Not ideal.
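As an illustration of avoiding unbounded STDOUT/STDERR files, container code could write to a rolling log file instead. This sketch uses Python's standard `logging` module; the function name, file name, and the choice of seven daily backups (loosely mirroring the 7-day retention above) are assumptions, not the setup described on the slide.

```python
import logging
import logging.handlers
import os


def make_container_logger(log_dir, name="container"):
    """Return a logger that rolls its file at midnight and keeps 7 old files."""
    os.makedirs(log_dir, exist_ok=True)
    handler = logging.handlers.TimedRotatingFileHandler(
        os.path.join(log_dir, name + ".log"),
        when="midnight",   # start a new file once a day
        backupCount=7,     # delete rolled files beyond the newest seven
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    # Keep records out of the root logger, which may print to STDERR.
    logger.propagate = False
    return logger
```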
Fault Tolerance
Failure matrix

HA RM/NM

Orphaned processes

Pay attention to process trees

What we do: No HA. Manual failover when RM dies.
Orphaned process monitor (proc start time < RM start time).
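The orphan check above (process start time earlier than RM start time) can be sketched as a small filter. `ProcessInfo` and `find_orphans` are hypothetical names, and how the process list and the RM start time are collected is deliberately left out.

```python
from collections import namedtuple

# Hypothetical record for one application process on a node.
# start_time is a Unix timestamp in seconds.
ProcessInfo = namedtuple("ProcessInfo", ["pid", "start_time"])


def find_orphans(processes, rm_start_time):
    """Return the pids of processes that started before the RM did.

    Such processes were launched under a previous RM instance, so the
    current RM has no record of them.
    """
    return [p.pid for p in processes if p.start_time < rm_start_time]
```

For example, with the RM started at time 1500, a process started at 1000 is flagged and one started at 2000 is not:

```python
procs = [ProcessInfo(101, 1000.0), ProcessInfo(102, 2000.0)]
orphans = find_orphans(procs, 1500.0)  # [101]
```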
Isolation
Memory

Disk

CPU

Network

What we do: Nothing, right now. Hoping YARN will solve
this before we need it (cgroups?).
Dashboard
Application-specific information

Integrate with YARN

Application Master or Standalone?

What we do: Dashboard enforces security, talks to RM/AM
via HTTP/JSON to get information about jobs.
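A rough sketch of how a dashboard might read job information from the RM over HTTP/JSON. It assumes the ResourceManager REST endpoint `/ws/v1/cluster/apps` and its `{"apps": {"app": [...]}}` response shape are available in the Hadoop version in use; the function names and the selected fields are illustrative only.

```python
import json
from urllib.request import urlopen


def parse_apps(payload):
    """Extract a few fields per application from the RM's JSON response."""
    doc = json.loads(payload)
    # The RM may return {"apps": null} when nothing is running.
    apps = (doc.get("apps") or {}).get("app") or []
    return [
        {
            "id": app.get("id"),
            "name": app.get("name"),
            "state": app.get("state"),
            "user": app.get("user"),
        }
        for app in apps
    ]


def fetch_apps(rm_base_url):
    """Query the RM web service, e.g. rm_base_url='http://rm-host:8088'."""
    url = rm_base_url.rstrip("/") + "/ws/v1/cluster/apps"
    with urlopen(url) as resp:
        return parse_apps(resp.read().decode("utf-8"))
```

Application-specific details would come from a similar call to an HTTP endpoint exposed by the AM itself, whose format is up to the application.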
State
HDFS

Deployed with Application

Remote data store

What we do: Nothing, right now.
Takeaways
There’s a lot more than just the YARN API

Look for examples (Spark, Storm, MapReduce)

Decide your level of Hadoop integration
  Metrics2

  HDFS

  Config

  Kerberos and doAs
Questions?
