Configuration
YARN config (yarn-site.xml, core-site.xml, etc.)
Application Configuration
Transporting Configuration
What we do: Config is fully resolved at client execution time.
No admin-override/locked config protection yet. Config is
passed from client to AM to containers via environment
variables.
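A minimal sketch of the environment-variable hand-off described above, in Python. The APP_CFG_ prefix and both function names are hypothetical and chosen for illustration; the talk does not say how keys are named or encoded. Values are JSON-encoded so types survive the trip, and keys are normalized because dots and dashes are not portable in environment variable names.

```python
import json
import os

# Hypothetical prefix, used only to tell our config apart from other variables.
ENV_PREFIX = "APP_CFG_"


def config_to_env(config):
    """Flatten a fully resolved config dict into environment variables."""
    env = {}
    for key, value in config.items():
        # Normalize the key: upper-case it and replace dots and dashes.
        name = ENV_PREFIX + key.upper().replace(".", "_").replace("-", "_")
        # JSON-encode the value so numbers and booleans keep their type.
        env[name] = json.dumps(value)
    return env


def config_from_env(environ=None):
    """Rebuild the config inside the AM or a container from its environment."""
    environ = os.environ if environ is None else environ
    config = {}
    for name, raw in environ.items():
        if name.startswith(ENV_PREFIX):
            config[name[len(ENV_PREFIX):]] = json.loads(raw)
    return config
```

The client would merge the result of config_to_env into the environment of the AM launch context, and the AM would do the same for each container. Note that keys come back in their normalized form, not the original spelling.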
Security
Kerberos?
Firewalls are your friend
Gateway machine
Dashboard
What we do: Firewall all YARN machines so they can only
talk to each other. All users go through an LDAP-controlled
dashboard.
Language
Favor complexity in Application Master, and make
container-logic thin
Talk to RM via REST
Potential to talk to RM via Protobuf RPC
What we do: The Application Master is Java. The task side of
the application has Python and Java implementations.
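As a rough illustration of talking to the RM via REST, the sketch below lists applications through the ResourceManager's cluster applications resource (/ws/v1/cluster/apps). The function names and the host in the usage note are assumptions, and the sketch assumes an unsecured HTTP endpoint with no Kerberos or SPNEGO in front of it.

```python
import json
import urllib.request


def apps_url(rm_address, states=None):
    """Build the URL of the RM's cluster applications resource."""
    url = "http://%s/ws/v1/cluster/apps" % rm_address
    if states:
        # The resource accepts a comma-separated list of application states.
        url += "?states=" + ",".join(states)
    return url


def parse_apps(body):
    """Return {application id: state} from an apps response body."""
    payload = json.loads(body)
    # Tolerate "apps": null, which can appear when nothing matches.
    apps = (payload.get("apps") or {}).get("app", [])
    return {app["id"]: app["state"] for app in apps}


def fetch_apps(rm_address, states=None, timeout=10):
    """Query the RM and return {application id: state}."""
    request = urllib.request.Request(
        apps_url(rm_address, states),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_apps(response.read().decode("utf-8"))
```

A call such as fetch_apps("rm.example.com:8088", ["RUNNING"]) would return the running applications. Because this is plain HTTP and JSON, the same approach works from any language without pulling in the Hadoop client libraries.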
Logging
Local storage (application is running)
HDFS storage (application has stopped for a while)
Be careful with STDOUT/STDERR (rollover)
What we do: No HDFS. Logs sit for 7 days, then disappear.
Not ideal.
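The 7-day retention above could be approximated with a sweep like the following sketch. It is not the mechanism used in the talk, which is not described; the function name, the use of file modification time, and the idea of a periodic sweep are all assumptions.

```python
import os
import time

# Seven days, matching the retention period mentioned above.
RETENTION_SECONDS = 7 * 24 * 60 * 60


def expired_logs(log_dir, now=None, retention=RETENTION_SECONDS):
    """Return log files under log_dir last modified more than `retention` seconds ago."""
    now = time.time() if now is None else now
    expired = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            # Modification time is a proxy for "the application stopped writing".
            if now - os.path.getmtime(path) > retention:
                expired.append(path)
    return sorted(expired)
```

A sweep job would either delete the returned paths or, in a setup that does use HDFS, copy them there first.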
Fault Tolerance
Failure matrix
HA RM/NM
Orphaned processes
Pay attention to process trees
What we do: No HA. Manual failover when the RM dies.
Orphaned process monitor (proc start time < RM start time).
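The orphan rule above (process start time < RM start time) can be sketched as a small filter. The Proc type and find_orphans are hypothetical names. The sketch assumes the caller has already narrowed the list to container processes and that both timestamps use the same clock and units; one possible source for the RM start time is the startedOn field of the RM's /ws/v1/cluster/info resource.

```python
from collections import namedtuple

# Hypothetical record for a container process seen on a node.
Proc = namedtuple("Proc", ["pid", "start_time"])


def find_orphans(procs, rm_start_time):
    """Return processes that started before the current RM instance did.

    After an RM restart without HA, the new RM knows nothing about
    containers launched under the previous instance, so anything older
    than the RM itself is treated as orphaned.
    """
    return [p for p in procs if p.start_time < rm_start_time]
```

For example, with an RM start time of 200, a process started at 100 is reported and processes started at 200 or 300 are not. A real monitor would also kill the whole process tree, not just the reported pid.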
Takeaways
There’s a lot more than just the YARN API
Look for examples (Spark, Storm, MapReduce)
Decide your level of Hadoop integration
Metrics2
HDFS
Config
Kerberos and doAs