YARN config (yarn-site.xml, core-site.xml, etc)
What we do: Config is fully resolved at client execution time.
No admin-override/locked config protection yet. Config is
passed from client to AM to containers via environment
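
A minimal sketch of the hand-off described above, assuming "environment" means process environment variables. The variable name APP_RESOLVED_CONFIG and the JSON-plus-base64 encoding are illustrative assumptions, not the scheme the authors use.

```python
import base64
import json
import os

# Hypothetical variable name; the source does not say which one is used.
ENV_KEY = "APP_RESOLVED_CONFIG"


def encode_config(config):
    """Serialize a fully resolved config dict into an env-safe ASCII string."""
    raw = json.dumps(config, sort_keys=True).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def child_environment(config, base_env=None):
    """Return a copy of the environment with the resolved config attached.

    The client would use this when launching the AM, and the AM would reuse
    it when launching containers.
    """
    env = dict(os.environ if base_env is None else base_env)
    env[ENV_KEY] = encode_config(config)
    return env


def decode_config(environ=None):
    """Recover the config dict inside the AM or a container."""
    environ = os.environ if environ is None else environ
    return json.loads(base64.b64decode(environ[ENV_KEY]).decode("utf-8"))
```

Because the config is resolved once on the client, every downstream process sees the same values, but nothing stops a client from overriding a setting an admin would prefer to lock.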
Firewalls are your friend
What we do: Firewall all YARN machines so they can only
talk to each other. All users go through LDAP-controlled
Favor complexity in Application Master, and make
Talk to RM via REST
Potential to talk to RM via Protobuf RPC
What we do: The application's AM is written in Java. The task
side of the application has Python and Java implementations.
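
As an illustration of the REST option, here is a small sketch that lists applications through the ResourceManager's /ws/v1/cluster/apps endpoint. The host name in the usage note and the function names are hypothetical, and the sketch is in Python for brevity even though the AM described here is Java.

```python
import json
import urllib.request


def apps_url(rm_address, states=None):
    """Build the Cluster Applications URL, optionally filtered by state."""
    url = "http://%s/ws/v1/cluster/apps" % rm_address
    if states:
        url += "?states=" + ",".join(states)
    return url


def parse_apps(body):
    """Extract (id, state) pairs from the JSON response body.

    The RM returns {"apps": null} when there are no matching applications,
    so both levels are treated as optional.
    """
    payload = json.loads(body)
    apps = (payload.get("apps") or {}).get("app") or []
    return [(app["id"], app["state"]) for app in apps]


def list_apps(rm_address, states=None, timeout=10):
    """Fetch and parse the application list from a live ResourceManager."""
    request = urllib.request.Request(
        apps_url(rm_address, states),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_apps(response.read().decode("utf-8"))
```

A call such as list_apps("rm.example.internal:8088", ["RUNNING"]) would return the running applications. Keeping parsing separate from the HTTP call makes the logic testable without a cluster.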
Local storage (application is running)
HDFS storage (application has stopped for a while)
Be careful with STDOUT/STDERR (rollover)
What we do: No HDFS. Logs sit for 7 days, then disappear.
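
The 7-day retention could be enforced with a periodic sweep like the sketch below. It assumes logs live under a single local directory and that file modification time is an acceptable age signal. Both are assumptions, as are the function names.

```python
import os
import time

# Seven days, matching the retention period stated above.
RETENTION_SECONDS = 7 * 24 * 60 * 60


def expired_logs(log_dir, now=None, retention=RETENTION_SECONDS):
    """Return paths under log_dir whose mtime is older than the retention."""
    now = time.time() if now is None else now
    expired = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            if now - os.path.getmtime(path) > retention:
                expired.append(path)
    return sorted(expired)


def purge_logs(log_dir, now=None, retention=RETENTION_SECONDS):
    """Delete expired log files and return the list of removed paths."""
    removed = expired_logs(log_dir, now, retention)
    for path in removed:
        os.remove(path)
    return removed
```

Using mtime rather than creation time means a log that is still being written (or rolled over) is not deleted out from under a running application.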
Pay attention to process trees
What we do: No HA. Manual failover when the RM dies.
Orphaned process monitor (proc start time < RM start time).
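
The orphan check reduces to a single comparison, sketched below. How process and RM start times are collected is left out, and the function name and the dict-of-timestamps input are assumptions made for illustration.

```python
def find_orphans(process_start_times, rm_start_time):
    """Return PIDs that started before the current RM did.

    process_start_times maps PID -> start time (seconds since the epoch).
    With no HA, any application process that predates the running RM was
    launched under a previous RM instance and is treated as orphaned.
    """
    return sorted(
        pid
        for pid, started in process_start_times.items()
        if started < rm_start_time
    )
```

A monitor would then walk the process tree of each returned PID before killing anything, so that children of an orphan are cleaned up as well.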