YARN Under The Hood


About Hadoop YARN internals. Beware: it contains code snippets from JobTracker :)

Published in: Technology, Education

  1. Apache Hadoop YARN - Under the Hood
     Sharad Agarwal
     sharad@apache.org
  2. About me
     • Apache Foundation
       – Hadoop Committer and PMC member
       – Hadoop MR contributor for over 4 years
       – Author of Hadoop Nextgen/YARN core and MR Application Master
       – Organizer of Hadoop Bangalore Meetups
     • Head of Technology Platforms @InMobi
       – Formerly Architect @Yahoo!
  3. YARN CORE
     • Component Architecture
     • Concurrency
     • State Management
  4. Concurrent Systems
  5. Concurrent System?

     // Loop through all expired items in the queue
     //
     // Need to lock the JobTracker here since we are
     // manipulating its data-structures via
     // ExpireTrackers.run -> JobTracker.lostTaskTracker ->
     // JobInProgress.failedTask -> JobTracker.markCompleteTaskAttempt
     // Also need to lock JobTracker before locking taskTracker &
     // trackerExpiryQueue to prevent deadlock

     // This method is synchronized to make sure that the locking order
     // "taskTrackers lock followed by faultyTrackers.potentiallyFaultyTrackers
     // lock" is under JobTracker lock to avoid deadlocks.
  6. JobTracker Global Lock
     [Diagram labels: JobTracker, JIP, TIP, Scheduler, Heartbeat Request, Heartbeat Response]
  7. Highly Concurrent Systems
     • scale much better (if done right)
     • make effective use of multi-core hardware
     • are even more important in master-slave architectures
     • can bring overall job latencies down drastically
     • make managing eventual consistency of state hard
     • need a systematic framework to manage this
  8. Event Model
     [Diagram labels: Event Queue, Event Dispatcher, Component A, Component B, Component N]
     • Mutations only via events
     • Components only expose read APIs
     • Use re-entrant locks
     • Components follow a clear lifecycle
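
The event model on this slide can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not YARN's actual dispatcher: `EventModelSketch`, `Dispatcher`, `TaskTable`, `Event` and the two event types are hypothetical names made up for the example. For simplicity `drain()` runs on the caller's thread; a real dispatcher would presumably run that loop on a thread of its own.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical sketch of the event model; none of these names are YARN classes.
class EventModelSketch {
    enum EventType { TASK_STARTED, TASK_FINISHED }

    // Immutable event: the only way to request a state change.
    static final class Event {
        final EventType type;
        final String taskId;
        Event(EventType type, String taskId) { this.type = type; this.taskId = taskId; }
    }

    // A component: state is mutated only in handle(); callers get a read API.
    static final class TaskTable {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final Set<String> running = new HashSet<>();

        void handle(Event e) {
            lock.writeLock().lock();
            try {
                if (e.type == EventType.TASK_STARTED) running.add(e.taskId);
                else running.remove(e.taskId);
            } finally {
                lock.writeLock().unlock();
            }
        }

        int runningCount() {                      // read-only API
            lock.readLock().lock();
            try { return running.size(); } finally { lock.readLock().unlock(); }
        }
    }

    // Event queue plus dispatcher: routes each event to its registered handlers.
    static final class Dispatcher {
        private final Queue<Event> queue = new ConcurrentLinkedQueue<>();
        private final Map<EventType, List<Consumer<Event>>> handlers =
                new EnumMap<>(EventType.class);

        // Handlers are registered once at setup time, before events flow.
        void register(EventType type, Consumer<Event> handler) {
            handlers.computeIfAbsent(type, t -> new ArrayList<>()).add(handler);
        }

        void post(Event e) { queue.add(e); }      // producers never touch components

        // Delivers every queued event; returns how many were dispatched.
        int drain() {
            int dispatched = 0;
            Event e;
            while ((e = queue.poll()) != null) {
                List<Consumer<Event>> hs = handlers.get(e.type);
                if (hs != null) {
                    for (Consumer<Event> h : hs) h.accept(e);
                }
                dispatched++;
            }
            return dispatched;
        }
    }

    public static void main(String[] args) {
        Dispatcher dispatcher = new Dispatcher();
        TaskTable table = new TaskTable();
        dispatcher.register(EventType.TASK_STARTED, table::handle);
        dispatcher.register(EventType.TASK_FINISHED, table::handle);
        dispatcher.post(new Event(EventType.TASK_STARTED, "t1"));
        dispatcher.drain();
        System.out.println("running tasks: " + table.runningCount());
    }
}
```

Because producers only post events and readers only call `runningCount()`, no caller ever needs to know a lock ordering, which is the contrast with the JobTracker comments on slide 5.
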
  9. Asynchronous Heartbeat Handling
     [Diagram labels: Heartbeat Listener, Event Q, NM Info, Heartbeat Request, Get commands, Heartbeat Response]
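
One possible reading of this diagram, sketched in Java: the heartbeat listener enqueues the reported status as an event, reads any pending commands from the per-node info, and replies immediately, while a separate loop applies the queued status later. The flow is an assumption inferred from the slide's labels, and every name here (`AsyncHeartbeatSketch`, `NodeInfo`, `StatusEvent`, the command strings) is hypothetical rather than taken from YARN.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch; names and flow are assumptions, not YARN's implementation.
class AsyncHeartbeatSketch {

    // What a node reported in one heartbeat.
    static final class StatusEvent {
        final String nodeId;
        final int runningContainers;
        StatusEvent(String nodeId, int runningContainers) {
            this.nodeId = nodeId;
            this.runningContainers = runningContainers;
        }
    }

    // Per-node record ("NM Info" on the slide): pending commands plus last status.
    static final class NodeInfo {
        private final Queue<String> pendingCommands = new ConcurrentLinkedQueue<>();
        private volatile int lastReportedContainers;

        void addCommand(String command) { pendingCommands.add(command); }

        // Removes and returns all commands queued for this node.
        List<String> takeCommands() {
            List<String> out = new ArrayList<>();
            String c;
            while ((c = pendingCommands.poll()) != null) out.add(c);
            return out;
        }

        int lastReportedContainers() { return lastReportedContainers; }

        void apply(StatusEvent e) { lastReportedContainers = e.runningContainers; }
    }

    static final class HeartbeatListener {
        private final Queue<StatusEvent> eventQueue = new ConcurrentLinkedQueue<>();
        private final ConcurrentMap<String, NodeInfo> nodes = new ConcurrentHashMap<>();

        NodeInfo node(String nodeId) {
            return nodes.computeIfAbsent(nodeId, id -> new NodeInfo());
        }

        // RPC path: enqueue the status, fetch commands, return. No global lock.
        List<String> heartbeat(String nodeId, int runningContainers) {
            eventQueue.add(new StatusEvent(nodeId, runningContainers));
            return node(nodeId).takeCommands();
        }

        // Dispatcher path: apply queued status updates; returns how many were applied.
        int processEvents() {
            int applied = 0;
            StatusEvent e;
            while ((e = eventQueue.poll()) != null) {
                node(e.nodeId).apply(e);
                applied++;
            }
            return applied;
        }
    }

    public static void main(String[] args) {
        HeartbeatListener listener = new HeartbeatListener();
        listener.node("nm1").addCommand("KILL container_1");
        System.out.println("commands: " + listener.heartbeat("nm1", 4));
        listener.processEvents();
        System.out.println("containers: " + listener.node("nm1").lastReportedContainers());
    }
}
```

The point of the split is that the response time of `heartbeat` no longer depends on how long the status update takes to process.
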
  10. MR AM
      [Diagram labels: RM, NM, Heartbeats, MR Scheduler, Container Launcher, Event Queue, Client Handler, Job, Task, Attempt, Task Listener, Heartbeats, Job Client, MR AM, Task]
  11. State Management
  12. Looks Familiar?
      • Very hard to maintain
      • Debugging even harder

      public synchronized void updateTaskStatus(TaskInProgress tip,
                                                TaskStatus status) {
        double oldProgress = tip.getProgress();   // save old progress
        boolean wasRunning = tip.isRunning();
        boolean wasComplete = tip.isComplete();
        boolean wasPending = tip.isOnlyCommitPending();
        TaskAttemptID taskid = status.getTaskID();
        boolean wasAttemptRunning = tip.isAttemptRunning(taskid);

        // If the TIP is already completed and the task reports as SUCCEEDED then
        // mark the task as KILLED.
        // In case of task with no promotion the task tracker will mark the task
        // as SUCCEEDED.
        // User has requested to kill the task, but TT reported SUCCEEDED,
        // mark the task KILLED.
        if ((wasComplete || tip.wasKilled(taskid)) &&
            (status.getRunState() == TaskStatus.State.SUCCEEDED)) {
          status.setRunState(TaskStatus.State.KILLED);
        }

        // If the job is complete and a task has just reported its
        // state as FAILED_UNCLEAN/KILLED_UNCLEAN,
        // make the tasks state FAILED/KILLED without launching cleanup attempt.
        // Note that if task is already a cleanup attempt,
        // we dont change the state to make sure the task gets a killTaskAction
        if ((this.isComplete() || jobFailed || jobKilled) &&
            !tip.isCleanupAttempt(taskid)) {
          if (status.getRunState() == TaskStatus.State.FAILED_UNCLEAN) {
            status.setRunState(TaskStatus.State.FAILED);
          } else if (status.getRunState() == TaskStatus.State.KILLED_UNCLEAN) {
            status.setRunState(TaskStatus.State.KILLED);
          }
        }
        // ... (excerpt; the slide cuts the method off here)
  13. Complex State Management
      • Lightweight state machine library
        – Declarative way of specifying the state transitions
        – Invalid transitions are handled automatically
        – Fits nicely with the event model
        – Debuggability is drastically improved: the lineage of object states can easily be determined
        – Handy when recovering state
  14. Declarative State Machine

      StateMachineFactory<JobImpl, JobState, JobEventType, JobEvent>
          stateMachineFactory
        = new StateMachineFactory<JobImpl, JobState, JobEventType, JobEvent>
            (JobState.NEW)

          // Transitions from NEW state
          .addTransition(JobState.NEW, JobState.NEW,
              JobEventType.JOB_DIAGNOSTIC_UPDATE,
              DIAGNOSTIC_UPDATE_TRANSITION)
          .addTransition(JobState.NEW, JobState.NEW,
              JobEventType.JOB_COUNTER_UPDATE, COUNTER_UPDATE_TRANSITION)
          .addTransition
              (JobState.NEW,
              EnumSet.of(JobState.INITED, JobState.FAILED),
              JobEventType.JOB_INIT,
              new InitTransition())
          .addTransition(JobState.NEW, JobState.KILLED,
              JobEventType.JOB_KILL, new KillNewJobTransition())
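
To make the idea behind the declarative table concrete, here is a much smaller, self-contained sketch of a table-driven state machine. It is not the `StateMachineFactory` shown on the slide: `StateMachineSketch`, `TransitionTable` and `Job` are hypothetical names, the set of states and events is trimmed down for the example, transitions carry no hook code, and throwing `IllegalStateException` on an undeclared transition is this sketch's own choice.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified sketch; not the YARN state machine library.
class StateMachineSketch {
    enum JobState { NEW, INITED, RUNNING, SUCCEEDED, KILLED }
    enum JobEventType { JOB_INIT, JOB_START, JOB_COMPLETED, JOB_KILL }

    // Declarative table: (current state, event) -> next state.
    static final class TransitionTable {
        private final Map<JobState, Map<JobEventType, JobState>> table =
                new EnumMap<>(JobState.class);

        TransitionTable addTransition(JobState from, JobState to, JobEventType on) {
            table.computeIfAbsent(from,
                    s -> new EnumMap<JobEventType, JobState>(JobEventType.class))
                 .put(on, to);
            return this;
        }

        // Any (state, event) pair that was not declared is rejected in one place.
        JobState next(JobState current, JobEventType event) {
            Map<JobEventType, JobState> row = table.get(current);
            JobState to = (row == null) ? null : row.get(event);
            if (to == null) {
                throw new IllegalStateException(
                        "Invalid event " + event + " in state " + current);
            }
            return to;
        }
    }

    static final TransitionTable TABLE = new TransitionTable()
            .addTransition(JobState.NEW, JobState.INITED, JobEventType.JOB_INIT)
            .addTransition(JobState.NEW, JobState.KILLED, JobEventType.JOB_KILL)
            .addTransition(JobState.INITED, JobState.RUNNING, JobEventType.JOB_START)
            .addTransition(JobState.RUNNING, JobState.SUCCEEDED, JobEventType.JOB_COMPLETED)
            .addTransition(JobState.RUNNING, JobState.KILLED, JobEventType.JOB_KILL);

    static final class Job {
        private JobState state = JobState.NEW;
        // Every state the job has been in, oldest first.
        private final List<JobState> history = new ArrayList<>();

        Job() { history.add(state); }

        void handle(JobEventType event) {
            state = TABLE.next(state, event);
            history.add(state);
        }

        JobState state() { return state; }
        List<JobState> history() { return new ArrayList<>(history); }
    }

    public static void main(String[] args) {
        Job job = new Job();
        job.handle(JobEventType.JOB_INIT);
        job.handle(JobEventType.JOB_START);
        System.out.println(job.history());
    }
}
```

Keeping the recorded `history` alongside the table is one simple way to get the state lineage that slide 13 mentions.
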
  15. YARN: New Possibilities
      • Open MPI - MR-2911
      • Master-Worker - MR-3315
      • Distributed Shell
      • Graph processing - Giraph-13
      • BSP - HAMA-431
      • CEP - Storm
        • https://github.com/nathanmarz/storm/issues/74
      • Iterative processing - Spark
        • https://github.com/mesos/spark-yarn/
  16. Thank You!
      @twitter: sharad_ag