SlideShare a Scribd company logo
1 of 17
Apache Hadoop
YARN - Under the Hood


      Sharad Agarwal
    sharad@apache.org
About me
• Apache Foundation
  – Hadoop Committer and PMC member
  – Hadoop MR contributor for over 4 years
  – Author of Hadoop Nextgen/YARN core and MR
    Application Master
  – Organizer of Hadoop Bangalore Meetups

• Head of Technology Platforms @InMobi
  – Formerly Architect @Yahoo!
• Component
              Architecture

YARN CORE   • Concurrency

            • State
              Management
Concurrent
 Systems
Concurrent System ?
// Loop through all expired items in the queue
//
// Need to lock the JobTracker here since we are
// manipulating it's data-structures via
// ExpireTrackers.run -> JobTracker.lostTaskTracker ->
// JobInProgress.failedTask -> JobTracker.markCompleteTaskAttempt
// Also need to lock JobTracker before locking 'taskTracker' &
// 'trackerExpiryQueue' to prevent deadlock




// This method is synchronized to make sure that the locking order
// "taskTrackers lock followed by faultyTrackers.potentiallyFaultyTrackers
// lock" is under JobTracker lock to avoid deadlocks.
JobTracker         JIP                TIP   Scheduler

 Heartbeat
 Request




Heartbeat
Response




                         JobTracker Global Lock
Highly Concurrent Systems
• scales much better (if done right)
• makes effective use of multi-core hardwares
• even more important in master slave
  architectures
• overall job latencies of the systems can come
  down drastically
• managing eventual consistency of states hard
• need for a systemic framework to manage this
Event Queue                   Event
                                                        Dispatcher




        Component          Component        Component
           A                  B                N

•   Mutations only via events
•   Components only expose Read APIs
•   Use Re-entrant locks
•   Components follow clear lifecycle


                              Event Model
Heartbeat
              Listener       Event Q            NM Info

 Heartbeat
 Request




                                       Get commands




Heartbeat
Response




                 Asynchronous
               Heartbeat Handling
RM                            NM

                 Heartbeats


                     MR Scheduler                 Container Launcher




                                    Event Queue




Client Handler    Job                Task         Attempt          Task Listener


                                                                          Heartbeats


Job Client                         MR AM                               Task
State
Management
Looks Familiar ?
public synchronized void updateTaskStatus(TaskInProgress tip,
                       TaskStatus status) {

  double oldProgress = tip.getProgress(); // save old progress
  boolean wasRunning = tip.isRunning();
                                                                              Very Hard to Maintain
  boolean wasComplete = tip.isComplete();
  boolean wasPending = tip.isOnlyCommitPending();                             Debugging even harder
  TaskAttemptID taskid = status.getTaskID();
  boolean wasAttemptRunning = tip.isAttemptRunning(taskid);

  // If the TIP is already completed and the task reports as SUCCEEDED then
  // mark the task as KILLED.
  // In case of task with no promotion the task tracker will mark the task
  // as SUCCEEDED.
  // User has requested to kill the task, but TT reported SUCCEEDED,
  // mark the task KILLED.
  if ((wasComplete || tip.wasKilled(taskid)) &&
     (status.getRunState() == TaskStatus.State.SUCCEEDED)) {
    status.setRunState(TaskStatus.State.KILLED);
  }

  // If the job is complete and a task has just reported its
  // state as FAILED_UNCLEAN/KILLED_UNCLEAN,
  // make the task's state FAILED/KILLED without launching cleanup attempt.
  // Note that if task is already a cleanup attempt,
  // we don't change the state to make sure the task gets a killTaskAction
  if ((this.isComplete() || jobFailed || jobKilled) &&
      !tip.isCleanupAttempt(taskid)) {
    if (status.getRunState() == TaskStatus.State.FAILED_UNCLEAN) {
      status.setRunState(TaskStatus.State.FAILED);
    } else if (status.getRunState() == TaskStatus.State.KILLED_UNCLEAN) {
      status.setRunState(TaskStatus.State.KILLED);
    }
  }
Complex State Management
• Light weight State Machines Library
  – Declarative way of specifying the state Transitions
  – Invalid transitions are handled automatically
  – Fits nicely with the event model
  – Debuggability is drastically improved. Lineage of
    object states can easily be determined
  – Handy while recovering the state
Declarative State Machine
StateMachineFactory<JobImpl, JobState, JobEventType, JobEvent> stateMachineFactory
    = new StateMachineFactory<JobImpl, JobState, JobEventType, JobEvent>(JobState.NEW)

     // Transitions from NEW state
     .addTransition(JobState.NEW, JobState.NEW,
        JobEventType.JOB_DIAGNOSTIC_UPDATE,
        DIAGNOSTIC_UPDATE_TRANSITION)
     .addTransition(JobState.NEW, JobState.NEW,
        JobEventType.JOB_COUNTER_UPDATE, COUNTER_UPDATE_TRANSITION)
     .addTransition
        (JobState.NEW,
        EnumSet.of(JobState.INITED, JobState.FAILED),
        JobEventType.JOB_INIT,
        new InitTransition())
     .addTransition(JobState.NEW, JobState.KILLED,
        JobEventType.JOB_KILL,
        new KillNewJobTransition())
YARN: New Possibilities
•   Open MPI - MR-2911
•   Master-Worker – MR-3315
•   Distributed Shell
•   Graph processing – Giraph-13
•   BSP – HAMA-431
•   CEP – Storm
     • https://github.com/nathanmarz/storm/issues/74
• Iterative processing - Spark
     • https://github.com/mesos/spark-yarn/
Thank You!
@twitter: sharad_ag

More Related Content

Viewers also liked

Hilton Situation Analysis
Hilton Situation AnalysisHilton Situation Analysis
Hilton Situation Analysisjlw264
 
YARN Hadoop Summit Bangalore 2011
YARN Hadoop Summit Bangalore 2011YARN Hadoop Summit Bangalore 2011
YARN Hadoop Summit Bangalore 2011Sharad Agarwal
 

Viewers also liked (7)

In House Lawyer Seminar
In House Lawyer SeminarIn House Lawyer Seminar
In House Lawyer Seminar
 
Cloud and compliance REX
Cloud and compliance REXCloud and compliance REX
Cloud and compliance REX
 
Hilton Situation Analysis
Hilton Situation AnalysisHilton Situation Analysis
Hilton Situation Analysis
 
σχολικά εφόδια
σχολικά  εφόδιασχολικά  εφόδια
σχολικά εφόδια
 
YARN Hadoop Summit Bangalore 2011
YARN Hadoop Summit Bangalore 2011YARN Hadoop Summit Bangalore 2011
YARN Hadoop Summit Bangalore 2011
 
Presentation IDS
Presentation IDSPresentation IDS
Presentation IDS
 
Lectures on typefaces
Lectures on typefacesLectures on typefaces
Lectures on typefaces
 

Similar to YARN Under The Hood

Hadoop MapReduce Introduction and Deep Insight
Hadoop MapReduce Introduction and Deep InsightHadoop MapReduce Introduction and Deep Insight
Hadoop MapReduce Introduction and Deep InsightHanborq Inc.
 
Apachecon Hadoop YARN - Under The Hood (at ApacheCon Europe)
Apachecon Hadoop YARN - Under The Hood (at ApacheCon Europe)Apachecon Hadoop YARN - Under The Hood (at ApacheCon Europe)
Apachecon Hadoop YARN - Under The Hood (at ApacheCon Europe)Sharad Agarwal
 
Implementing Lightweight Networking
Implementing Lightweight NetworkingImplementing Lightweight Networking
Implementing Lightweight Networkingguest6972eaf
 
Qtp92 Presentation
Qtp92 PresentationQtp92 Presentation
Qtp92 Presentationa34sharm
 
Async and parallel patterns and application design - TechDays2013 NL
Async and parallel patterns and application design - TechDays2013 NLAsync and parallel patterns and application design - TechDays2013 NL
Async and parallel patterns and application design - TechDays2013 NLArie Leeuwesteijn
 
Gearmanpresentation 110308165409-phpapp01
Gearmanpresentation 110308165409-phpapp01Gearmanpresentation 110308165409-phpapp01
Gearmanpresentation 110308165409-phpapp01longtuan
 
2010-04-13 Reactor Pattern & Event Driven Programming 2
2010-04-13 Reactor Pattern & Event Driven Programming 22010-04-13 Reactor Pattern & Event Driven Programming 2
2010-04-13 Reactor Pattern & Event Driven Programming 2Lin Jen-Shin
 
Batching and Java EE (jdk.io)
Batching and Java EE (jdk.io)Batching and Java EE (jdk.io)
Batching and Java EE (jdk.io)Ryan Cuprak
 
Container orchestration from theory to practice
Container orchestration from theory to practiceContainer orchestration from theory to practice
Container orchestration from theory to practiceDocker, Inc.
 
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.Puppet
 
Anatomy of classic map reduce in hadoop
Anatomy of classic map reduce in hadoop Anatomy of classic map reduce in hadoop
Anatomy of classic map reduce in hadoop Rajesh Ananda Kumar
 
DevoxxUK: Optimizating Application Performance on Kubernetes
DevoxxUK: Optimizating Application Performance on KubernetesDevoxxUK: Optimizating Application Performance on Kubernetes
DevoxxUK: Optimizating Application Performance on KubernetesDinakar Guniguntala
 
Java EE 7 Batch processing in the Real World
Java EE 7 Batch processing in the Real WorldJava EE 7 Batch processing in the Real World
Java EE 7 Batch processing in the Real WorldRoberto Cortez
 
Building React Applications with Redux
Building React Applications with ReduxBuilding React Applications with Redux
Building React Applications with ReduxFITC
 
Anatomy of an action
Anatomy of an actionAnatomy of an action
Anatomy of an actionGordon Chung
 
Real Time Operating System Concepts
Real Time Operating System ConceptsReal Time Operating System Concepts
Real Time Operating System ConceptsSanjiv Malik
 
Workflow as code with Azure Durable Functions
Workflow as code with Azure Durable FunctionsWorkflow as code with Azure Durable Functions
Workflow as code with Azure Durable FunctionsMassimo Bonanni
 

Similar to YARN Under The Hood (20)

Javantura v3 - Going Reactive with RxJava – Hrvoje Crnjak
Javantura v3 - Going Reactive with RxJava – Hrvoje CrnjakJavantura v3 - Going Reactive with RxJava – Hrvoje Crnjak
Javantura v3 - Going Reactive with RxJava – Hrvoje Crnjak
 
Hadoop MapReduce Introduction and Deep Insight
Hadoop MapReduce Introduction and Deep InsightHadoop MapReduce Introduction and Deep Insight
Hadoop MapReduce Introduction and Deep Insight
 
Apachecon Hadoop YARN - Under The Hood (at ApacheCon Europe)
Apachecon Hadoop YARN - Under The Hood (at ApacheCon Europe)Apachecon Hadoop YARN - Under The Hood (at ApacheCon Europe)
Apachecon Hadoop YARN - Under The Hood (at ApacheCon Europe)
 
Spring batch
Spring batchSpring batch
Spring batch
 
Implementing Lightweight Networking
Implementing Lightweight NetworkingImplementing Lightweight Networking
Implementing Lightweight Networking
 
Implementing Lightweight Networking
Implementing Lightweight NetworkingImplementing Lightweight Networking
Implementing Lightweight Networking
 
Qtp92 Presentation
Qtp92 PresentationQtp92 Presentation
Qtp92 Presentation
 
Async and parallel patterns and application design - TechDays2013 NL
Async and parallel patterns and application design - TechDays2013 NLAsync and parallel patterns and application design - TechDays2013 NL
Async and parallel patterns and application design - TechDays2013 NL
 
Gearmanpresentation 110308165409-phpapp01
Gearmanpresentation 110308165409-phpapp01Gearmanpresentation 110308165409-phpapp01
Gearmanpresentation 110308165409-phpapp01
 
2010-04-13 Reactor Pattern & Event Driven Programming 2
2010-04-13 Reactor Pattern & Event Driven Programming 22010-04-13 Reactor Pattern & Event Driven Programming 2
2010-04-13 Reactor Pattern & Event Driven Programming 2
 
Batching and Java EE (jdk.io)
Batching and Java EE (jdk.io)Batching and Java EE (jdk.io)
Batching and Java EE (jdk.io)
 
Container orchestration from theory to practice
Container orchestration from theory to practiceContainer orchestration from theory to practice
Container orchestration from theory to practice
 
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.
 
Anatomy of classic map reduce in hadoop
Anatomy of classic map reduce in hadoop Anatomy of classic map reduce in hadoop
Anatomy of classic map reduce in hadoop
 
DevoxxUK: Optimizating Application Performance on Kubernetes
DevoxxUK: Optimizating Application Performance on KubernetesDevoxxUK: Optimizating Application Performance on Kubernetes
DevoxxUK: Optimizating Application Performance on Kubernetes
 
Java EE 7 Batch processing in the Real World
Java EE 7 Batch processing in the Real WorldJava EE 7 Batch processing in the Real World
Java EE 7 Batch processing in the Real World
 
Building React Applications with Redux
Building React Applications with ReduxBuilding React Applications with Redux
Building React Applications with Redux
 
Anatomy of an action
Anatomy of an actionAnatomy of an action
Anatomy of an action
 
Real Time Operating System Concepts
Real Time Operating System ConceptsReal Time Operating System Concepts
Real Time Operating System Concepts
 
Workflow as code with Azure Durable Functions
Workflow as code with Azure Durable FunctionsWorkflow as code with Azure Durable Functions
Workflow as code with Azure Durable Functions
 

Recently uploaded

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

YARN Under The Hood

  • 1. Apache Hadoop YARN - Under the Hood Sharad Agarwal sharad@apache.org
  • 2. About me • Apache Foundation – Hadoop Committer and PMC member – Hadoop MR contributor for over 4 years – Author of Hadoop Nextgen/YARN core and MR Application Master – Organizer of Hadoop Bangalore Meetups • Head of Technology Platforms @InMobi – Formerly Architect @Yahoo!
  • 3. • Component Architecture YARN CORE • Concurrency • State Management
  • 5. Concurrent System ? // Loop through all expired items in the queue // // Need to lock the JobTracker here since we are // manipulating it's data-structures via // ExpireTrackers.run -> JobTracker.lostTaskTracker -> // JobInProgress.failedTask -> JobTracker.markCompleteTaskAttempt // Also need to lock JobTracker before locking 'taskTracker' & // 'trackerExpiryQueue' to prevent deadlock // This method is synchronized to make sure that the locking order // "taskTrackers lock followed by faultyTrackers.potentiallyFaultyTrackers // lock" is under JobTracker lock to avoid deadlocks.
  • 6. JobTracker JIP TIP Scheduler Heartbeat Request Heartbeat Response JobTracker Global Lock
  • 7. Highly Concurrent Systems • scales much better (if done right) • makes effective use of multi-core hardwares • even more important in master slave architectures • overall job latencies of the systems can come down drastically • managing eventual consistency of states hard • need for a systemic framework to manage this
  • 8. Event Queue Event Dispatcher Component Component Component A B N • Mutations only via events • Components only expose Read APIs • Use Re-entrant locks • Components follow clear lifecycle Event Model
  • 9. Heartbeat Listener Event Q NM Info Heartbeat Request Get commands Heartbeat Response Asynchronous Heartbeat Handling
  • 10. RM NM Heartbeats MR Scheduler Container Launcher Event Queue Client Handler Job Task Attempt Task Listener Heartbeats Job Client MR AM Task
  • 12. Looks Familiar ? public synchronized void updateTaskStatus(TaskInProgress tip, TaskStatus status) { double oldProgress = tip.getProgress(); // save old progress boolean wasRunning = tip.isRunning(); Very Hard to Maintain boolean wasComplete = tip.isComplete(); boolean wasPending = tip.isOnlyCommitPending(); Debugging even harder TaskAttemptID taskid = status.getTaskID(); boolean wasAttemptRunning = tip.isAttemptRunning(taskid); // If the TIP is already completed and the task reports as SUCCEEDED then // mark the task as KILLED. // In case of task with no promotion the task tracker will mark the task // as SUCCEEDED. // User has requested to kill the task, but TT reported SUCCEEDED, // mark the task KILLED. if ((wasComplete || tip.wasKilled(taskid)) && (status.getRunState() == TaskStatus.State.SUCCEEDED)) { status.setRunState(TaskStatus.State.KILLED); } // If the job is complete and a task has just reported its // state as FAILED_UNCLEAN/KILLED_UNCLEAN, // make the task's state FAILED/KILLED without launching cleanup attempt. // Note that if task is already a cleanup attempt, // we don't change the state to make sure the task gets a killTaskAction if ((this.isComplete() || jobFailed || jobKilled) && !tip.isCleanupAttempt(taskid)) { if (status.getRunState() == TaskStatus.State.FAILED_UNCLEAN) { status.setRunState(TaskStatus.State.FAILED); } else if (status.getRunState() == TaskStatus.State.KILLED_UNCLEAN) { status.setRunState(TaskStatus.State.KILLED); } }
  • 13.
  • 14. Complex State Management • Light weight State Machines Library – Declarative way of specifying the state Transitions – Invalid transitions are handled automatically – Fits nicely with the event model – Debuggability is drastically improved. Lineage of object states can easily be determined – Handy while recovering the state
  • 15. Declarative State Machine StateMachineFactory<JobImpl, JobState, JobEventType, JobEvent> stateMachineFactory = new StateMachineFactory<JobImpl, JobState, JobEventType, JobEvent>(JobState.NEW) // Transitions from NEW state .addTransition(JobState.NEW, JobState.NEW, JobEventType.JOB_DIAGNOSTIC_UPDATE, DIAGNOSTIC_UPDATE_TRANSITION) .addTransition(JobState.NEW, JobState.NEW, JobEventType.JOB_COUNTER_UPDATE, COUNTER_UPDATE_TRANSITION) .addTransition (JobState.NEW, EnumSet.of(JobState.INITED, JobState.FAILED), JobEventType.JOB_INIT, new InitTransition()) .addTransition(JobState.NEW, JobState.KILLED, JobEventType.JOB_KILL, new KillNewJobTransition())
  • 16. YARN: New Possibilities • Open MPI - MR-2911 • Master-Worker – MR-3315 • Distributed Shell • Graph processing – Giraph-13 • BSP – HAMA-431 • CEP – Storm • https://github.com/nathanmarz/storm/issues/74 • Iterative processing - Spark • https://github.com/mesos/spark-yarn/

Editor's Notes

  1. Does this look fragile. No one dares to touch this piece
  2. Does this look fragile. No one dares to touch this piece
  3. New Features and their impact in State Transitions can be articulated, discussed and designed