Apache Hadoop YARN: The Next-
generation Distributed Operating
System
Zhijie Shen & Jian He @ Hortonworks
About Us
● Two hats
o Software Engineer @ Hortonworks, Inc.
o Hadoop Committer @ The Apache Foundation
● We’re doing Apach...
Agenda
● What Is YARN
● YARN Basics
● Recent Development
● Writing Your YARN Applications
What is YARN (1)
Frequent questions on mailing list:
● What is Hadoop 2.0?
● What is YARN?
Hadoop 2.0
= YARN
= the operati...
What is YARN (2) - Motivation
● Hadoop is 8yr old
● World is significantly changed
o The hardware
o diverse data processin...
What is YARN (3) - Motivation
Flexibility - Enabling data processing model
more than MapReduce
What is YARN (5) - Motivation
● Scalability - Splitting
resource management
and job life cycle
management and
mointoring
●...
What Is YARN (4) - Motivation
● Resource Sharing - Multiple workloads in
cluster
What Is YARN (5) - Status
● YARN is shipped
● YARN is going to be better
● Hadoop 2.4 is coming this month!
● ResourceMana...
What Is YARN (6) - Status
YARN ecosystem
What is YARN (7) - Status
YARN community
Active committers working on YARN 10+
Active other contributors 20+
Total Jira ti...
Agenda
● What Is YARN
● YARN Basics
● Recent Development
● Writing Your YARN Applications
YARN Basics (1) - Concept
● ResourceManager -
Master of a cluster
● NodeManager - Slave
to take care of one host
● Applica...
YARN Basics (2) - RM
Component interfacing RM to the
clients:
● ClientRMService
Component interacting with the
per-applica...
YARN Basics (3) - NM
Component for NM-RM
communication:
● NodeStatusUpdater
Core component managing
containers on the node...
YARN Basics (4) - Workflow
Execution Sequence:
1. Client submits an application
2. RM allocates a container to start AM
3....
YARN Basics (5) - Scheduler
● FIFO
Simply first come, first serve
● Fair
o evenly share the resource across queues, among
...
Agenda
● What Is YARN
● YARN Basics
● Recent Development
● Writing Your YARN Applications
Apache Hadoop release
● Hadoop 2.4 is coming out very soon
● ResourceManager high availability
o RM Restart
o RM failover
...
ResourceManager High Availability
● RM is potentially a single point of failure.
o Restart for various reasons: Bugs, hard...
ResourceManager Restart
● RMState includes:
o static state (appSubmissionContext, credentials) , belongs to RM
o running s...
ResourceManager FailOver
● Active/Standby
o Leader election
(ZooKeeper)
● Standby on transition to
Active loads all the st...
Application Historic Data Service
● Motivation: MRv1 JobHisotoryServer
o MR job writes history data into HDFS.
o When job ...
Application Historic Data Service
● Goal: Solve the application-history problem in a generic
and more scalable way.
● A si...
Application Historic Server
Long Running Service(YARN-896)
● YARN is intended to be generic purpose.
o Short-lived (e.g.MR) apps vs long-lived apps
● ...
Agenda
● What Is YARN
● YARN Basics
● Recent Development
● Writing Your YARN Applications
Write a Yarn Applicaition
Write client and AM
Client
● Client – RM --- YarnClient
o submitApplication
ApplicationMaster
● AM – RM --- AMRMClient
o r...
Writing the YarnClient
1. Get the application Id from RM
1. Construct ApplicationSubmissionContext
a. Shell command to run...
Example: client submit an application
1. YarnClient yarnClient = YarnClient.createYarnClient();
1. YarnClientApplication a...
Writing ApplicationMaster
● The ApplicationMaster is started.
● In AM code, for each task, to start the container, we
repe...
Writing ApplicationMaster
1. AM registers with RM (registerApplicationMaster)
1. HeartBeats(allocate) with RM (asynchronou...
Writing ApplicationMaster
● RM assigns containers asynchronously
● Containers are likely not returned immediately at curre...
ContainerManagementProtocol
1. Start container (task) on NM
a. startContainer
1. Monitor the container(task) status
a. get...
Example - AM: request containers
AMRMClient rmClient = AMRMClient.createAMRMClient();
ContainerRequest containerAsk = new ...
Example - AM: start/stop containers
NMClient nmClient = NMClient.createNMClient();
for (Container container : response.get...
Takeaway
1. YARN is a resource-management platform that can host different data-
processing frameworks (beyond MapReduce)....
The book is out !
http://yarn-book.com/
http://yarn-book.com/
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distributed Operating System
Upcoming SlideShare
Loading in …5
×

ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distributed Operating System

713 views

Published on

For diverse organizations, Apache Hadoop has become the de-facto place where data & computational resources are shared. This broad usage has stretched its design beyond its intended target. To address this, Apache Hadoop community has come up with next generation of Hadoop’s compute platform: YARN.

YARN in a nutshell is the distributed Operating System of the big-data world. In this talk, we will introduce YARN, covering how the new architecture decouples programming model from resource management, scheduling functions, platform’s fault tolerance & high availability, tools for application tracing & analyses. We will then discuss the exciting ecosystem of Apache Software Foundation projects forming around YARN. We will conclude with a coverage on the applications & services being built around YARN platform which lets user chose the programming models choice, all on the same data.

Published in: Technology
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
713
On SlideShare
0
From Embeds
0
Number of Embeds
24
Actions
Shares
0
Downloads
21
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide
  • RescourceManager:
    Arbitrates resources among all the applications in the system
    NodeManager:
    the per-machine slave, which is responsible for launching the applications’ containers, monitoring their resource usage
    ApplicationMaster:
    Negatiate appropriate resource containers from the Scheduler, tracking their status and monitoring for progress
    Container:
    Unit of allocation incorporating resource elements such as memory, cpu, disk, network etc, to execute a specific task of the application (similar to map/reduce slots in MRv1)
  • A client program submits the application
    ResourceManager allocates a specified container to start the ApplicationMaster
    ApplicationMaster, on boot-up, registers with ResourceManager
    ApplicationMaster negotiates with ResourceManager for appropriate resource containers
    On successful container allocations, ApplicationMaster contacts NodeManager to launch the container
    Application code is executed within the container, and then ApplicationMaster is responded with the execution status
    During execution, the client communicates directly with ApplicationMaster or ResourceManager to get status, progress updates etc.
    Once the application is complete, ApplicationMaster unregisters with ResourceManager and shuts down, allowing its own
  • ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distributed Operating System

    1. 1. Apache Hadoop YARN: The Next- generation Distributed Operating System Zhijie Shen & Jian He @ Hortonworks
    2. 2. About Us ● Two hats o Software Engineer @ Hortonworks, Inc. o Hadoop Committer @ The Apache Foundation ● We’re doing Apache YARN!
    3. 3. Agenda ● What Is YARN ● YARN Basics ● Recent Development ● Writing Your YARN Applications
    4. 4. What is YARN (1) Frequent questions on mailing list: ● What is Hadoop 2.0? ● What is YARN? Hadoop 2.0 = YARN = the operating systems to run various distributed data processing applications
    5. 5. What is YARN (2) - Motivation ● Hadoop is 8yr old ● World is significantly changed o The hardware o diverse data processing model: interaction, streaming
    6. 6. What is YARN (3) - Motivation Flexibility - Enabling data processing model more than MapReduce
    7. 7. What is YARN (5) - Motivation ● Scalability - Splitting resource management and job life cycle management and mointoring ● Efficiency - Improve cluster utilization o distinct map/reduce slots on a host
    8. 8. What Is YARN (4) - Motivation ● Resource Sharing - Multiple workloads in cluster
    9. 9. What Is YARN (5) - Status ● YARN is shipped ● YARN is going to be better ● Hadoop 2.4 is coming this month! ● ResourceManager high availability ● Application historic data services ● Long-running applications optimization
    10. 10. What Is YARN (6) - Status YARN ecosystem
    11. 11. What is YARN (7) - Status YARN community Active committers working on YARN 10+ Active other contributors 20+ Total Jira tickets 1800+ Jira tickets that are resolved/closed 1100+
    12. 12. Agenda ● What Is YARN ● YARN Basics ● Recent Development ● Writing Your YARN Applications
    13. 13. YARN Basics (1) - Concept ● ResourceManager - Master of a cluster ● NodeManager - Slave to take care of one host ● ApplicationMaster - Master of an application ● Container - Resource abstraction, process to complete a task
    14. 14. YARN Basics (2) - RM Component interfacing RM to the clients: ● ClientRMService Component interacting with the per-application AMs: ● ApplicationMasterService Component connecting RM to the nodes: ● ResourceTrackerService Core of the ResourceManager ● Scheduler, managing apps
    15. 15. YARN Basics (3) - NM Component for NM-RM communication: ● NodeStatusUpdater Core component managing containers on the node: ● ContainerManager Component interacting with OS to start/stop the container process: ● ContainerExecutor
    16. 16. YARN Basics (4) - Workflow Execution Sequence: 1. Client submits an application 2. RM allocates a container to start AM 3. AM registers with RM 4. AM asks containers from RM 5. AM notifies NM to launch containers 6. Application code is executed in container 7. Client contacts RM/AM to monitor application’s status 8. AM unregisters with RM Client RM NM AM 1 2 3 4 5 7 8 6
    17. 17. YARN Basics (5) - Scheduler ● FIFO Simply first come, first serve ● Fair o evenly share the resource across queues, among applications ● Capacity o allowing certain resource capacity to queues
    18. 18. Agenda ● What Is YARN ● YARN Basics ● Recent Development ● Writing Your YARN Applications
    19. 19. Apache Hadoop release ● Hadoop 2.4 is coming out very soon ● ResourceManager high availability o RM Restart o RM failover ● Application Historic Data Service o MRv1 JobHisotoryServer.
    20. 20. ResourceManager High Availability ● RM is potentially a single point of failure. o Restart for various reasons: Bugs, hardware failures, deliberate down- time for upgrades ● Goal: RM down-time invisible to end-users. ● Includes two parts: o RM Restart o RM Failover
    21. 21. ResourceManager Restart ● RMState includes: o static state (appSubmissionContext, credentials) , belongs to RM o running state - per AM  scheduler state, i.e. per-app scheduling state, resource consumptions ● Overly complex in MRv1 for the fact that JobTracker has to save too much information: both static state and running state. ● YARN RM only persists static state, because AM and RM are running on different machines on YARN o Persist application submission metadata and gathers running state from AM and NM
    22. 22. ResourceManager FailOver ● Active/Standby o Leader election (ZooKeeper) ● Standby on transition to Active loads all the state from the state store. ● NM, AM, clients, redirect to the new RM o RMProxy
    23. 23. Application Historic Data Service ● Motivation: MRv1 JobHisotoryServer o MR job writes history data into HDFS. o When job finishes and removes from memory, Clients request will be redirected to JHS. o Reads the this persistent history data in a on-demand way to serve the requests.
    24. 24. Application Historic Data Service ● Goal: Solve the application-history problem in a generic and more scalable way. ● A single application history server to serve all applications’ requests. ● ResourceManager records generic application information o Application, ApplicationAttempt, Container ● ApplicationMaster writes framework specific information o Free for users to define ● Multiple interfaces to inquiry the historic information o RPC, RESTful Services o History server Web UI
    25. 25. Application Historic Server
    26. 26. Long Running Service(YARN-896) ● YARN is intended to be generic purpose. o Short-lived (e.g.MR) apps vs long-lived apps ● Goal: disruptive impact minimum to running applications. o Work-preserving AM restart.  Single point of failure for application’s point of view  Not killing containers  AM rebind with previous running containers. o Work-preserving NM restart.  containers process  Previous running containers rebind with new NM.
    27. 27. Agenda ● What Is YARN ● YARN Basics ● Recent Development ● Writing Your YARN Applications
    28. 28. Write a Yarn Applicaition
    29. 29. Write client and AM Client ● Client – RM --- YarnClient o submitApplication ApplicationMaster ● AM – RM --- AMRMClient o registerApplicationMaster o allocate o finishApplicationMaster ● AM – NM --- NMClient o startContainer o getContainerStatus o stopContainer
    30. 30. Writing the YarnClient 1. Get the application Id from RM 1. Construct ApplicationSubmissionContext a. Shell command to run the AM b. Environment (class path, env-variable) c. LocalResources (Job jars downloaded from HDFS) 1. Submit the request to RM a. submitApplication
    31. 31. Example: client submit an application 1. YarnClient yarnClient = YarnClient.createYarnClient(); 1. YarnClientApplication app = yarnClient.createApplication(); 1. appContext = app.getAppSubmissionContext() …….client logic ….. set command: "$JAVA_HOME/bin/java" + "ApplicationMaster" + command set environments: class path etc. set jars: ………………………….. 1. yarnClient.submitApplication(appContext);
    32. 32. Writing ApplicationMaster ● The ApplicationMaster is started. ● In AM code, for each task, to start the container, we repeat a similar procedure as starting the master. o construct ContainerLaunchContext  commands  env  jars
    33. 33. Writing ApplicationMaster 1. AM registers with RM (registerApplicationMaster) 1. HeartBeats(allocate) with RM (asynchronously) a. send the Request i. Request new containers. ii. Release containers. b. Received containers and send request to NM to start the container 1. Unregisters with RM (finishApplicationMaster)
    34. 34. Writing ApplicationMaster ● RM assigns containers asynchronously ● Containers are likely not returned immediately at current call. ● User needs to give empty requests until it gets the containers it requested. ● ResourceRequest is incremental.
    35. 35. ContainerManagementProtocol 1. Start container (task) on NM a. startContainer 1. Monitor the container(task) status a. getContainerStatus 1. Stop the the container (task) a. stopContainer
    36. 36. Example - AM: request containers AMRMClient rmClient = AMRMClient.createAMRMClient(); ContainerRequest containerAsk = new ContainerRequest(capability) rmClient.addContainerRequest(containerAsk); Wait until get the containers: do { AllocateResponse response = rmClient.allocate(0); Thread.sleep(100); } while (response.getAllocatedContainers().size() > 0)
    37. 37. Example - AM: start/stop containers NMClient nmClient = NMClient.createNMClient(); for (Container container : response.getAllocatedContainers()) { ContainerLaunchContext ctx = ContainerLaunchContext.newInstace(); ctx.setCommands(command); ctx.setEnvironment(envs); nmClient.startContainer(container, ctx); } -monitor container: nmClient.getContainerStatus(containerId) -stop container: nmClient.stopContainer(ContainerId)
    38. 38. Takeaway 1. YARN is a resource-management platform that can host different data- processing frameworks (beyond MapReduce). o No longer limited by MR, (batch process, interactive query, stream) o All applications can now co-exist and share the same data on HDFS. 1. YARN is in production now. o Yahoo, Ebay. 1. We welcome your contributions. o patches, tests, writing applications on YARN o https://issues.apache.org/jira/browse/YARN o https://github.com/hortonworks/simple-yarn-app
    39. 39. The book is out ! http://yarn-book.com/ http://yarn-book.com/

    ×