An Introduction to Apache Hadoop Yarn

Apache Hadoop Yarn
● What is Yarn
● Problems with Hadoop 1.0
● What does Yarn do ?
● Old Architecture
● New Architecture
● Yarn Example
● Additions

Hadoop Yarn – What is it ?
● Next generation MapReduce (MRv2)
● Splits the two roles of the Job Tracker into separate daemons
– Resource management (Resource Manager)
– Job scheduling / monitoring (Application Master)
● Improves scaling
● Improves resource management
● Already used by Yahoo

Problems with Hadoop 1.0
● Scaling problems on large clusters
– More than 4000 nodes
– More than 40k concurrent tasks
● Problems with resource utilization
● Task slots are fixed as either Map or Reduce
● Single NameNode, single point of failure
● Clients and Cluster must be at same version

What does Yarn do ?
● Provides a cluster level resource manager
● Adds application level resource management
● Provides containers for jobs other than Map / Reduce
● Improves resource utilization
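
The utilization point can be illustrated with a small model. This is a sketch only: the function names and the node sizes (4 Map slots plus 4 Reduce slots, versus 8 generic containers) are hypothetical figures chosen for illustration, not values from Hadoop.

```python
def running_with_fixed_slots(map_slots, reduce_slots, map_tasks, reduce_tasks):
    """Hadoop 1.0 style: a slot is typed, so an idle Reduce slot cannot run a Map task."""
    return min(map_slots, map_tasks) + min(reduce_slots, reduce_tasks)


def running_with_generic_containers(containers, map_tasks, reduce_tasks):
    """Yarn style: any free container can run any kind of task."""
    return min(containers, map_tasks + reduce_tasks)


# Hypothetical map-heavy phase: 8 Map tasks waiting, no Reduce tasks yet.
pending_map, pending_reduce = 8, 0

fixed = running_with_fixed_slots(4, 4, pending_map, pending_reduce)
shared = running_with_generic_containers(8, pending_map, pending_reduce)

print("fixed slots in use:", fixed, "of 8")
print("generic containers in use:", shared, "of 8")
```

With typed slots the Reduce half of the node sits idle during a Map phase; with untyped containers the same capacity is available to whatever work is pending.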

Old Architecture
● One cluster level Job Tracker, one Task Tracker on each data node

New Architecture
● Resource Manager
– Cluster level resource manager
– Long life
● Node Manager
– One per data server
– Monitors resources on node
● Application Master
– One per application
– Short life
– Manages task / scheduling
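
The three roles above can be modelled roughly as follows. This is an illustrative sketch, not the Yarn API: the class fields, the method names, and the memory figures are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """One per data server; monitors the resources on its node."""
    host: str
    total_memory_mb: int
    used_memory_mb: int = 0

    def free_memory_mb(self):
        return self.total_memory_mb - self.used_memory_mb


@dataclass
class ApplicationMaster:
    """One per application; short lived, manages that application's tasks."""
    app_id: str
    pending_tasks: list = field(default_factory=list)


@dataclass
class ResourceManager:
    """One per cluster; long lived, outlives the applications it serves."""
    nodes: dict = field(default_factory=dict)
    apps: dict = field(default_factory=dict)

    def register_node(self, node):
        self.nodes[node.host] = node

    def register_app(self, app):
        self.apps[app.app_id] = app

    def unregister_app(self, app_id):
        self.apps.pop(app_id, None)

    def cluster_free_memory_mb(self):
        return sum(node.free_memory_mb() for node in self.nodes.values())


# Hypothetical two node cluster.
rm = ResourceManager()
rm.register_node(NodeManager("node1", total_memory_mb=8192, used_memory_mb=2048))
rm.register_node(NodeManager("node2", total_memory_mb=8192))

# An application comes and goes; the Resource Manager and its nodes remain.
rm.register_app(ApplicationMaster("app-1", pending_tasks=["map-0", "reduce-0"]))
rm.unregister_app("app-1")

print(rm.cluster_free_memory_mb(), len(rm.nodes), len(rm.apps))
```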

Yarn Example
● 1) Client -> Resource Manager
– Submit App Master
● 2) Resource Manager -> Node Manager
– Start App Master
● 3) Application Master -> Resource Manager
– Request and release containers
● 4) Application Master -> Node Manager
– Start tasks in containers
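
The four steps can be walked through with a toy simulation. This is a sketch under stated assumptions, not the Yarn client API: the classes, the method names, the node names, and the "capacity" unit (a plain count of containers per node) are all hypothetical. Step 4 is modelled as the Application Master handing each granted container to a Node Manager, which is how container launch in Yarn is generally described.

```python
class NodeManager:
    """Runs containers on one node."""

    def __init__(self, host, capacity):
        self.host = host
        self.capacity = capacity  # hypothetical unit: number of containers
        self.running = []

    def start_container(self, payload):
        self.running.append(payload)

    def stop_container(self, payload):
        self.running.remove(payload)


class ResourceManager:
    """Grants and reclaims containers across the cluster."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.granted = {node.host: 0 for node in nodes}

    def request_container(self):
        # Grant a container on the first node that still has capacity.
        for node in self.nodes:
            if self.granted[node.host] < node.capacity:
                self.granted[node.host] += 1
                return node
        raise RuntimeError("no free containers in the cluster")

    def release_container(self, node):
        self.granted[node.host] -= 1

    def submit(self, app_name, tasks):
        # Step 1: the client submits the application.
        # Step 2: the App Master is started in a container on some node.
        node = self.request_container()
        payload = f"{app_name}:app-master"
        node.start_container(payload)
        return ApplicationMaster(app_name, tasks, self, node, payload)


class ApplicationMaster:
    """Short lived; requests containers and runs the application's tasks."""

    def __init__(self, app_name, tasks, rm, node, payload):
        self.app_name = app_name
        self.tasks = tasks
        self.rm = rm
        self.held = [(node, payload)]  # containers to give back at the end

    def run(self):
        placements = []
        for task in self.tasks:
            node = self.rm.request_container()    # Step 3: request a container
            payload = f"{self.app_name}:{task}"
            node.start_container(payload)         # Step 4: start the task in it
            self.held.append((node, payload))
            placements.append((task, node.host))
        return placements

    def finish(self):
        # Step 3, second half: release every container, including our own.
        for node, payload in self.held:
            node.stop_container(payload)
            self.rm.release_container(node)
        self.held = []


nodes = [NodeManager("node1", 2), NodeManager("node2", 2)]
rm = ResourceManager(nodes)
am = rm.submit("wordcount", ["map-0", "map-1", "reduce-0"])
placements = am.run()
print(placements)
am.finish()
print(rm.granted)
```

Once finish() has run, the cluster holds no containers for the application, which matches the short life of the Application Master.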

Additions
● Consider Weave
– Simplifies the use of Yarn
– Reduced development effort
– Simplified API

Contact Us
● Feel free to contact us at
– www.semtech-solutions.co.nz
– info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You pay only for the hours you need to solve your problems
