© Hortonworks Inc. 2013
YARN and Ambari
YARN service management using Ambari
Srimanth Gunturi
September 25th, 2013
Agenda
• YARN Overview
• Installing
• Monitoring
• Configuration
• Capacity Scheduler
• MapReduce2
• Future
YARN Overview
• Yet Another Resource Negotiator (YARN)
• Purpose - Negotiating resources (Memory, CPU, Disk, etc.) on a cluster
• What was wrong with original MapReduce?
– Cluster not open to non-MapReduce paradigms
– Inefficient resource utilization (Map slots, Reduce slots)
– Upgrade rigidity
• MapReduce (Hadoop 1.0) -> YARN + MapReduce 2 (Hadoop 2.0)
YARN Overview - Applications
• MapReduce1 applications are fully compatible with MapReduce2
– Same JARs can be used
– Binary compatibility with org.apache.hadoop.mapred API
– Source compatibility with org.apache.hadoop.mapreduce API
YARN Overview - Architecture
• ResourceManager – central daemon that arbitrates resources across the cluster
• NodeManagers – per-host agents that launch and monitor containers
• Containers – bundles of resources (e.g. memory) allocated to an application on a host
• Applications (ApplicationMasters) – per-application processes that negotiate containers from the ResourceManager
Installing
Monitoring
Monitoring – NodeManager Summary
NodeManager Status
• Active – In communication with the RM
• Lost – Not communicating with the RM
• Unhealthy – Flagged by the custom health check script identified by the property yarn.nodemanager.health-checker.script.path
• Rebooted – Automatically restarted due to internal problems
• Decommissioned – The RM ignores communication from the host; the host is listed in the file referenced by yarn.resourcemanager.nodes.exclude-path (a yarn-site.xml sketch follows this list)
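A minimal yarn-site.xml sketch wiring up both properties named above; the script and file paths are illustrative assumptions, not shipped defaults:

  <!-- yarn-site.xml (sketch; paths are illustrative) -->
  <property>
    <!-- A node is marked Unhealthy if this script prints a line starting with ERROR -->
    <name>yarn.nodemanager.health-checker.script.path</name>
    <value>/etc/hadoop/conf/health_check.sh</value>
  </property>
  <property>
    <!-- Hosts listed in this file are Decommissioned by the ResourceManager -->
    <name>yarn.resourcemanager.nodes.exclude-path</name>
    <value>/etc/hadoop/conf/yarn.exclude</value>
  </property>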
Monitoring – Container Summary
Containers
• Allocated – Containers that have been created with the requested resources
• Pending – Container requests still waiting for resources to become available
• Reserved – Containers for which the scheduler has set aside partial resources on a node until the remainder frees up
Examples (10 GB cluster)
• Request three 5 GB containers: 2 allocated, 1 pending
• Request three 4 GB containers: 2 allocated, 1 reserved (the remaining 2 GB are set aside)
Monitoring – Applications Summary
Applications
• Submitted – Application requests made to YARN
• Running – Applications whose ApplicationMasters have been created and are running
• Pending – Application requests awaiting creation
• Completed – Applications that have finished running; they may have succeeded, been killed, or failed
• Killed – Applications terminated by the user
• Failed – Applications that failed to run due to internal failures
Monitoring – Memory Summary
Cluster Memory
• Used – Memory currently in use across the cluster
• Reserved – Memory set aside for pending allocations (see Reserved containers above)
• Total – Memory available across the entire cluster
Monitoring – Alerts
Service Alerts
- ResourceManager health
- % NodeManagers alive
Host Alerts
- NodeManager health
- NodeManager process check
Monitoring – Graphs
Configuration
Configuration – Capacity Scheduler
Queues
Root
  o A
  o B
  o C
    o C1
    o C2
  o default
• Hierarchical Queues
• Capacity Guarantees
  – capacity (%)
  – maximum-am-resource-percent (%)
• Elasticity
  – maximum-capacity (%)
• Access Control (a capacity-scheduler.xml sketch follows this list)
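Below is a minimal capacity-scheduler.xml sketch for the queue tree above. The queue names come from the slide; the percentages, AM fraction, and user names are illustrative assumptions, and sibling queue capacities must sum to 100, so B, C, and default would need matching entries:

  <!-- capacity-scheduler.xml (sketch; values are illustrative) -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>A,B,C,default</value>
  </property>
  <property>
    <!-- Hierarchical queues: C1 and C2 nest under C -->
    <name>yarn.scheduler.capacity.root.C.queues</name>
    <value>C1,C2</value>
  </property>
  <property>
    <!-- Capacity guarantee: A's share of its parent's capacity, in percent -->
    <name>yarn.scheduler.capacity.root.A.capacity</name>
    <value>30</value>
  </property>
  <property>
    <!-- Elasticity: A may grow up to 50% of the cluster when other queues are idle -->
    <name>yarn.scheduler.capacity.root.A.maximum-capacity</name>
    <value>50</value>
  </property>
  <property>
    <!-- Fraction of capacity usable by ApplicationMasters; bounds concurrent apps -->
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.2</value>
  </property>
  <property>
    <!-- Access control: who may submit to A (hypothetical users) -->
    <name>yarn.scheduler.capacity.root.A.acl_submit_applications</name>
    <value>alice,bob</value>
  </property>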
MapReduce2
YARN-321: Generic application history service
Future
• Support more YARN applications
• Improve per application-type information
• Improve Capacity Scheduler configuration
• Better health checks


Editor's Notes

  • #4 References: http://hortonworks.com/blog/apache-hadoop-yarn-background-and-an-overview/
  • #5 http://hortonworks.com/hadoop/yarn/ and http://hortonworks.com/blog/running-existing-applications-on-hadoop-2-yarn/
  • #7 - MapReduce2 becomes bound to YARN. - MR2 is currently the only application exposed in Ambari
  • #15 Properties grouped by type. Properties end up in core-site.xml, yarn-site.xml, or capacity-scheduler.xml under /etc/hadoop/conf/.
  • #16 Maximum-am-resource-percent. % of queue capacity allocated to AppMasters. Determines number of concurrent applications.