Intro to YARN Apps
Sandy Ryza
Introduction

•  What’s YARN?
•  YARN apps
•  Building YARN apps
The OS analogy
Traditional Operating System

Storage:
File System

Execution/Scheduling:
Processes/Kernel
Scheduler
The OS analogy
Hadoop

Storage:
Hadoop Distributed
File System (HDFS)

Execution/Scheduling:
YARN!
Goal: Multitenancy

•  Different types of applications on the same cluster
•  Different users and organizations on the same cluster
ResourceManager (RM)

•  Central service that tracks
o  Nodes
§  Resources
o  Applications
o  Containers
•  Houses the scheduler, which is in charge of all container placement decisions
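
The bookkeeping on this slide can be sketched in plain Java with no Hadoop dependency. The class `ToyResourceManager`, its method names, and the first-fit placement rule are all assumptions made for illustration; the real schedulers apply much richer policies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of the ResourceManager's bookkeeping; this is not the YARN API. */
final class ToyResourceManager {
    // Free memory (MB) per node, kept in registration order.
    private final Map<String, Integer> freeMemoryMb = new LinkedHashMap<>();
    private int nextContainerId = 1;

    /** A node joins the cluster and reports the memory it offers. */
    void registerNode(String node, int memoryMb) {
        freeMemoryMb.put(node, memoryMb);
    }

    /**
     * Places one container using first-fit (an assumed policy).
     * Returns "container_<id>@<node>", or null when no node has room.
     */
    String allocate(int memoryMb) {
        for (Map.Entry<String, Integer> e : freeMemoryMb.entrySet()) {
            if (e.getValue() >= memoryMb) {
                e.setValue(e.getValue() - memoryMb);
                return "container_" + (nextContainerId++) + "@" + e.getKey();
            }
        }
        return null;
    }

    /** Memory still unallocated on a registered node. */
    int freeMemory(String node) {
        return freeMemoryMb.get(node);
    }

    public static void main(String[] args) {
        ToyResourceManager rm = new ToyResourceManager();
        rm.registerNode("node1", 2048);
        rm.registerNode("node2", 1024);
        for (int i = 0; i < 4; i++) {
            // The fourth request finds the toy cluster full and prints null.
            System.out.println(rm.allocate(1024));
        }
    }
}
```

Keeping every placement decision in one central object is what lets a single scheduler weigh all applications against each other.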
NodeManager (NM)

•  One on every node
•  Launches container processes
•  Enforces resource allocations
•  Monitors liveliness
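
As a rough illustration of "enforces resource allocations", the following plain-Java sketch kills any container whose reported memory use exceeds what it was granted. `ToyNodeManager`, `launch`, and `enforce` are hypothetical names, and the usage map stands in for whatever process monitoring a real node would do.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy per-node agent that enforces memory limits; this is not the YARN API. */
final class ToyNodeManager {
    // Memory (MB) granted to each running container.
    private final Map<String, Integer> limitMb = new LinkedHashMap<>();

    /** Records a newly launched container and its granted memory. */
    void launch(String containerId, int allocatedMb) {
        limitMb.put(containerId, allocatedMb);
    }

    /**
     * One monitoring pass. usedMb maps container id to observed memory use.
     * Containers above their grant are removed; their ids are returned.
     */
    List<String> enforce(Map<String, Integer> usedMb) {
        List<String> killed = new ArrayList<>();
        for (Map.Entry<String, Integer> e : limitMb.entrySet()) {
            Integer used = usedMb.get(e.getKey());
            if (used != null && used > e.getValue()) {
                killed.add(e.getKey());
            }
        }
        limitMb.keySet().removeAll(killed);
        return killed;
    }

    boolean isRunning(String containerId) {
        return limitMb.containsKey(containerId);
    }
}
```

A caller would invoke `enforce` periodically with fresh measurements; containers that stay within their grant are left untouched.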
Application Master (AM)

•  User/application code
•  Every application instance has one
•  Runs inside a container on the cluster
•  Requests resources from ResourceManager
YARN

[Diagram: a Client, the ResourceManager, and the JobHistory Server, plus two NodeManagers whose containers run the Application Master, a Map Task, and a Reduce Task]
Processing Frameworks / YARN apps

•  MapReduce
o  Batch processing, fault tolerant
•  Impala
o  Low latency SQL on Hadoop
•  Spark
o  Load data into memory, great for iterative algorithms
•  Storm
o  Stream processing
YARN app models

•  Application master (AM) per job
•  Simplest for batch
•  Used by MapReduce
YARN app models

•  Application master per session
•  Runs multiple jobs on behalf of the same user
•  Recently added in Tez
•  Spark interactive mode
YARN app models

•  Singleton AM as permanent service
•  Always on, waits around for jobs to come in
•  Used for Impala
YARN/MR Scheduling

ResourceManager
Fair Scheduler
Decides which jobs to give resources to

MapReduce Application Master
Decides which tasks to give resources to within a job
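
The two levels of decision making can be sketched in plain Java. Everything here is an assumption for illustration: `pickJob` uses "fewest running containers first" as a stand-in for fair sharing, and `ToyAppMaster.assign` hands out its tasks in FIFO order. Neither is the actual Fair Scheduler or MapReduce logic.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy two-level scheduling: the cluster level picks a job, the job picks a task. */
final class TwoLevelSchedulingSketch {

    /** Job-level scheduler; stands in for an Application Master. */
    static final class ToyAppMaster {
        final String job;
        final Deque<String> pendingTasks = new ArrayDeque<>();
        int runningContainers = 0;

        ToyAppMaster(String job, String... tasks) {
            this.job = job;
            for (String t : tasks) {
                pendingTasks.add(t);
            }
        }

        /** Decides which of its own tasks receives the container (FIFO here). */
        String assign() {
            runningContainers++;
            return job + ":" + pendingTasks.poll();
        }
    }

    /** Cluster-level choice: the job holding the fewest containers goes first. */
    static ToyAppMaster pickJob(List<ToyAppMaster> apps) {
        ToyAppMaster best = null;
        for (ToyAppMaster am : apps) {
            if (am.pendingTasks.isEmpty()) {
                continue; // this job has nothing left to run
            }
            if (best == null || am.runningContainers < best.runningContainers) {
                best = am;
            }
        }
        return best;
    }

    /** Hands out containers one at a time and records which task got each. */
    static List<String> schedule(List<ToyAppMaster> apps, int containers) {
        List<String> placements = new ArrayList<>();
        for (int i = 0; i < containers; i++) {
            ToyAppMaster am = pickJob(apps);
            if (am == null) {
                break; // no job wants resources
            }
            placements.add(am.assign());
        }
        return placements;
    }

    public static void main(String[] args) {
        List<ToyAppMaster> apps = new ArrayList<>();
        apps.add(new ToyAppMaster("jobA", "map0", "map1", "reduce0"));
        apps.add(new ToyAppMaster("jobB", "map0"));
        // prints [jobA:map0, jobB:map0, jobA:map1, jobA:reduce0]
        System.out.println(schedule(apps, 4));
    }
}
```

The split keeps the cluster-level scheduler ignorant of task semantics: it only sees jobs competing for containers.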
Scheduling on Hadoop

[Diagram: Application Master 1, Application Master 2, ResourceManager, Node 1, Node 2, Node 3]
Scheduling on Hadoop

"I want 2 containers with 1024 MB and 1 core each"

[Diagram: Application Master 1, Application Master 2, ResourceManager, Node 1, Node 2, Node 3]
Scheduling on Hadoop

"Noted"

[Diagram: Application Master 1, Application Master 2, ResourceManager, Node 1, Node 2, Node 3]
Scheduling on Hadoop

"I’m still here"

[Diagram: Application Master 1, Application Master 2, ResourceManager, Node 1, Node 2, Node 3]
Scheduling on Hadoop

"I’ll reserve some space on node1 for AM1"

[Diagram: Application Master 1, Application Master 2, ResourceManager, Node 1, Node 2, Node 3]
Scheduling on Hadoop

"Got anything for me?"

[Diagram: Application Master 1, Application Master 2, ResourceManager, Node 1, Node 2, Node 3]
Scheduling on Hadoop

"Here’s a security token to let you launch a container on Node 1"

[Diagram: Application Master 1, Application Master 2, ResourceManager, Node 1, Node 2, Node 3]
Scheduling on Hadoop

"Hey, launch my container with this shell command"

[Diagram: Application Master 1, Application Master 2, ResourceManager, Node 1, Node 2, Node 3]
Scheduling on Hadoop

[Diagram: Application Master 1, Application Master 2, ResourceManager, Node 1, Node 2, Node 3; a Container now appears on one of the nodes]
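
The exchange walked through on the preceding slides can be replayed with a small plain-Java simulation. All names here (`ToyRm`, `request`, `nodeHeartbeat`, `amHeartbeat`, `launch`, and the `token:<am>@<node>` string format) are hypothetical, and "free slots" is a simplification of real memory and core accounting.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy replay of the request / heartbeat / allocate / launch exchange. */
final class SchedulingWalkthrough {

    static final class ToyRm {
        // Outstanding container asks per application master, in arrival order.
        private final Map<String, Integer> pendingByAm = new LinkedHashMap<>();
        // Tokens granted but not yet collected by the application master.
        private final Map<String, Deque<String>> grantedByAm = new HashMap<>();

        /** "I want N containers": the RM only records the ask ("Noted"). */
        void request(String am, int containers) {
            pendingByAm.merge(am, containers, Integer::sum);
        }

        /** "I'm still here": a node heartbeat triggers placement on that node. */
        void nodeHeartbeat(String node, int freeSlots) {
            for (Map.Entry<String, Integer> e : pendingByAm.entrySet()) {
                while (freeSlots > 0 && e.getValue() > 0) {
                    grantedByAm.computeIfAbsent(e.getKey(), k -> new ArrayDeque<>())
                               .add("token:" + e.getKey() + "@" + node);
                    e.setValue(e.getValue() - 1);
                    freeSlots--;
                }
            }
        }

        /** "Got anything for me?": the AM collects tokens on its own heartbeat. */
        List<String> amHeartbeat(String am) {
            List<String> tokens = new ArrayList<>();
            Deque<String> granted = grantedByAm.get(am);
            while (granted != null && !granted.isEmpty()) {
                tokens.add(granted.poll());
            }
            return tokens;
        }
    }

    /** "Launch my container with this shell command", addressed to the token's node. */
    static String launch(String token, String command) {
        String node = token.substring(token.indexOf('@') + 1);
        return node + " runs [" + command + "]";
    }

    public static void main(String[] args) {
        ToyRm rm = new ToyRm();
        rm.request("AM1", 2);
        System.out.println(rm.amHeartbeat("AM1")); // prints [] because no node has reported yet
        rm.nodeHeartbeat("node1", 1);
        for (String token : rm.amHeartbeat("AM1")) {
            System.out.println(launch(token, "sleep 1000")); // prints node1 runs [sleep 1000]
        }
    }
}
```

Note that nothing is pushed to the application master: it learns about allocations only when it asks, which is why the request and the grant arrive in separate round trips.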
Should you build a YARN app?

•  MapReduce can’t run arbitrary DAGs?
o  Use Spark
Should you build a YARN app?

•  MapReduce can’t store data in memory?
o  Use Spark
Should you build a YARN app?

•  Iterative processing?
o  Use Spark
Should you build a YARN app?

•  Have an existing distributed app that runs all
tasks at once?
o  Use distributed shell
When to build a YARN app

•  Allocating and releasing containers dynamically
•  Weird scheduling requirements
o  Gang
o  Complex locality
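
One way to picture the "gang" requirement is an application master that holds containers until the whole group has arrived and only then starts any task. The sketch below is plain Java with hypothetical names (`GangSchedulingSketch`, `onContainersAllocated` returning the containers to start); it is an illustration of the idea, not something YARN provides.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy gang scheduling: tasks start only once every needed container has arrived. */
final class GangSchedulingSketch {
    private final int gangSize;
    private final List<String> held = new ArrayList<>();
    private boolean launched = false;

    GangSchedulingSketch(int gangSize) {
        this.gangSize = gangSize;
    }

    /**
     * Called with each batch of newly allocated containers.
     * Returns the containers to start now: empty until the gang is complete,
     * then exactly gangSize containers, and empty again afterwards.
     */
    List<String> onContainersAllocated(List<String> containers) {
        if (launched) {
            return new ArrayList<>(); // gang already running; extras could be released
        }
        held.addAll(containers);
        if (held.size() < gangSize) {
            return new ArrayList<>(); // keep waiting and start nothing yet
        }
        launched = true;
        return new ArrayList<>(held.subList(0, gangSize));
    }

    public static void main(String[] args) {
        GangSchedulingSketch gang = new GangSchedulingSketch(3);
        List<String> first = new ArrayList<>();
        first.add("c1");
        System.out.println(gang.onContainersAllocated(first)); // prints []
        List<String> second = new ArrayList<>();
        second.add("c2");
        second.add("c3");
        System.out.println(gang.onContainersAllocated(second)); // prints [c1, c2, c3]
    }
}
```

The cost of this approach is that held containers sit idle while the gang fills up, so a real application would likely bound how long it waits.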
What YARN does for you

•  Deploys your bits
•  Runs your processes
•  Monitors your processes
•  Kills your processes when they misbehave
What YARN does not do for you

•  Communication between your processes
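AMRMClientAsync
// Callbacks fire as the ResourceManager answers the client's heartbeats.
AMRMClientAsync.CallbackHandler handler = new AMRMClientAsync.CallbackHandler() {
    @Override
    public void onContainersAllocated(List<Container> containers) {
        for (Container container : containers) {
            startTask(container);
        }
    }
    // [... more methods]
};
// Heartbeat to the ResourceManager every 1000 ms.
AMRMClientAsync<ContainerRequest> amClient =
    AMRMClientAsync.createAMRMClientAsync(1000, handler);
// Assumes a Hadoop Configuration named conf; the client must be started before use.
amClient.init(conf);
amClient.start();
amClient.registerApplicationMaster(NetUtils.getHostname(), -1, "");
// Ask for a container with 1024 MB and 1 core, preferring node1, node2, or rack1.
amClient.addContainerRequest(
    new ContainerRequest(
        Resource.newInstance(1024, 1),
        new String[] {"node1", "node2"}, new String[] {"rack1"},
        Priority.newInstance(2)));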
NMClientAsync
NMClientAsync.CallbackHandler nmHandler = new NMClientAsync.CallbackHandler() {
    // [... listen for containers stopped and started]
};
NMClientAsync nmClient = NMClientAsync.createNMClientAsync(nmHandler);
Launching Containers
public void startContainer(Container container) {
    // The launch context describes everything the node needs to start the process.
    ContainerLaunchContext launchContext =
        ContainerLaunchContext.newInstance(
            localResources,
            environment,
            Arrays.asList("sleep 1000"),
            serviceData,
            tokens,
            acls);
    nmClient.startContainerAsync(container, launchContext);
}
Local resources

[Diagram: two nodes, each running containers, with copies of file.txt from HDFS placed alongside the containers]
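
The idea of local resources, files copied from HDFS onto a node before a container runs there, can be sketched in plain Java. The per-node cache shown here is an assumption about how repeated copies might be avoided; `LocalResourceSketch`, `localize`, and `downloads` are hypothetical names, and incrementing a counter stands in for the actual file transfer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy localization: a file is fetched at most once per node, then reused. */
final class LocalResourceSketch {
    // Files already present on each node.
    private final Map<String, Set<String>> cacheByNode = new HashMap<>();
    private int downloads = 0;

    /**
     * Makes a file available on a node before a container starts there.
     * Returns true if the file had to be fetched, false if the node already had it.
     */
    boolean localize(String node, String file) {
        Set<String> cache = cacheByNode.computeIfAbsent(node, k -> new HashSet<>());
        if (cache.add(file)) {
            downloads++; // stands in for copying the file from HDFS
            return true;
        }
        return false;
    }

    /** Number of simulated fetches from HDFS so far. */
    int downloads() {
        return downloads;
    }

    public static void main(String[] args) {
        LocalResourceSketch localizer = new LocalResourceSketch();
        localizer.localize("node1", "file.txt"); // first container on node1: fetched
        localizer.localize("node1", "file.txt"); // second container on node1: reused
        localizer.localize("node2", "file.txt"); // first container on node2: fetched
        System.out.println(localizer.downloads()); // prints 2
    }
}
```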

More Related Content

What's hot

Introduction to YARN and MapReduce 2
Introduction to YARN and MapReduce 2Introduction to YARN and MapReduce 2
Introduction to YARN and MapReduce 2Cloudera, Inc.
 
Yarns about YARN: Migrating to MapReduce v2
Yarns about YARN: Migrating to MapReduce v2Yarns about YARN: Migrating to MapReduce v2
Yarns about YARN: Migrating to MapReduce v2DataWorks Summit
 
Apache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and FutureApache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and FutureDataWorks Summit
 
Hadoop 2.0, MRv2 and YARN - Module 9
Hadoop 2.0, MRv2 and YARN - Module 9Hadoop 2.0, MRv2 and YARN - Module 9
Hadoop 2.0, MRv2 and YARN - Module 9Rohit Agrawal
 
Apache Hadoop YARN
Apache Hadoop YARNApache Hadoop YARN
Apache Hadoop YARNAdam Kawa
 
Writing Yarn Applications Hadoop Summit 2012
Writing Yarn Applications Hadoop Summit 2012Writing Yarn Applications Hadoop Summit 2012
Writing Yarn Applications Hadoop Summit 2012Hortonworks
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopHortonworks
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesDataWorks Summit
 
Spark-on-YARN: Empower Spark Applications on Hadoop Cluster
Spark-on-YARN: Empower Spark Applications on Hadoop ClusterSpark-on-YARN: Empower Spark Applications on Hadoop Cluster
Spark-on-YARN: Empower Spark Applications on Hadoop ClusterDataWorks Summit
 
NextGen Apache Hadoop MapReduce
NextGen Apache Hadoop MapReduceNextGen Apache Hadoop MapReduce
NextGen Apache Hadoop MapReduceHortonworks
 
YARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupYARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupRommel Garcia
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Emilio Coppa
 
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and Future
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and FutureHadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and Future
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and FutureVinod Kumar Vavilapalli
 
Hadoop 2 - Going beyond MapReduce
Hadoop 2 - Going beyond MapReduceHadoop 2 - Going beyond MapReduce
Hadoop 2 - Going beyond MapReduceUwe Printz
 
Towards SLA-based Scheduling on YARN Clusters
Towards SLA-based Scheduling on YARN ClustersTowards SLA-based Scheduling on YARN Clusters
Towards SLA-based Scheduling on YARN ClustersDataWorks Summit
 

What's hot (20)

Introduction to YARN and MapReduce 2
Introduction to YARN and MapReduce 2Introduction to YARN and MapReduce 2
Introduction to YARN and MapReduce 2
 
Yarns about YARN: Migrating to MapReduce v2
Yarns about YARN: Migrating to MapReduce v2Yarns about YARN: Migrating to MapReduce v2
Yarns about YARN: Migrating to MapReduce v2
 
Apache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and FutureApache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and Future
 
Hadoop 2.0, MRv2 and YARN - Module 9
Hadoop 2.0, MRv2 and YARN - Module 9Hadoop 2.0, MRv2 and YARN - Module 9
Hadoop 2.0, MRv2 and YARN - Module 9
 
Hadoop YARN
Hadoop YARN Hadoop YARN
Hadoop YARN
 
Apache Hadoop YARN
Apache Hadoop YARNApache Hadoop YARN
Apache Hadoop YARN
 
Writing Yarn Applications Hadoop Summit 2012
Writing Yarn Applications Hadoop Summit 2012Writing Yarn Applications Hadoop Summit 2012
Writing Yarn Applications Hadoop Summit 2012
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practices
 
Spark-on-YARN: Empower Spark Applications on Hadoop Cluster
Spark-on-YARN: Empower Spark Applications on Hadoop ClusterSpark-on-YARN: Empower Spark Applications on Hadoop Cluster
Spark-on-YARN: Empower Spark Applications on Hadoop Cluster
 
NextGen Apache Hadoop MapReduce
NextGen Apache Hadoop MapReduceNextGen Apache Hadoop MapReduce
NextGen Apache Hadoop MapReduce
 
YARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupYARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User Group
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)
 
Hadoop YARN overview
Hadoop YARN overviewHadoop YARN overview
Hadoop YARN overview
 
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and Future
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and FutureHadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and Future
Hadoop Summit Europe Talk 2014: Apache Hadoop YARN: Present and Future
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
Hadoop 2 - Going beyond MapReduce
Hadoop 2 - Going beyond MapReduceHadoop 2 - Going beyond MapReduce
Hadoop 2 - Going beyond MapReduce
 
Hadoop scheduler
Hadoop schedulerHadoop scheduler
Hadoop scheduler
 
Towards SLA-based Scheduling on YARN Clusters
Towards SLA-based Scheduling on YARN ClustersTowards SLA-based Scheduling on YARN Clusters
Towards SLA-based Scheduling on YARN Clusters
 

Viewers also liked

Building Applications on YARN
Building Applications on YARNBuilding Applications on YARN
Building Applications on YARNChris Riccomini
 
Hw09 Cloudera Desktop In Detail
Hw09   Cloudera Desktop In DetailHw09   Cloudera Desktop In Detail
Hw09 Cloudera Desktop In DetailCloudera, Inc.
 
The Future of Data
The Future of DataThe Future of Data
The Future of Datablynnbuckley
 
Spark tuning2016may11bida
Spark tuning2016may11bidaSpark tuning2016may11bida
Spark tuning2016may11bidaAnya Bida
 
Cloudera introduction
Cloudera introductionCloudera introduction
Cloudera introductionPhate334
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerCloudera, Inc.
 
A beginners guide to Cloudera Hadoop
A beginners guide to Cloudera HadoopA beginners guide to Cloudera Hadoop
A beginners guide to Cloudera HadoopDavid Yahalom
 
Hadoop administration using cloudera student lab guidebook
Hadoop administration using cloudera   student lab guidebookHadoop administration using cloudera   student lab guidebook
Hadoop administration using cloudera student lab guidebookNiranjan Pandey
 
Hadoop & Cloudera Workshop
Hadoop & Cloudera WorkshopHadoop & Cloudera Workshop
Hadoop & Cloudera WorkshopSerkan Sakınmaz
 
Building a REST Job Server for Interactive Spark as a Service
Building a REST Job Server for Interactive Spark as a ServiceBuilding a REST Job Server for Interactive Spark as a Service
Building a REST Job Server for Interactive Spark as a ServiceCloudera, Inc.
 
Get Started Building YARN Applications
Get Started Building YARN ApplicationsGet Started Building YARN Applications
Get Started Building YARN ApplicationsHortonworks
 
Data Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache HadoopData Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache HadoopCloudera, Inc.
 
Hadoop Workshop using Cloudera on Amazon EC2
Hadoop Workshop using Cloudera on Amazon EC2Hadoop Workshop using Cloudera on Amazon EC2
Hadoop Workshop using Cloudera on Amazon EC2IMC Institute
 

Viewers also liked (17)

Building Applications on YARN
Building Applications on YARNBuilding Applications on YARN
Building Applications on YARN
 
Hw09 Cloudera Desktop In Detail
Hw09   Cloudera Desktop In DetailHw09   Cloudera Desktop In Detail
Hw09 Cloudera Desktop In Detail
 
The Future of Data
The Future of DataThe Future of Data
The Future of Data
 
451 Research Impact Report
451 Research Impact Report451 Research Impact Report
451 Research Impact Report
 
Spark tuning2016may11bida
Spark tuning2016may11bidaSpark tuning2016may11bida
Spark tuning2016may11bida
 
Cloudera introduction
Cloudera introductionCloudera introduction
Cloudera introduction
 
Yarns About Yarn
Yarns About YarnYarns About Yarn
Yarns About Yarn
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator Optimizer
 
A beginners guide to Cloudera Hadoop
A beginners guide to Cloudera HadoopA beginners guide to Cloudera Hadoop
A beginners guide to Cloudera Hadoop
 
Hadoop administration using cloudera student lab guidebook
Hadoop administration using cloudera   student lab guidebookHadoop administration using cloudera   student lab guidebook
Hadoop administration using cloudera student lab guidebook
 
Hadoop & Cloudera Workshop
Hadoop & Cloudera WorkshopHadoop & Cloudera Workshop
Hadoop & Cloudera Workshop
 
Cloudera Desktop
Cloudera DesktopCloudera Desktop
Cloudera Desktop
 
Building a REST Job Server for Interactive Spark as a Service
Building a REST Job Server for Interactive Spark as a ServiceBuilding a REST Job Server for Interactive Spark as a Service
Building a REST Job Server for Interactive Spark as a Service
 
Get Started Building YARN Applications
Get Started Building YARN ApplicationsGet Started Building YARN Applications
Get Started Building YARN Applications
 
Hadoop I/O Analysis
Hadoop I/O AnalysisHadoop I/O Analysis
Hadoop I/O Analysis
 
Data Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache HadoopData Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache Hadoop
 
Hadoop Workshop using Cloudera on Amazon EC2
Hadoop Workshop using Cloudera on Amazon EC2Hadoop Workshop using Cloudera on Amazon EC2
Hadoop Workshop using Cloudera on Amazon EC2
 

Similar to Introduction to YARN Apps

[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史Insight Technology, Inc.
 
Hadoop Map-Reduce from the subject: Big Data Analytics
Hadoop Map-Reduce from the subject: Big Data AnalyticsHadoop Map-Reduce from the subject: Big Data Analytics
Hadoop Map-Reduce from the subject: Big Data AnalyticsRUHULAMINHAZARIKA
 
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...Zhijie Shen
 
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele Hakka Labs
 
Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Tsuyoshi OZAWA
 
Hadoop 2.0 yarn arch training
Hadoop 2.0 yarn arch trainingHadoop 2.0 yarn arch training
Hadoop 2.0 yarn arch trainingNandan Kumar
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks
 
Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014Tsuyoshi OZAWA
 
YARN - way to share cluster BEYOND HADOOP
YARN - way to share cluster BEYOND HADOOPYARN - way to share cluster BEYOND HADOOP
YARN - way to share cluster BEYOND HADOOPOmkar Joshi
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop EcosystemLior Sidi
 
A Secure Public Cache For YARN Application Resources
A Secure Public Cache For YARN Application ResourcesA Secure Public Cache For YARN Application Resources
A Secure Public Cache For YARN Application Resourcesctrezzo
 
Hadoop 2 - Beyond MapReduce
Hadoop 2 - Beyond MapReduceHadoop 2 - Beyond MapReduce
Hadoop 2 - Beyond MapReduceUwe Printz
 
Hadoop Summit 2015: A Secure Public Cache For YARN Application Resources
Hadoop Summit 2015: A Secure Public Cache For YARN Application ResourcesHadoop Summit 2015: A Secure Public Cache For YARN Application Resources
Hadoop Summit 2015: A Secure Public Cache For YARN Application Resourcesctrezzo
 
a Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resourcesa Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application ResourcesDataWorks Summit
 

Similar to Introduction to YARN Apps (20)

[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
 
Hadoop Map-Reduce from the subject: Big Data Analytics
Hadoop Map-Reduce from the subject: Big Data AnalyticsHadoop Map-Reduce from the subject: Big Data Analytics
Hadoop Map-Reduce from the subject: Big Data Analytics
 
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distri...
 
YARN
YARNYARN
YARN
 
Yarnthug2014
Yarnthug2014Yarnthug2014
Yarnthug2014
 
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele
Developing Applications with Hadoop 2.0 and YARN by Abhijit Lele
 
Yarn
YarnYarn
Yarn
 
Hadoop fault-tolerance
Hadoop fault-toleranceHadoop fault-tolerance
Hadoop fault-tolerance
 
Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014
 
Hadoop 2.0 yarn arch training
Hadoop 2.0 yarn arch trainingHadoop 2.0 yarn arch training
Hadoop 2.0 yarn arch training
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014
 
Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014
 
YARN - way to share cluster BEYOND HADOOP
YARN - way to share cluster BEYOND HADOOPYARN - way to share cluster BEYOND HADOOP
YARN - way to share cluster BEYOND HADOOP
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop Ecosystem
 
Introduction to yarn
Introduction to yarnIntroduction to yarn
Introduction to yarn
 
A Secure Public Cache For YARN Application Resources
A Secure Public Cache For YARN Application ResourcesA Secure Public Cache For YARN Application Resources
A Secure Public Cache For YARN Application Resources
 
MHUG - YARN
MHUG - YARNMHUG - YARN
MHUG - YARN
 
Hadoop 2 - Beyond MapReduce
Hadoop 2 - Beyond MapReduceHadoop 2 - Beyond MapReduce
Hadoop 2 - Beyond MapReduce
 
Hadoop Summit 2015: A Secure Public Cache For YARN Application Resources
Hadoop Summit 2015: A Secure Public Cache For YARN Application ResourcesHadoop Summit 2015: A Secure Public Cache For YARN Application Resources
Hadoop Summit 2015: A Secure Public Cache For YARN Application Resources
 
a Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resourcesa Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resources
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Introduction to YARN Apps

  • 1. Intro to YARN Apps, Sandy Ryza
  • 2. Introduction • What’s YARN? • YARN apps • Building YARN apps
  • 3. The OS analogy: Traditional Operating System • Storage: File System • Execution/Scheduling: Processes/Kernel Scheduler
  • 4. The OS analogy: Hadoop • Storage: Hadoop Distributed File System (HDFS) • Execution/Scheduling: YARN!
  • 5. Goal: Multitenancy • Different types of applications on the same cluster • Different users and organizations on the same cluster
  • 6. ResourceManager (RM) • Central service that tracks o Nodes § Resources o Applications o Containers • Houses scheduler, which is in charge of all container placement decisions
  • 7. NodeManager (NM) • One on every node • Launches container processes • Enforces resource allocations • Monitors liveness
  • 8. Application Master (AM) • User/application code • Every application instance has one • Runs inside a container on the cluster • Requests resources from ResourceManager
  • 10. Processing Frameworks / YARN apps • MapReduce o Batch processing, fault tolerant • Impala o Low latency SQL on Hadoop • Spark o Load data into memory, great for iterative algorithms • Storm o Stream processing
  • 11. YARN app models • Application master (AM) per job o Simplest for batch o Used by MapReduce
  • 12. YARN app models • Application master per session o Runs multiple jobs on behalf of the same user o Recently added in Tez o Spark interactive mode
  • 13. YARN app models • Singleton AM as permanent service o Always on, waits around for jobs to come in o Used for Impala
  • 14. YARN/MR Scheduling • ResourceManager (Fair Scheduler): decides which jobs to give resources to • MapReduce Application Master: decides which tasks to give resources to within a job
  • 15. Scheduling on Hadoop [Diagram: Application Master 1, Application Master 2, ResourceManager, Node 1, Node 2, Node 3]
  • 16. Scheduling on Hadoop (same diagram) Application Master 1: “I want 2 containers with 1024 MB and 1 core each”
  • 17. Scheduling on Hadoop (same diagram) ResourceManager: “Noted”
  • 18. Scheduling on Hadoop (same diagram) Node: “I’m still here”
  • 19. Scheduling on Hadoop (same diagram) ResourceManager: “I’ll reserve some space on node1 for AM1”
  • 20. Scheduling on Hadoop (same diagram) Application Master 1: “Got anything for me?”
  • 21. Scheduling on Hadoop (same diagram) ResourceManager: “Here’s a security token to let you launch a container on Node 1”
  • 22. Scheduling on Hadoop (same diagram) Application Master 1: “Hey, launch my container with this shell command”
  • 23. Scheduling on Hadoop (same diagram) A Container now appears on a node
  • 24. Should you build a YARN app? • MapReduce can’t run arbitrary DAGs? o Use Spark
  • 25. Should you build a YARN app? • MapReduce can’t store data in memory? o Use Spark
  • 26. Should you build a YARN app? • Iterative processing? o Use Spark
  • 27. Should you build a YARN app? • Have an existing distributed app that runs all tasks at once? o Use distributed shell
  • 28. When to build a YARN app • Allocating and releasing containers dynamically • Weird scheduling requirements o Gang o Complex locality
  • 29. What YARN does for you • Deploys your bits • Runs your processes • Monitors your processes • Kills your processes when they misbehave
  • 30. What YARN does not do for you • Communication between your processes
  • 31. AMRMClientAsync
      // Callbacks fire as the ResourceManager answers the client's heartbeats.
      CallbackHandler handler = new CallbackHandler() {
        public void onContainersAllocated(List<Container> containers) {
          for (Container container : containers) {
            startTask(container);
          }
        }
        [... more methods]
      };
      // Heartbeat to the ResourceManager every 1000 ms.
      AMRMClientAsync amClient = AMRMClientAsync.createAMRMClientAsync(1000, handler);
      amClient.registerApplicationMaster(NetUtils.getHostname(), -1, "");
      // Ask for one container with 1024 MB and 1 core, naming preferred nodes and racks.
      amClient.addContainerRequest(
          new ContainerRequest(
              Resource.newInstance(1024, 1),
              new String[] {"node1", "node2"},
              new String[] {"rack1"},
              Priority.newInstance(2)));
  • 32. NMClientAsync
      CallbackHandler nmHandler = new CallbackHandler() {
        [... listen for containers stopped and started]
      };
      NMClientAsync nmClient = NMClientAsync.createNMClientAsync(nmHandler);
  • 33. Launching Containers
      public void startContainer(Container container) {
        // The command list is what the NodeManager runs inside the container.
        ContainerLaunchContext launchContext = ContainerLaunchContext.newInstance(
            localResources, environment, Arrays.asList("sleep 1000"),
            serviceData, tokens, acls);
        nmClient.startContainerAsync(container, launchContext);
      }
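
The request/heartbeat/allocate exchange on slides 15-23 can be sketched as a small plain-Java model. This is only an illustration of the message order, not the Hadoop API: `ToyScheduler`, `Request`, `Grant`, and every method name below are hypothetical, the "token" is a plain string standing in for the real security token, and placement is first-fit on whichever node checks in, which is an assumption rather than how the Fair Scheduler decides.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Toy model of the handshake on slides 15-23. Hypothetical names; not the Hadoop API. */
public class ToyScheduler {
    /** One pending container ask from an application master. */
    static final class Request {
        final String appId; final int memoryMb; final int cores;
        Request(String appId, int memoryMb, int cores) {
            this.appId = appId; this.memoryMb = memoryMb; this.cores = cores;
        }
    }

    /** A granted container: its node plus a stand-in for the security token. */
    public static final class Grant {
        public final String node; public final String token;
        Grant(String node, String token) { this.node = node; this.token = token; }
    }

    private final Map<String, int[]> freeByNode = new LinkedHashMap<>(); // node -> {MB, cores}
    private final Queue<Request> pending = new ArrayDeque<>();
    private final Map<String, List<Grant>> ready = new LinkedHashMap<>();
    private int nextToken = 0;

    public void addNode(String name, int memoryMb, int cores) {
        freeByNode.put(name, new int[] {memoryMb, cores});
    }

    /** Slides 16-17: the AM states what it wants; the RM only notes it. */
    public void request(String appId, int count, int memoryMb, int cores) {
        for (int i = 0; i < count; i++) pending.add(new Request(appId, memoryMb, cores));
    }

    /** Slides 18-19: a node check-in lets the RM reserve space on that node. */
    public void nodeHeartbeat(String node) {
        int[] free = freeByNode.get(node);
        for (int i = pending.size(); i > 0; i--) {
            Request r = pending.poll();
            if (r.memoryMb <= free[0] && r.cores <= free[1]) {
                free[0] -= r.memoryMb;
                free[1] -= r.cores;
                ready.computeIfAbsent(r.appId, k -> new ArrayList<>())
                     .add(new Grant(node, "token-" + nextToken++));
            } else {
                pending.add(r); // does not fit on this node; keep waiting
            }
        }
    }

    /** Slides 20-21: the AM polls and receives grants made since its last poll. */
    public List<Grant> amHeartbeat(String appId) {
        List<Grant> grants = ready.remove(appId);
        return grants == null ? new ArrayList<>() : grants;
    }

    public int pendingCount() { return pending.size(); }

    public static void main(String[] args) {
        ToyScheduler rm = new ToyScheduler();
        rm.addNode("node1", 1536, 2);
        rm.request("am1", 2, 1024, 1);                    // "I want 2 containers..."
        System.out.println(rm.amHeartbeat("am1").size()); // prints 0: nothing placed yet
        rm.nodeHeartbeat("node1");                        // "I'm still here"
        for (Grant g : rm.amHeartbeat("am1")) {           // "Got anything for me?"
            System.out.println(g.node + " " + g.token);   // prints "node1 token-0"
        }
        System.out.println(rm.pendingCount());            // prints 1: second ask still waiting
    }
}
```

The point the model tries to capture is that a request is never answered directly: the application master only learns about a container on a later poll, after some node has checked in with room to spare. In the run above, node1 has room for one 1024 MB container, so the second ask stays pending.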