YARN
Yet Another Resource Negotiator
CC BY 2.0 / Richard Bumgardner
Been there, done that.
  
Agenda
•  Why YARN?
•  YARN Architecture and Concepts
•  Resources & Scheduling
   –  Capacity Scheduler
   –  Fair Scheduler
•  Configuring the Fair Scheduler
•  Managing Running Jobs
  
Agenda
•  Why YARN?
•  YARN Architecture and Concepts
•  Resources & Scheduling
   –  Capacity Scheduler
   –  Fair Scheduler
•  Configuring the Fair Scheduler
•  Managing Running Jobs
  
The 1st Generation of Hadoop: Batch
HADOOP 1.0: Built for Web-Scale Batch Apps
•  All other usage patterns must leverage that same infrastructure
•  Forces the creation of silos for managing mixed workloads
[Diagram: each workload runs as a single app on its own silo (BATCH on HDFS, INTERACTIVE on HDFS, another BATCH on HDFS, ONLINE on HDFS).]
Hadoop MapReduce Classic
•  JobTracker
   –  Manages cluster resources and job scheduling
•  TaskTracker
   –  Per-node agent
   –  Manages tasks
  
MapReduce Classic: Limitations
•  Scalability
   –  Maximum cluster size – 4,000 nodes
   –  Maximum concurrent tasks – 40,000
   –  Coarse synchronization in JobTracker
•  Availability
   –  Failure kills all queued and running jobs
•  Hard partition of resources into map and reduce slots
   –  Low resource utilization
•  Lacks support for alternate paradigms and services
   –  Iterative applications implemented using MapReduce are 10x slower
  
Our Vision: Hadoop as Next-Gen Platform
HADOOP 1.0 (Single Use System: Batch Apps)
•  MapReduce (cluster resource management & data processing)
•  HDFS (redundant, reliable storage)
HADOOP 2.0 (Multi Purpose Platform: Batch, Interactive, Online, Streaming, …)
•  MapReduce and others (data processing)
•  YARN (cluster resource management)
•  HDFS2 (redundant, reliable storage)
  
YARN: Taking Hadoop Beyond Batch
Store ALL DATA in one place… Interact with that data in MULTIPLE WAYS, with predictable performance and quality of service.
Applications run natively IN Hadoop:
•  BATCH (MapReduce)
•  INTERACTIVE (Tez)
•  ONLINE (HBase)
•  STREAMING (Storm, S4, …)
•  GRAPH (Giraph)
•  IN-MEMORY (Spark)
•  HPC MPI (OpenMPI)
•  OTHER (Search, Weave, …)
All run on YARN (Cluster Resource Management) over HDFS2 (Redundant, Reliable Storage).
  
Why YARN / MR2?
•  Scalability
   –  JobTracker kept track of individual tasks and wouldn’t scale
•  Utilization
   –  All slots are equal even if the work is not equal
•  Multi-tenancy
   –  Every framework shouldn’t need to write its own execution engine
   –  All frameworks should share the resources on a cluster
  
Multiple levels of scheduling
•  YARN
   –  Which application (framework) to give resources to?
•  Application (framework – MR etc.)
   –  Which task within the application should use these resources?
  
Agenda
•  Why YARN?
•  YARN Architecture and Concepts
•  Resources & Scheduling
   –  Capacity Scheduler
   –  Fair Scheduler
•  Configuring the Fair Scheduler
•  Managing Running Jobs
  
YARN Concepts
•  Application
   –  An application is a job submitted to the framework
   –  Example – MapReduce job
•  Container
   –  Basic unit of allocation
   –  Fine-grained resource allocation across multiple resource types (memory, cpu, disk, network, gpu, etc.)
      •  container_0 = 2GB, 1 CPU
      •  container_1 = 1GB, 6 CPU
   –  Replaces the fixed map/reduce slots
  
YARN Architecture
[Diagram: the ResourceManager (containing the Scheduler and the Applications Manager (AsM)) coordinates many NodeManagers. Application 1 runs AM 1 and containers 1.1–1.3; application 2 runs AM 2 and containers 2.1–2.4; AMs and containers are hosted on NodeManagers across the cluster.]
  
Architecture
•  Resource Manager
   –  Global resource scheduler
   –  Hierarchical queues
•  Node Manager
   –  Per-machine agent
   –  Manages the life-cycle of containers
   –  Container resource monitoring
•  Application Master
   –  Per-application
   –  Manages application scheduling and task execution
   –  E.g. MapReduce Application Master
  
Design Centre
•  Split up the two major functions of the JobTracker
   –  Cluster resource management
   –  Application life-cycle management
•  MapReduce becomes a user-land library
  
YARN Architecture - Walkthrough
[Diagram: the same cluster (ResourceManager with Scheduler; NodeManagers hosting AM 1 with containers 1.1–1.3 and AM 2 with containers 2.1–2.4), plus a client, Client2, submitting to the ResourceManager.]
  
Control Flow: Submit application
Control Flow: Get application updates
Control Flow: AM asking for resources
Control Flow: AM using containers
Execution Modes
•  Local mode
•  Uber mode
  
Container Types
•  DefaultContainerExecutor
   –  Unix process-based executor, using ulimit
•  LinuxContainerExecutor
   –  Linux container-based executor, using cgroups
•  Choose one based on the isolation level you need
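
A minimal yarn-site.xml sketch for switching to the LinuxContainerExecutor. The property name and class are the standard YARN ones; the choice itself is illustrative, and cgroups-based isolation needs additional NodeManager settings that are not shown here.
<!-- yarn-site.xml (sketch) -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  <!-- default is DefaultContainerExecutor; pick based on the isolation level you need -->
</property>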
  
Agenda
•  Why YARN?
•  YARN Architecture and Concepts
•  Resources & Scheduling
   –  Capacity Scheduler
   –  Fair Scheduler
•  Configuring the Fair Scheduler
•  Managing Running Jobs
  
Resource Model and Capacities
•  Resource vectors
   –  e.g. 1024 MB, 2 vcores, …
   –  No more task slots!
•  Nodes specify the amount of resources they have
   –  yarn.nodemanager.resource.memory-mb
   –  yarn.nodemanager.resource.cpu-vcores
•  vcores map to physical cores – they are not really “virtual”
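
A sketch of the per-node capacity settings named above, in yarn-site.xml. The 8 GB / 8 vcore values are illustrative only and should reflect what the NodeManager's host can actually offer to containers.
<!-- yarn-site.xml (sketch) -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>   <!-- illustrative: memory this node offers to containers -->
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>8</value>      <!-- illustrative: vcores this node offers to containers -->
</property>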
  
Resources and Scheduling
•  What you request is what you get
   –  No more fixed-size slots
   –  Framework/application requests resources for a task
•  MR AM requests resources for map and reduce tasks; these requests can potentially be for different amounts of resources
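
For MapReduce on YARN, those per-task requests come from job configuration. A sketch using the standard mapreduce.* properties follows; the values are illustrative, not recommendations, and simply show that map and reduce tasks can ask for different amounts.
<!-- mapred-site.xml or per-job configuration (sketch) -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>   <!-- illustrative: container size requested for each map task -->
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2048</value>   <!-- illustrative: reduce tasks can request a different amount -->
</property>
<property>
  <name>mapreduce.map.cpu.vcores</name>
  <value>1</value>
</property>
<property>
  <name>mapreduce.reduce.cpu.vcores</name>
  <value>1</value>
</property>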
  
YARN Scheduling
[Diagram: the ResourceManager, Application Masters 1 and 2, and Nodes 1–3 exchanging resource requests.]
•  AM 1 to the ResourceManager: “I want 2 containers with 1024 MB and 1 core each.” ResourceManager: “Noted.”
•  Node 1 heartbeats to the ResourceManager (“I’m still here”); the ResourceManager decides “I’ll reserve some space on Node 1 for AM 1.”
•  AM 1: “Got anything for me?” ResourceManager: “Here’s a security token to let you launch a container on Node 1.”
•  AM 1 to Node 1: “Hey, launch my container with this shell command.” Node 1 starts the container.
  
YARN Schedulers
•  Same choices as MR1
•  FIFO Scheduler
   –  Processes jobs in order
•  Fair Scheduler
   –  Fair to all users; supports dominant resource fairness
•  Capacity Scheduler
   –  Queue shares as percentages of the cluster
   –  FIFO scheduling within each queue
   –  Supports preemption
•  The default is the Capacity Scheduler
  
Capacity Scheduler
[Diagram: guaranteed resources split across three queues (queue-1: 50%, queue-2: 30%, queue-3: 20%), each holding its own apps.]
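
A capacity-scheduler.xml sketch that would express the 50/30/20 split in the diagram. The queue names are taken from the slide; the flat queue layout under root is an assumption made for illustration.
<!-- capacity-scheduler.xml (sketch) -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>queue-1,queue-2,queue-3</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.queue-1.capacity</name>
  <value>50</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.queue-2.capacity</name>
  <value>30</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.queue-3.capacity</name>
  <value>20</value>
</property>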
  
YARN Capacity Scheduler
•  Configuration is in capacity-scheduler.xml
•  Take some time to set up your queues!
•  Queues have per-queue ACLs to restrict queue access
   –  Access can be dynamically changed
•  Elasticity can be limited on a per-queue basis
   –  Use yarn.scheduler.capacity.<queue-path>.maximum-capacity
•  Use yarn.scheduler.capacity.<queue-path>.state to drain queues
   –  ‘Decommissioning’ a queue
•  Run yarn rmadmin -refreshQueues to make runtime changes
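
A sketch of the two per-queue properties mentioned above, reusing the queue-1 path from the earlier diagram as an assumed example. STOPPED drains the queue (no new applications are accepted while running apps finish), and edits take effect after yarn rmadmin -refreshQueues.
<!-- capacity-scheduler.xml (sketch) -->
<property>
  <name>yarn.scheduler.capacity.root.queue-1.maximum-capacity</name>
  <value>60</value>      <!-- illustrative: cap elasticity at 60% of the cluster -->
</property>
<property>
  <name>yarn.scheduler.capacity.root.queue-1.state</name>
  <value>STOPPED</value> <!-- RUNNING by default; STOPPED "decommissions" the queue -->
</property>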
  
YARN Fair Scheduler
•  The Fair Scheduler is the default YARN scheduler in CDH5
•  The only YARN scheduler that Cloudera recommends for production clusters
•  Provides fine-grained resource allocation for multiple resource types
   –  Memory (by default)
   –  CPU (optional)
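
If a cluster is not already running it (CDH5 selects it by default), the Fair Scheduler is chosen via the ResourceManager's scheduler class. A minimal yarn-site.xml sketch, using the stock Hadoop class name:
<!-- yarn-site.xml (sketch) -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>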
  
Goals of the Fair Scheduler
•  Should allow short interactive jobs to coexist with long production jobs
•  Should allow resources to be controlled proportionally
•  Should ensure that the cluster is efficiently utilized
  
The Fair Scheduler
•  The Fair Scheduler promotes fairness between schedulable entities
•  The Fair Scheduler awards resources to pools that are most underserved
   –  Gives a container to the pool that has the fewest resources allocated
  
Fair Scheduler Pools
•  Each job is assigned to a pool
   –  Also known as a queue in YARN terminology
•  All pools in YARN descend from the root pool
•  Physical resources are not bound to any specific pool
•  Pools can be predefined or defined dynamically by specifying a pool name when you submit a job
•  Pools and subpools are defined in the fair-scheduler.xml file
[Diagram: a 30GB cluster shared evenly between two pools, Alice and Bob, 15GB each.]
In Which Pool Will a Job Run?
•  The default pool for a job is root.username
   –  For example, root.Alice and root.Bob
   –  You can drop root when referring to a pool
      •  For example, you can refer to root.Alice simply as Alice
•  Jobs can be assigned to arbitrarily-named pools
   –  To specify the pool name when submitting a MapReduce job, use
      •  -D mapreduce.job.queuename
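
Pool placement can also be driven by rules in fair-scheduler.xml. The sketch below shows one common arrangement (use the explicitly requested queue, otherwise the user's own pool, otherwise the default pool); it is an illustration, not the deck's configuration.
<!-- fair-scheduler.xml (sketch) -->
<allocations>
  <queuePlacementPolicy>
    <rule name="specified"/>   <!-- honor -D mapreduce.job.queuename if set -->
    <rule name="user"/>        <!-- otherwise use root.<username> -->
    <rule name="default"/>     <!-- fall back to root.default -->
  </queuePlacementPolicy>
</allocations>
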
When Will a Job Run Within a Pool?
•  The Fair Scheduler grants resources to a pool, but which job’s task will get resources?
•  The policies for assigning resources to jobs within a pool are defined in fair-scheduler.xml
•  The Fair Scheduler uses three techniques for prioritizing jobs within pools:
   –  Single resource fairness
   –  Dominant resource fairness
   –  FIFO
•  You can also configure the Fair Scheduler to delay assignment of resources when a preferred rack or node is not available
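
The per-pool policy is set with the <schedulingPolicy> element of fair-scheduler.xml. A sketch with assumed pool names, one pool per policy, purely for illustration:
<!-- fair-scheduler.xml (sketch) -->
<allocations>
  <queue name="analytics">
    <schedulingPolicy>fair</schedulingPolicy>   <!-- single resource fairness (memory) -->
  </queue>
  <queue name="etl">
    <schedulingPolicy>drf</schedulingPolicy>    <!-- dominant resource fairness (memory + CPU) -->
  </queue>
  <queue name="adhoc">
    <schedulingPolicy>fifo</schedulingPolicy>   <!-- first-in, first-out within the pool -->
  </queue>
</allocations>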
  
Single Resource Fairness
•  Single resource fairness
   –  Is the default Fair Scheduler policy
   –  Schedules jobs using memory
•  Example
   –  Two pools: Alice has 15GB allocated, and Bob has 5GB
   –  Both pools request a 10GB container of memory
   –  Bob has fewer resources and will be granted the next 10GB that becomes available
[Diagram: a 30GB cluster; Alice holds 15GB, Bob holds 5GB, and the next 10GB goes to Bob.]
Adding Pools Redistributes Resources
•  The user Charlie now submits a job to a new pool
   –  Resource allocations are adjusted
   –  Each pool receives a fair share of cluster resources
[Diagram: the 30GB cluster is now split evenly: Alice 10GB, Bob 10GB, Charlie 10GB.]
Determining the Fair Share
•  The fair share of resources assigned to the pool is based on
   –  The total resources available across the cluster
   –  The number of pools competing for cluster resources
•  Excess cluster capacity is spread across all pools
   –  The aim is to maintain the most even allocation possible so every pool receives its fair share of resources
•  The fair share will never be higher than the actual demand
•  Pools can use more than their fair share when other pools are not in need of resources
   –  This happens when there are no tasks eligible to run in other pools
  
Minimum Resources
•  A pool with minimum resources defined receives priority during resource allocation
•  The minimum resources, minResources, are the minimum amount of resources that must be allocated to the pool prior to fair share allocation
   –  Minimum resources are allocated to each pool assuming there is cluster capacity
   –  Pools that have minimum resources specified will receive priority in resource assignment
  
Minimum Resource Allocation Example
•  First, fill up the Production pool to the 20GB minimum guarantee
•  Then distribute the remaining 10GB evenly across Alice and Bob
[Diagram: 30GB cluster; Production (minResources: 20GB, demand: 100GB) is allocated 20GB; Alice and Bob (demands of 30GB and 25GB) get 5GB each.]
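
The corresponding pool definition would look roughly like the sketch below. The pool name comes from the slide, 20GB is written as 20480 mb in the Fair Scheduler's "x mb, y vcores" format, and the vcore figure is an assumption since the example only discusses memory.
<!-- fair-scheduler.xml (sketch) -->
<allocations>
  <queue name="production">
    <minResources>20480 mb, 0 vcores</minResources>  <!-- 20GB minimum guarantee -->
  </queue>
</allocations>
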
Minimum Resource Allocation Example 2: Production Pool Empty
•  Production has no demand, so no resources are allocated to it
•  All resources are allocated evenly between Alice and Bob
[Diagram: 30GB cluster; Production (minResources: 20GB, demand: 0GB) gets nothing; Alice and Bob (demands of 30GB and 25GB) get 15GB each.]
Minimum Resource Allocation Example 3: MinResources Exceed Resources
•  The combined minResources of Production and Research exceed capacity
•  Minimum resources are assigned proportionally based on the defined minResources until available resources are exhausted
•  No memory remains for pools without minResources defined (i.e., Bob)
[Diagram: 30GB cluster with demands of 100GB (Production), 30GB and 25GB (the other two pools); Production (minResources: 50GB) receives 20GB, Research (minResources: 25GB) receives 10GB, and Bob receives nothing.]
Minimum Resource Allocation Example 4: MinResources < Fair Share
•  Production is filled to its minResources
•  The remaining 25GB is distributed across all pools
•  The Production pool receives more than its minResources, to maintain fairness
[Diagram: 30GB cluster; Production (minResources: 5GB, demand: 100GB), Alice and Bob (demands of 30GB and 25GB) each receive 10GB.]
Pools with Weights
•  Instead of (or in addition to) setting minResources, pools can be assigned a weight
•  Pools with higher weight receive more resources during allocation
•  ‘Even water glass height’ analogy:
   –  Think of the weight as controlling the ‘width’ of the glass
  
Example: Pool with Double Weight
•  Production is filled to its minResources (5GB)
•  The remaining 25GB is distributed across all pools
•  The Bob pool receives twice the amount of memory during fair share allocation
[Diagram: 30GB cluster; Production (minResources: 5GB, demand: 100GB) gets 8GB, Alice gets 8GB, and Bob (weight: 2) gets 14GB; the other demands are 30GB and 25GB.]
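
A fair-scheduler.xml fragment matching this example might look like the sketch below. Pool names follow the slide, 5GB is written as 5120 mb, and the vcore figure is an assumption since only memory is discussed.
<!-- fair-scheduler.xml (sketch) -->
<allocations>
  <queue name="production">
    <minResources>5120 mb, 0 vcores</minResources>  <!-- 5GB minimum guarantee -->
  </queue>
  <queue name="bob">
    <weight>2.0</weight>  <!-- receives twice the fair-share memory of an unweighted pool -->
  </queue>
</allocations>
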
Dominant Resource Fairness
•  The Fair Scheduler can be configured to schedule with both memory and CPU using dominant resource fairness
•  Scenario #1:
   –  Alice has 6GB and 3 cores, and Bob has 4GB and 2 cores – which pool receives the next resource allocation?
•  Bob will receive the next container because it has less memory and fewer CPU cores allocated than Alice
[Diagram: Alice usage 6GB / 3 cores; Bob usage 4GB / 2 cores.]
Dominant Resource Fairness Example
•  Scenario #2:
   –  A cluster has 10GB of total memory and 20 cores
   –  Pool Alice has containers granted for 4GB of memory and 5 cores
   –  Pool Bob has containers granted for 1GB of memory and 10 cores
•  Alice will receive the next container because its 40% dominant share of memory is less than the Bob pool’s 50% dominant share of CPU
[Diagram: Alice usage 4GB (40% of capacity) and 5 cores (25%); Bob usage 1GB (10%) and 10 cores (50%).]
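
To make DRF the policy for every pool at once, the allocation file supports a cluster-wide default. A minimal sketch; the element name is standard, and placing it alone here is purely for illustration.
<!-- fair-scheduler.xml (sketch) -->
<allocations>
  <defaultQueueSchedulingPolicy>drf</defaultQueueSchedulingPolicy>
</allocations>
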
Achieving Fair Share: The Patient Approach
•  If shares are imbalanced, pools which are over their fair share may not be assigned new tasks when their old ones complete
   –  Those resources then become available to pools which are operating below their fair share
•  However, waiting patiently for a task in another pool to finish may not be acceptable in a production environment
   –  Tasks could take a long time to complete
  
Achieving Fair Share: The Brute Force Approach
•  With preemption enabled, the Fair Scheduler actively kills tasks that belong to pools operating over their fair share
   –  Pools operating below fair share receive those reaped resources
•  There are two types of preemption available
   –  Minimum share preemption
   –  Fair share preemption
•  The preemption code avoids killing a task in a pool if it would cause that pool to begin preempting tasks in other pools
   –  This prevents a potentially endless cycle of pools killing one another’s tasks
  
Minimum Share Preemption
•  Pools with a minResources configured are operating on an SLA (Service Level Agreement)
•  Pools that are below their minimum share as defined by minResources can preempt tasks in other pools
   –  Set minSharePreemptionTimeout to the number of seconds the pool is under its minimum share before preemption should begin
   –  Default is infinite (Java’s Long.MAX_VALUE)
  
Fair Share Preemption
•  Pools not receiving their fair share can preempt tasks in other pools
   –  Only pools that exceed their fair share are candidates for preemption
•  Use fair share preemption conservatively
   –  Set fairSharePreemptionTimeout to the number of seconds a pool is under fair share before preemption should begin
   –  Default is infinite (Java’s Long.MAX_VALUE)
  
Agenda
•  Why YARN?
•  YARN Architecture and Concepts
•  Resources & Scheduling
   –  Capacity Scheduler
   –  Fair Scheduler
•  Configuring the Fair Scheduler
•  Managing Running Jobs
  
Configuring Fair Scheduler Capabilities (1)
•  yarn.scheduler.fair.allow-undeclared-pools (yarn-site.xml)
   –  When true, new pools can be created at application submission time or by the user-as-default-queue property. When false, submitting to a pool that is not specified in the fair-scheduler.xml file causes the application to be placed in the “default” pool. Default: true. Ignored if a pool placement policy is defined in the fair-scheduler.xml file.
•  yarn.scheduler.fair.preemption (yarn-site.xml)
   –  Enables preemption in the Fair Scheduler. Set to true if you have pools that must operate on an SLA. Default: false.
•  yarn.scheduler.fair.user-as-default-queue (yarn-site.xml)
   –  Send jobs to pools based on users’ names instead of to the default pool, root.default. Default: true.
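
A yarn-site.xml sketch combining the three properties above. The values shown are one plausible production-leaning choice, not a recommendation from the deck.
<!-- yarn-site.xml (sketch) -->
<property>
  <name>yarn.scheduler.fair.allow-undeclared-pools</name>
  <value>false</value>  <!-- only pools declared in fair-scheduler.xml may be used -->
</property>
<property>
  <name>yarn.scheduler.fair.preemption</name>
  <value>true</value>   <!-- needed if any pool has an SLA enforced by preemption -->
</property>
<property>
  <name>yarn.scheduler.fair.user-as-default-queue</name>
  <value>true</value>   <!-- jobs without a pool go to root.<username> -->
</property>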
  
Configuring Fair Scheduler Capabilities (2)
•  yarn.scheduler.fair.locality.threshold.node and yarn.scheduler.fair.locality.threshold.rack (yarn-site.xml)
   –  For applications that request containers on particular nodes or racks, the number of scheduling opportunities since the last container assignment to wait before accepting a placement on another node. Expressed as a float between 0 and 1, which, as a fraction of the cluster size, is the number of scheduling opportunities to pass up. Default: -1 (don’t pass up any scheduling opportunities).
      •  Example: yarn.scheduler.fair.locality.threshold.node = 0.02, cluster size = 100 nodes. At most 2 scheduling opportunities can be skipped when the preferred placement cannot be met.
  
Configuring Resource Allocation for Pools and Users (1)
•  You configure Fair Scheduler pools in the /etc/hadoop/conf/fair-scheduler.xml file
•  The Fair Scheduler rereads this file every 10 seconds
   –  A ResourceManager restart is not required when the file changes
•  The fair-scheduler.xml file must contain an <allocations> element
•  Use the <queue> element to configure resource allocation for a pool
•  Use the <user> element to configure resource allocation for a user across multiple pools
  
Configuring Resource Allocation for Pools and Users (2)
•  To specify resource allocations, use the <queue> or <user> element with any or all of the following subelements
   –  <minResources>
      •  The minimum resources to which the pool is entitled
      •  Format is x mb, y vcores
      •  Example: 10000 mb, 5 vcores
   –  <maxResources>
      •  The maximum resources to which the pool is entitled
      •  Format is x mb, y vcores
Configuring Resource Allocation for Pools and Users (3)
•  Additional sub-elements of <queue> or <user> to use when specifying resource allocations
   –  <maxRunningApps>
      •  The maximum number of applications in the pool that can run concurrently
   –  <weight>
      •  Used for non-proportionate sharing with other pools
      •  The default is 1
   –  <minSharePreemptionTimeout>
      •  Time to wait before preempting tasks
   –  <schedulingPolicy>
      •  SRF for single resource fairness (the default)
      •  DRF for dominant resource fairness
      •  FIFO for first-in, first-out
  
fair-scheduler.xml Example (1)
•  Allow users to run three jobs, but allow Bob to run six jobs

<?xml version="1.0"?>
<allocations>
    <userMaxAppsDefault>3</userMaxAppsDefault>
    <user name="bob">
        <maxRunningApps>6</maxRunningApps>
    </user>
</allocations>
  
fair-scheduler.xml Example (2)
•  Add a fair share timeout

<?xml version="1.0"?>
<allocations>
    <userMaxAppsDefault>3</userMaxAppsDefault>
    <user name="bob">
        <maxRunningApps>6</maxRunningApps>
    </user>
    <fairSharePreemptionTimeout>300</fairSharePreemptionTimeout>
</allocations>
  
fair-scheduler.xml Example (3)
•  Define the production pool with a weight of 2 and a resource allocation of 10000 MB and 1 core

<?xml version="1.0"?>
<allocations>
    <userMaxAppsDefault>3</userMaxAppsDefault>
    <queue name="production">
        <minResources>10000 mb, 1 vcores</minResources>
        <weight>2.0</weight>
    </queue>
</allocations>
  
fair-scheduler.xml Example (4)
•  Add an SLA to the production pool

<?xml version="1.0"?>
<allocations>
    <userMaxAppsDefault>3</userMaxAppsDefault>
    <queue name="production">
        <minResources>10000 mb, 1 vcores</minResources>
        <weight>2.0</weight>
        <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
    </queue>
</allocations>
  
The Fair Scheduler User Interface
•  http://<resource_manager_host>:8088/cluster/scheduler
  
Agenda
•  Why YARN?
•  YARN Architecture and Concepts
•  Resources & Scheduling
   –  Capacity Scheduler
   –  Fair Scheduler
•  Configuring the Fair Scheduler
•  Managing Running Jobs
  
Displaying Jobs
•  To view jobs currently running on the cluster
   –  yarn application -list
   –  Lists all running jobs, including the application ID for each
•  To view all jobs on the cluster, including completed jobs
   –  yarn application -list -appStates ALL
•  To display the status of an individual job
   –  yarn application -status <application_ID>
•  You can also use the ResourceManager Web UI, Hue, Ambari, or Cloudera Manager to display jobs
  
Killing Jobs
•  It is important to note that once a user has submitted a job, they cannot stop it just by hitting CTRL-C on their terminal
   –  This stops job output appearing on the user’s console
   –  The job is still running on the cluster!
•  To kill a job running on the cluster
   –  yarn application -kill <application_ID>
•  You can also kill a job from Cloudera Manager
  
More Related Content

What's hot

Apache Hadoop YARN
Apache Hadoop YARNApache Hadoop YARN
Apache Hadoop YARN
Adam Kawa
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
DataWorks Summit/Hadoop Summit
 
Apache Hive Tutorial
Apache Hive TutorialApache Hive Tutorial
Apache Hive Tutorial
Sandeep Patil
 
Spark Performance Tuning .pdf
Spark Performance Tuning .pdfSpark Performance Tuning .pdf
Spark Performance Tuning .pdf
Amit Raj
 
Apache Flume
Apache FlumeApache Flume
Apache Flume
Arinto Murdopo
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
Till Rohrmann
 
Hadoop hive presentation
Hadoop hive presentationHadoop hive presentation
Hadoop hive presentation
Arvind Kumar
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
EMC
 
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
SANG WON PARK
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
MapReduce Example | MapReduce Programming | Hadoop MapReduce Tutorial | Edureka
MapReduce Example | MapReduce Programming | Hadoop MapReduce Tutorial | Edureka MapReduce Example | MapReduce Programming | Hadoop MapReduce Tutorial | Edureka
MapReduce Example | MapReduce Programming | Hadoop MapReduce Tutorial | Edureka
Edureka!
 
Hive Data Modeling and Query Optimization
Hive Data Modeling and Query OptimizationHive Data Modeling and Query Optimization
Hive Data Modeling and Query Optimization
Eyad Garelnabi
 
Best Practices for Enabling Speculative Execution on Large Scale Platforms
Best Practices for Enabling Speculative Execution on Large Scale PlatformsBest Practices for Enabling Speculative Execution on Large Scale Platforms
Best Practices for Enabling Speculative Execution on Large Scale Platforms
Databricks
 
Apache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingApache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingDataWorks Summit
 
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
StampedeCon
 
Storing State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your AnalyticsStoring State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your Analytics
Yaroslav Tkachenko
 
Yarn
YarnYarn
Hadoop Overview kdd2011
Hadoop Overview kdd2011Hadoop Overview kdd2011
Hadoop Overview kdd2011
Milind Bhandarkar
 
Apache Airflow
Apache AirflowApache Airflow
Apache Airflow
Knoldus Inc.
 
HBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ SalesforceHBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon
 

What's hot (20)

Apache Hadoop YARN
Apache Hadoop YARNApache Hadoop YARN
Apache Hadoop YARN
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Apache Hive Tutorial
Apache Hive TutorialApache Hive Tutorial
Apache Hive Tutorial
 
Spark Performance Tuning .pdf
Spark Performance Tuning .pdfSpark Performance Tuning .pdf
Spark Performance Tuning .pdf
 
Apache Flume
Apache FlumeApache Flume
Apache Flume
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
 
Hadoop hive presentation
Hadoop hive presentationHadoop hive presentation
Hadoop hive presentation
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
MapReduce Example | MapReduce Programming | Hadoop MapReduce Tutorial | Edureka
MapReduce Example | MapReduce Programming | Hadoop MapReduce Tutorial | Edureka MapReduce Example | MapReduce Programming | Hadoop MapReduce Tutorial | Edureka
MapReduce Example | MapReduce Programming | Hadoop MapReduce Tutorial | Edureka
 
Hive Data Modeling and Query Optimization
Hive Data Modeling and Query OptimizationHive Data Modeling and Query Optimization
Hive Data Modeling and Query Optimization
 
Best Practices for Enabling Speculative Execution on Large Scale Platforms
Best Practices for Enabling Speculative Execution on Large Scale PlatformsBest Practices for Enabling Speculative Execution on Large Scale Platforms
Best Practices for Enabling Speculative Execution on Large Scale Platforms
 
Apache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingApache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data Processing
 
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
 
Storing State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your AnalyticsStoring State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your Analytics
 
Yarn
YarnYarn
Yarn
 
Hadoop Overview kdd2011
Hadoop Overview kdd2011Hadoop Overview kdd2011
Hadoop Overview kdd2011
 
Apache Airflow
Apache AirflowApache Airflow
Apache Airflow
 
HBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ SalesforceHBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ Salesforce
 

Viewers also liked

An Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnAn Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop Yarn
Mike Frampton
 
Hadoop Scheduling - a 7 year perspective
Hadoop Scheduling - a 7 year perspectiveHadoop Scheduling - a 7 year perspective
Hadoop Scheduling - a 7 year perspective
Joydeep Sen Sarma
 
Hadoop scheduler
Hadoop schedulerHadoop scheduler
Hadoop scheduler
Subhas Kumar Ghosh
 
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...
Govt.Engineering college, Idukki
 
Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014Ryu Kobayashi
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Hortonworks
 

Viewers also liked (6)

An Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnAn Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop Yarn
 
Hadoop Scheduling - a 7 year perspective
Hadoop Scheduling - a 7 year perspectiveHadoop Scheduling - a 7 year perspective
Hadoop Scheduling - a 7 year perspective
 
Hadoop scheduler
Hadoop schedulerHadoop scheduler
Hadoop scheduler
 
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...
 
Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
 

Similar to Yarn

Hadoop bangalore-meetup-dec-2011-hadoop nextgen
Hadoop bangalore-meetup-dec-2011-hadoop nextgenHadoop bangalore-meetup-dec-2011-hadoop nextgen
Hadoop bangalore-meetup-dec-2011-hadoop nextgen
InMobi
 
Apache Hadoop MapReduce: What's Next
Apache Hadoop MapReduce: What's NextApache Hadoop MapReduce: What's Next
Apache Hadoop MapReduce: What's Next
DataWorks Summit
 
Hadoop World 2011, Apache Hadoop MapReduce Next Gen
Hadoop World 2011, Apache Hadoop MapReduce Next GenHadoop World 2011, Apache Hadoop MapReduce Next Gen
Hadoop World 2011, Apache Hadoop MapReduce Next GenHortonworks
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Hortonworks
 
MHUG - YARN
MHUG - YARNMHUG - YARN
MHUG - YARN
Joseph Niemiec
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
Stanley Wang
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
Stanley Wang
 
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Cloudera, Inc.
 
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
Insight Technology, Inc.
 
Hadoop 2.0 yarn arch training
Hadoop 2.0 yarn arch trainingHadoop 2.0 yarn arch training
Hadoop 2.0 yarn arch training
Nandan Kumar
 
Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Tsuyoshi OZAWA
 
Yarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine LearningYarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine Learning
ojavajava
 
Scaling Spark Workloads on YARN - Boulder/Denver July 2015
Scaling Spark Workloads on YARN - Boulder/Denver July 2015Scaling Spark Workloads on YARN - Boulder/Denver July 2015
Scaling Spark Workloads on YARN - Boulder/Denver July 2015
Mac Moore
 
Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014Tsuyoshi OZAWA
 
Hadoop: Components and Key Ideas, -part1
Hadoop: Components and Key Ideas, -part1Hadoop: Components and Key Ideas, -part1
Hadoop: Components and Key Ideas, -part1
Sandeep Kunkunuru
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data Applications
Hortonworks
 
Introduction to Yarn
Introduction to YarnIntroduction to Yarn
Introduction to Yarn
Apache Apex
 
YARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopYARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopHortonworks
 
A sdn based application aware and network provisioning
A sdn based application aware and network provisioningA sdn based application aware and network provisioning
A sdn based application aware and network provisioning
Stanley Wang
 
Scale 12 x Efficient Multi-tenant Hadoop 2 Workloads with Yarn
Scale 12 x   Efficient Multi-tenant Hadoop 2 Workloads with YarnScale 12 x   Efficient Multi-tenant Hadoop 2 Workloads with Yarn
Scale 12 x Efficient Multi-tenant Hadoop 2 Workloads with Yarn
David Kaiser
 

Similar to Yarn (20)

Hadoop bangalore-meetup-dec-2011-hadoop nextgen
Hadoop bangalore-meetup-dec-2011-hadoop nextgenHadoop bangalore-meetup-dec-2011-hadoop nextgen
Hadoop bangalore-meetup-dec-2011-hadoop nextgen
 
Apache Hadoop MapReduce: What's Next
Apache Hadoop MapReduce: What's NextApache Hadoop MapReduce: What's Next
Apache Hadoop MapReduce: What's Next
 
Hadoop World 2011, Apache Hadoop MapReduce Next Gen
Hadoop World 2011, Apache Hadoop MapReduce Next GenHadoop World 2011, Apache Hadoop MapReduce Next Gen
Hadoop World 2011, Apache Hadoop MapReduce Next Gen
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
 
MHUG - YARN
MHUG - YARNMHUG - YARN
MHUG - YARN
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
 
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
 
Hadoop 2.0 yarn arch training
Hadoop 2.0 yarn arch trainingHadoop 2.0 yarn arch training
Hadoop 2.0 yarn arch training
 
Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014
 
Yarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine LearningYarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine Learning
 
Scaling Spark Workloads on YARN - Boulder/Denver July 2015
Scaling Spark Workloads on YARN - Boulder/Denver July 2015Scaling Spark Workloads on YARN - Boulder/Denver July 2015
Scaling Spark Workloads on YARN - Boulder/Denver July 2015
 
Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014
 
Hadoop: Components and Key Ideas, -part1
Hadoop: Components and Key Ideas, -part1Hadoop: Components and Key Ideas, -part1
Hadoop: Components and Key Ideas, -part1
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data Applications
 
Introduction to Yarn
Introduction to YarnIntroduction to Yarn
Introduction to Yarn
 
YARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopYARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache Hadoop
 
A sdn based application aware and network provisioning
A sdn based application aware and network provisioningA sdn based application aware and network provisioning
A sdn based application aware and network provisioning
 
Scale 12 x Efficient Multi-tenant Hadoop 2 Workloads with Yarn
Scale 12 x   Efficient Multi-tenant Hadoop 2 Workloads with YarnScale 12 x   Efficient Multi-tenant Hadoop 2 Workloads with Yarn
Scale 12 x Efficient Multi-tenant Hadoop 2 Workloads with Yarn
 

Recently uploaded

PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
Jen Stirrup
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
Globus
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 

Recently uploaded (20)

PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 

Yarn

  • 2. CC BY 2.0 / Richard Bumgardner   Been there, done that.  
  • 3. Agenda •  Why  YARN?   •  YARN  Architecture  and  Concepts   •  Resources  &  Scheduling   –  Capacity  Scheduler   –  Fair  Scheduler   •  Configuring  the  Fair  Scheduler   •  Managing  Running  Jobs  
  • 4. Agenda •  Why  YARN?   •  YARN  Architecture  and  Concepts   •  Resources  &  Scheduling   –  Capacity  Scheduler   –  Fair  Scheduler   •  Configuring  the  Fair  Scheduler   •  Managing  Running  Jobs  
  • 5. The 1st Generation of Hadoop: Batch HADOOP 1.0 Built for Web-Scale Batch Apps Single App   BATCH HDFS Single App   INTERACTIVE Single App   BATCH HDFS •  All other usage patterns must leverage that same infrastructure •  Forces the creation of silos for managing mixed workloads Single App   BATCH HDFS Single App   ONLINE
  • 6. Hadoop MapReduce Classic •  JobTracker   –  Manages  cluster  resources  and  job  scheduling   •  TaskTracker   –  Per-­‐node  agent   –  Manage  tasks  
  • 7. MapReduce Classic: Limitations •  Scalability   –  Maximum  Cluster  size  –  4,000  nodes   –  Maximum  concurrent  tasks  –  40,000   –  Coarse  synchronizaPon  in  JobTracker   •  Availability   –  Failure  kills  all  queued  and  running  jobs   •  Hard  parPPon  of  resources  into  map  and  reduce  slots   –  Low  resource  uPlizaPon   •  Lacks  support  for  alternate  paradigms  and  services   –  IteraPve  applicaPons  implemented  using  MapReduce  are  10x  slower  
  • 8. Our Vision: Hadoop as Next-Gen Platform MapReduce   (cluster resource management   & data processing)   HDFS   (redundant, reliable storage)   Single Use System Batch Apps HADOOP 1.0 Multi Purpose Platform Batch, Interactive, Online, Streaming, … HADOOP 2.0 Others   (data processing)   YARN   (cluster resource management)   HDFS2   (redundant, reliable storage)   MapReduce   (data processing)  
  • 9. YARN: Taking Hadoop Beyond Batch YARN (Cluster Resource Management) HDFS2 (Redundant, Reliable Storage) BATCH (MapReduce) INTERACTIVE (Tez) STREAMING (Storm, S4,…) GRAPH (Giraph) IN-MEMORY (Spark) HPC MPI (OpenMPI) ONLINE (HBase) Store ALL DATA in one place… Interact with that data in MULTIPLE WAYS with Predictable Performance and Quality of Service Applications Run Natively IN Hadoop OTHER (Search) (Weave…)
  • 10. Why YARN / MR2? • Scalability – JobTracker kept track of individual tasks and wouldn't scale • Utilization – All slots are equal even if the work is not equal • Multi-tenancy – Every framework shouldn't need to write its own execution engine – All frameworks should share the resources on a cluster
  • 11. Multiple levels of scheduling • YARN – Which application (framework) to give resources to? • Application (Framework – MR etc.) – Which task within the application should use these resources?
  • 12. Agenda •  Why  YARN?   •  YARN  Architecture  and  Concepts   •  Resources  &  Scheduling   –  Capacity  Scheduler   –  Fair  Scheduler   •  Configuring  the  Fair  Scheduler   •  Managing  Running  Jobs  
  • 13. YARN Concepts • Application – An application is a job submitted to the framework – Example: a MapReduce job • Container – Basic unit of allocation – Fine-grained resource allocation across multiple resource types (memory, cpu, disk, network, gpu etc.) • container_0 = 2GB, 1 CPU • container_1 = 1GB, 6 CPU – Replaces the fixed map/reduce slots
  • 14. YARN Architecture [Diagram: a ResourceManager (Scheduler + Applications Manager (AsM)) coordinating many NodeManagers; AM 1 runs Containers 1.1–1.3 and AM 2 runs Containers 2.1–2.4 on the NodeManagers]
  • 15. Architecture • Resource Manager – Global resource scheduler – Hierarchical queues • Node Manager – Per-machine agent – Manages the life-cycle of containers – Container resource monitoring • Application Master – Per-application – Manages application scheduling and task execution – E.g. MapReduce Application Master
  • 16. Design Centre • Split up the two major functions of JobTracker – Cluster resource management – Application life-cycle management • MapReduce becomes a user-land library
  • 17. YARN Architecture - Walkthrough [Diagram: same cluster layout as slide 14, with Client2 interacting with the ResourceManager (Scheduler); AM 1 runs Containers 1.1–1.3 and AM 2 runs Containers 2.1–2.4 on the NodeManagers]
  • 18. Control Flow: Submit application
  • 19. Control Flow: Get application updates
  • 20. Control Flow: AM asking for resources
  • 21. Control Flow: AM using containers
  • 22. Execution Modes •  Local  mode   •  Uber  mode  
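These modes are toggled through MapReduce configuration rather than YARN itself; a minimal mapred-site.xml sketch, assuming the standard MRv2 property names (the values shown are illustrative, not from the original deck):

    <property>
      <name>mapreduce.framework.name</name>
      <value>local</value>  <!-- local mode: run the whole job in a single local JVM instead of on the cluster -->
    </property>
    <property>
      <name>mapreduce.job.ubertask.enable</name>
      <value>true</value>   <!-- uber mode: run sufficiently small jobs inside the ApplicationMaster's own container -->
    </property>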
  • 23. Container Types • DefaultContainerExecutor – Unix process-based executor using ulimit • LinuxContainerExecutor – Linux container-based executor using cgroups • Choose one based on the isolation level you need
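The executor is selected per NodeManager in yarn-site.xml; a hedged sketch using the stock Hadoop 2.x class names (the LinuxContainerExecutor additionally needs the setuid container-executor binary and container-executor.cfg set up, which is omitted here):

    <property>
      <name>yarn.nodemanager.container-executor.class</name>
      <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
    </property>
    <property>
      <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
      <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
    </property>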
  • 24. Agenda •  Why  YARN?   •  YARN  Architecture  and  Concepts   •  Resources  &  Scheduling   –  Capacity  Scheduler   –  Fair  Scheduler   •  Configuring  the  Fair  Scheduler   •  Managing  Running  Jobs  
  • 25.
  • 26. Resource Model and Capacities • Resource vectors – e.g. 1024 MB, 2 vcores, … – No more task slots! • Nodes specify the amount of resources they have – yarn.nodemanager.resource.memory-mb – yarn.nodemanager.resource.cpu-vcores • vcores map to physical cores; they are not really "virtual"
  • 27. Resources and Scheduling • What you request is what you get – No more fixed-size slots – Framework/application requests resources for a task • The MR AM requests resources for map and reduce tasks; these requests can potentially be for different amounts of resources
  • 28. YARN Scheduling [Diagram: ResourceManager, Application Masters 1 and 2, and Nodes 1–3 exchanging messages – "I want 2 containers with 1024 MB and 1 core each" / "Noted" / "I'm still here" / "I'll reserve some space on Node 1 for AM1" / "Got anything for me?" / "Here's a security token to let you launch a container on Node 1" / "Hey, launch my container with this shell command" → Container]
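For example, a worker with 24 GB and 12 cores set aside for containers would advertise itself in yarn-site.xml roughly like this (property names from the slide; the values are illustrative):

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>24576</value>  <!-- memory this NodeManager offers to containers, in MB -->
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>12</value>     <!-- vcores this NodeManager offers to containers -->
    </property>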
  • 29. YARN Schedulers • Same as MR1 • FIFO Scheduler – Processes jobs in order • Fair Scheduler – Fair to all users; supports dominant resource fairness • Capacity Scheduler – Queue shares as a percentage of the cluster – FIFO scheduling within each queue – Supports preemption • Default is the Capacity Scheduler
  • 30. Capacity Scheduler [Diagram: three queues with guaranteed resources – queue-1: 50%, queue-2: 30%, queue-3: 20% – each holding its own apps]
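The active scheduler is chosen on the ResourceManager; a sketch assuming the stock Hadoop class names:

    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <!-- Fair Scheduler; swap in ...scheduler.capacity.CapacityScheduler to keep the default -->
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>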
  • 31. YARN Capacity Scheduler • Configuration in capacity-scheduler.xml • Take some time to set up your queues! • Queues have per-queue ACLs to restrict queue access – Access can be dynamically changed • Elasticity can be limited on a per-queue basis – use yarn.scheduler.capacity.<queue-path>.maximum-capacity • Use yarn.scheduler.capacity.<queue-path>.state to drain queues – 'Decommissioning' a queue • yarn rmadmin -refreshQueues to make runtime changes
  • 32. YARN Fair Scheduler • The Fair Scheduler is the default YARN scheduler in CDH5 • The only YARN scheduler that Cloudera recommends for production clusters • Provides fine-grained resource allocation for multiple resource types – Memory (by default) – CPU (optional)
  • 33. Goals of the Fair Scheduler • Should allow short interactive jobs to coexist with long production jobs • Should allow resources to be controlled proportionally • Should ensure that the cluster is efficiently utilized
  • 34. The Fair Scheduler • The Fair Scheduler promotes fairness between schedulable entities • The Fair Scheduler awards resources to pools that are most underserved – Gives a container to the pool that has the fewest resources allocated
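A sketch of what that looks like in capacity-scheduler.xml, using two made-up queues, prod and dev; after editing the file, the changes are applied at runtime with yarn rmadmin -refreshQueues:

    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>prod,dev</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.prod.capacity</name>
      <value>70</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.dev.capacity</name>
      <value>30</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.dev.maximum-capacity</name>
      <value>50</value>      <!-- limit elasticity for the dev queue -->
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.dev.state</name>
      <value>STOPPED</value> <!-- drain ('decommission') the dev queue -->
    </property>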
  • 35. Fair Scheduler Pools • Each job is assigned to a pool – Also known as a queue in YARN terminology • All pools in YARN descend from the root pool • Physical resources are not bound to any specific pool • Pools can be predefined or defined dynamically by specifying a pool name when you submit a job • Pools and subpools are defined in the fair-scheduler.xml file [Diagram: Total 30GB – Alice 15GB, Bob 15GB]
  • 36. In Which Pool Will a Job Run? • The default pool for a job is root.username – For example, root.Alice and root.Bob – You can drop root when referring to a pool • For example, you can refer to root.Alice simply as Alice • Jobs can be assigned to arbitrarily-named pools – To specify the pool name when submitting a MapReduce job, use • -D mapreduce.job.queuename
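A minimal sketch of such a submission, assuming the stock MapReduce examples jar and made-up paths and pool name (any job whose driver uses ToolRunner/GenericOptionsParser accepts -D this way):

    hadoop jar hadoop-mapreduce-examples.jar wordcount \
        -D mapreduce.job.queuename=root.production \
        /user/alice/books /user/alice/wordcounts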
  • 37. When Will a Job Run Within a Pool? • The Fair Scheduler grants resources to a pool, but which job's task will get resources? • The policies for assigning resources to jobs within a pool are defined in fair-scheduler.xml • The Fair Scheduler uses three techniques for prioritizing jobs within pools: – Single resource fairness – Dominant resource fairness – FIFO • You can also configure the Fair Scheduler to delay assignment of resources when a preferred rack or node is not available
  • 38. Single Resource Fairness • Single resource fairness – Is the default Fair Scheduler policy – Schedules jobs using memory • Example – Two pools: Alice has 15GB allocated, and Bob has 5GB – Both pools request a 10GB container of memory – Bob has fewer resources and will be granted the next 10GB that becomes available [Diagram: Total 30GB – Alice 15GB, Bob 5GB; the next 10GB goes to Bob]
  • 39. Adding Pools Redistributes Resources • The user Charlie now submits a job to a new pool – Resource allocations are adjusted – Each pool receives a fair share of cluster resources [Diagram: Total 30GB – Alice 10GB, Bob 10GB, Charlie 10GB]
  • 40. Determining the Fair Share • The fair share of resources assigned to the pool is based on – The total resources available across the cluster – The number of pools competing for cluster resources • Excess cluster capacity is spread across all pools – The aim is to maintain the most even allocation possible so every pool receives its fair share of resources • The fair share will never be higher than the actual demand • Pools can use more than their fair share when other pools are not in need of resources – This happens when there are no tasks eligible to run in other pools
  • 41. Minimum Resources • A pool with minimum resources defined receives priority during resource allocation • The minimum resources, minResources, are the minimum amount of resources that must be allocated to the pool prior to fair share allocation – Minimum resources are allocated to each pool assuming there is cluster capacity – Pools that have minimum resources specified will receive priority in resource assignment
  • 42. Minimum Resource Allocation Example • First, fill up the Production pool to the 20GB minimum guarantee • Then distribute the remaining 10GB evenly across Alice and Bob [Diagram: Total 30GB – Production (demand 100GB, minResources 20GB): 20GB; Alice (demand 30GB): 5GB; Bob (demand 25GB): 5GB]
  • 43. Minimum Resource Allocation Example 2: Production Pool Empty • Production has no demand, so no resources are allocated to it • All resources are allocated evenly between Alice and Bob [Diagram: Total 30GB – Production (demand 0GB, minResources 20GB): 0GB; Alice (demand 30GB): 15GB; Bob (demand 25GB): 15GB]
  • 44. Minimum Resource Allocation Example 3: MinResources Exceed Resources • Combined minResources of Production and Research exceed capacity • Minimum resources are assigned proportionally based on defined minResources until available resources are exhausted • No memory remains for pools without minResources defined (i.e., Bob) [Diagram: Total 30GB – Production (demand 100GB, minResources 50GB): 20GB; Research (demand 30GB, minResources 25GB): 10GB; Bob (demand 25GB): 0GB]
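To see where the 20GB/10GB split comes from, the declared minimums (50GB + 25GB = 75GB) are scaled down to the 30GB that actually exists:

    Production: 30GB × 50/75 = 20GB
    Research:   30GB × 25/75 = 10GB
    Bob:        0GB (no minResources defined, and nothing is left over)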
  • 45. Minimum Resource Allocation Example 4: MinResources < Fair Share • Production is filled to minResources • Remaining 25GB is distributed across all pools • Production pool receives more than its minResources, to maintain fairness [Diagram: Total 30GB – Production (demand 100GB, minResources 5GB): 10GB; Alice (demand 30GB): 10GB; Bob (demand 25GB): 10GB]
  • 46. Pools with Weights • Instead of (or in addition to) setting minResources, pools can be assigned a weight • Pools with higher weight receive more resources during allocation • 'Even water glass height' analogy: – Think of the weight as controlling the 'width' of the glass
  • 47. Example: Pool with Double Weight • Production is filled to minResources (5GB) • Remaining 25GB is distributed across all pools • The Bob pool receives twice the amount of memory during fair share allocation [Diagram: Total 30GB – Production (demand 100GB, minResources 5GB): 8GB; Alice (demand 30GB): 8GB; Bob (demand 25GB, weight 2): 14GB]
  • 48. Dominant Resource Fairness • The Fair Scheduler can be configured to schedule with both memory and CPU using dominant resource fairness • Scenario #1: – Alice has 6GB and 3 cores, and Bob has 4GB and 2 cores – which pool receives the next resource allocation? • Bob will receive the next container because it has less memory and fewer CPU cores allocated than Alice [Diagram: Alice usage – 6GB, 3 cores; Bob usage – 4GB, 2 cores]
  • 49. Dominant Resource Fairness Example • Scenario #2: – A cluster has 10GB of total memory and 20 cores – Pool Alice has containers granted for 4GB of memory and 5 cores – Pool Bob has containers granted for 1GB of memory and 10 cores • Alice will receive the next container because its 40% dominant share of memory is less than the Bob pool's 50% dominant share of CPU [Diagram: Alice usage – 4GB (40% of capacity), 5 cores (25% of capacity); Bob usage – 1GB (10% of capacity), 10 cores (50% of capacity)]
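Spelling out the dominant shares behind that decision:

    Alice: memory 4/10 = 40%, CPU 5/20 = 25%  → dominant share = 40% (memory)
    Bob:   memory 1/10 = 10%, CPU 10/20 = 50% → dominant share = 50% (CPU)

Dominant resource fairness grants the next container to the pool with the smaller dominant share, which here is Alice.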
  • 50. Achieving Fair Share: The Patient Approach • If shares are imbalanced, pools which are over their fair share may not assign new tasks when their old ones complete – Those resources then become available to pools which are operating below their fair share • However, waiting patiently for a task in another pool to finish may not be acceptable in a production environment – Tasks could take a long time to complete
  • 51. Achieving Fair Share: The Brute Force Approach • With preemption enabled, the Fair Scheduler actively kills tasks that belong to pools operating over their fair share – Pools operating below fair share receive those reaped resources • There are two types of preemption available – Minimum share preemption – Fair share preemption • The preemption code avoids killing a task in a pool if it would cause that pool to begin preempting tasks in other pools – This prevents a potentially endless cycle of pools killing one another's tasks
  • 52. Minimum Share Preemption • Pools with a minResources configured are operating on an SLA (Service Level Agreement) • Pools that are below their minimum share as defined by minResources can preempt tasks in other pools – Set minSharePreemptionTimeout to the number of seconds the pool is under its minimum share before preemption should begin – Default is infinite (Java's Long.MAX_VALUE)
  • 53. Fair Share Preemption • Pools not receiving their fair share can preempt tasks in other pools – Only pools that exceed their fair share are candidates for preemption • Use fair share preemption conservatively – Set fairSharePreemptionTimeout to the number of seconds a pool is under fair share before preemption should begin – Default is infinite (Java's Long.MAX_VALUE)
  • 54. Agenda •  Why  YARN?   •  YARN  Architecture  and  Concepts   •  Resources  &  Scheduling   –  Capacity  Scheduler   –  Fair  Scheduler   •  Configuring  the  Fair  Scheduler   •  Managing  Running  Jobs  
  • 55. Configuring Fair Scheduler Capabilities (1) • yarn.scheduler.fair.allow-undeclared-pools (yarn-site.xml) – When true, new pools can be created at application submission time or by the user-as-default-queue property. When false, submitting to a pool that is not specified in the fair-scheduler.xml file causes the application to be placed in the "default" pool. Default: true. Ignored if a pool placement policy is defined in the fair-scheduler.xml file. • yarn.scheduler.fair.preemption (yarn-site.xml) – Enables preemption in the Fair Scheduler. Set to true if you have pools that must operate on an SLA. Default: false. • yarn.scheduler.fair.user-as-default-queue (yarn-site.xml) – Send jobs to pools based on users' names instead of to the default pool, root.default. Default: true
  • 56. Configuring Fair Scheduler Capabilities (2) • yarn.scheduler.fair.locality.threshold.node and yarn.scheduler.fair.locality.threshold.rack (yarn-site.xml) – For applications that request containers on particular nodes or racks, the number of scheduling opportunities since the last container assignment to wait before accepting a placement on another node. Expressed as a float between 0 and 1 which, as a fraction of the cluster size, is the number of scheduling opportunities to pass up. Default: 1 (don't pass up any scheduling opportunities) • Example: yarn.scheduler.fair.locality.threshold.node = 0.02, cluster size = 100 nodes. At most 2 scheduling opportunities can be skipped when the preferred placement cannot be met.
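A yarn-site.xml sketch pulling the properties from these two slides together; the values shown are illustrative choices, not recommendations from the original deck:

    <property>
      <name>yarn.scheduler.fair.allow-undeclared-pools</name>
      <value>false</value>  <!-- only pools declared in fair-scheduler.xml; anything else lands in root.default -->
    </property>
    <property>
      <name>yarn.scheduler.fair.preemption</name>
      <value>true</value>   <!-- needed if any pool carries an SLA via minResources -->
    </property>
    <property>
      <name>yarn.scheduler.fair.user-as-default-queue</name>
      <value>true</value>   <!-- jobs default to a pool named after the submitting user -->
    </property>
    <property>
      <name>yarn.scheduler.fair.locality.threshold.node</name>
      <value>0.02</value>   <!-- on a 100-node cluster, skip at most 2 scheduling opportunities -->
    </property>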
  • 57. Configuring Resource Allocation for Pools and Users (1) • You configure Fair Scheduler pools in the /etc/hadoop/conf/fair-scheduler.xml file • The Fair Scheduler rereads this file every 10 seconds – A ResourceManager restart is not required when the file changes • The fair-scheduler.xml file must contain an <allocations> element • Use the <queue> element to configure resource allocation for a pool • Use the <user> element to configure resource allocation for a user across multiple pools
  • 58. Configuring Resource Allocation for Pools and Users (2) • To specify resource allocations, use the <queue> or <user> element with any or all of the following subelements – <minResources> • The minimum resources to which the pool is entitled • Format is x mb, y vcores • Example: 10000 mb, 5 vcores – <maxResources> • The maximum resources to which the pool is entitled • Format is x mb, y vcores
  • 59. Configuring Resource Allocation for Pools and Users (3) • Additional sub-elements of <queue> or <user> to use when specifying resource allocations – <maxRunningApps> • The maximum number of applications in the pool that can run concurrently – <weight> • Used for non-proportional sharing with other pools • The default is 1 – <minSharePreemptionTimeout> • Time to wait before preempting tasks – <schedulingPolicy> • SRF for single resource fairness (the default) • DRF for dominant resource fairness • FIFO for first-in, first-out
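Putting those sub-elements together, a sketch of a complete allocation file; the pool and user names are invented for illustration, and the concrete examples from the original slides follow in the next entries:

    <?xml version="1.0"?>
    <allocations>
      <queue name="production">
        <minResources>20000 mb, 10 vcores</minResources>
        <maxResources>60000 mb, 30 vcores</maxResources>
        <maxRunningApps>20</maxRunningApps>
        <weight>2.0</weight>
        <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
      </queue>
      <user name="alice">
        <maxRunningApps>5</maxRunningApps>
      </user>
      <userMaxAppsDefault>3</userMaxAppsDefault>
    </allocations>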
  • 60. fair-scheduler.xml Example (1) • Allow users to run three jobs, but allow Bob to run six jobs
    <?xml version="1.0"?>
    <allocations>
      <userMaxAppsDefault>3</userMaxAppsDefault>
      <user name="bob">
        <maxRunningApps>6</maxRunningApps>
      </user>
    </allocations>
  • 61. fair-scheduler.xml Example (2) • Add a fair share timeout
    <?xml version="1.0"?>
    <allocations>
      <userMaxAppsDefault>3</userMaxAppsDefault>
      <user name="bob">
        <maxRunningApps>6</maxRunningApps>
      </user>
      <fairSharePreemptionTimeout>300</fairSharePreemptionTimeout>
    </allocations>
  • 62. fair-scheduler.xml Example (3) • Define the production pool with a weight of 2 and a resource allocation of 10000 MB and 1 core
    <?xml version="1.0"?>
    <allocations>
      <userMaxAppsDefault>3</userMaxAppsDefault>
      <queue name="production">
        <minResources>10000 mb, 1 vcores</minResources>
        <weight>2.0</weight>
      </queue>
    </allocations>
  • 63. fair-scheduler.xml Example (4) • Add an SLA to the production pool
    <?xml version="1.0"?>
    <allocations>
      <userMaxAppsDefault>3</userMaxAppsDefault>
      <queue name="production">
        <minResources>10000 mb, 1 vcores</minResources>
        <weight>2.0</weight>
        <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
      </queue>
    </allocations>
  • 64. The Fair Scheduler User Interface • http://<resource_manager_host>:8088/cluster/scheduler
  • 65. Agenda •  Why  YARN?   •  YARN  Architecture  and  Concepts   •  Resources  &  Scheduling   –  Capacity  Scheduler   –  Fair  Scheduler   •  Configuring  the  Fair  Scheduler   •  Managing  Running  Jobs  
  • 66. Displaying Jobs • To view jobs currently running on the cluster – yarn application -list – Lists all running jobs, including the application ID for each • To view all jobs on the cluster, including completed jobs – yarn application -list -appStates ALL • To display the status of an individual job – yarn application -status <application_ID> • You can also use the ResourceManager Web UI, Hue, Ambari, or Cloudera Manager to display jobs
  • 67. Killing Jobs • It is important to note that once a user has submitted a job, they cannot stop it just by hitting Ctrl-C in their terminal – That only stops job output from appearing on the user's console – The job is still running on the cluster! • To kill a job running on the cluster – yarn application -kill <application_ID> • You can also kill a job from Cloudera Manager
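A short session tying the job-management commands together; the application ID is a made-up placeholder in the standard application_<clusterTimestamp>_<sequence> format:

    $ yarn application -list
    $ yarn application -status application_1417650828052_0003
    $ yarn application -kill   application_1417650828052_0003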
  • 68.