Hadoop bangalore-meetup-dec-2011-hadoop nextgen

Hadoop NextGen/YARN/MRv2

Presentation Transcript

• Hadoop NextGen/MRv2/YARN
  Sharad Agarwal, sharad@apache.org
• About me
  • Apache Foundation
    – Hadoop Committer and PMC member
    – Hadoop MR contributor for ~4 years
    – Author of Hadoop NextGen core
  • Head of Technology Platforms @InMobi
    – Formerly Architect @Yahoo!
• Hadoop Map-Reduce Today
  • JobTracker
    – Manages cluster resources and job scheduling
  • TaskTracker
    – Per-node agent
    – Manages tasks
• Current Limitations
  • Scalability
    – Maximum cluster size: 4,000 nodes
    – Maximum concurrent tasks: 40,000
    – Coarse synchronization in JobTracker
  • Single point of failure
    – Failure kills all queued and running jobs
    – Jobs need to be re-submitted by users
  • Restart is very tricky due to complex state
  • Hard partition of resources into map and reduce slots
• Current Limitations
  • Lacks support for alternate paradigms
    – Iterative applications implemented using Map-Reduce are 10x slower
    – Examples: K-Means, PageRank
  • Lack of wire-compatible protocols
    – Client and cluster must be of the same version
    – Applications and workflows cannot migrate to different clusters
• Next Generation Map-Reduce: Requirements
  • Reliability
  • Availability
  • Scalability: clusters of 6,000 machines
    – Each machine with 16 cores, 48 GB RAM, 24 TB of disk
    – 100,000 concurrent tasks
    – 10,000 concurrent jobs
  • Wire Compatibility
  • Agility & Evolution: ability for customers to control upgrades to the grid software stack
• Next Generation Map-Reduce: Architecture
  • Split up the two major functions of the JobTracker
    – Cluster resource management
    – Application life-cycle management
  • Map-Reduce becomes a user-land library (see the sketch below)
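
To make "user-land library" concrete, here is a minimal word-count driver sketch: the job code is ordinary MapReduce, and the only YARN-specific piece is selecting the runtime via mapreduce.framework.name. The class names are placeholders, and the Job.getInstance call follows the later Hadoop 2.x API rather than the exact 0.23 interfaces.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountOnYarn {

      // Placeholder mapper: emits (word, 1) for every token in a line.
      public static class TokenizerMapper
          extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
            throws IOException, InterruptedException {
          for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
              word.set(token);
              ctx.write(word, ONE);
            }
          }
        }
      }

      // Placeholder reducer: sums the counts for each word.
      public static class SumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable v : values) sum += v.get();
          ctx.write(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The only YARN-specific line: use the YARN runtime instead of the classic JobTracker.
        conf.set("mapreduce.framework.name", "yarn");
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountOnYarn.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }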
• Architecture
  [Diagram: Clients submit jobs to the Resource Manager; per-node Node Managers host each application's App Mstr and its Containers. Arrows show Job Submission, Node Status, Resource Request, and MapReduce Status flows.]
• Architecture
  • Resource Manager
    – Global resource scheduler
    – Hierarchical queues
  • Node Manager
    – Per-machine agent
    – Manages the life-cycle of containers
    – Container resource monitoring
  • Application Master
    – Per-application
    – Manages application scheduling and task execution
    – e.g. the Map-Reduce Application Master
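
A sketch of how a client talks to this architecture, assuming the YarnClient API from the later Hadoop 2.x client libraries (not the exact 0.23 interfaces): the client asks the Resource Manager for an application id and submits a container spec for the Application Master; from then on scheduling is between the AM, the RM, and the Node Managers. The application name and the /bin/sleep command are placeholders.

    import java.util.Collections;
    import org.apache.hadoop.yarn.api.records.ApplicationId;
    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;
    import org.apache.hadoop.yarn.util.Records;

    public class SubmitToYarn {
      public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        // Ask the Resource Manager for a new application id.
        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
        appContext.setApplicationName("demo-app");

        // Launch spec for the Application Master container (placeholder command).
        ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
        amContainer.setCommands(Collections.singletonList("/bin/sleep 60"));
        appContext.setAMContainerSpec(amContainer);

        // Resources the Application Master container itself needs.
        Resource amResource = Records.newRecord(Resource.class);
        amResource.setMemory(512);
        appContext.setResource(amResource);

        ApplicationId appId = yarnClient.submitApplication(appContext);
        System.out.println("Submitted application " + appId);
      }
    }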
• Improvements vis-à-vis current Map-Reduce
  • Scalability
    – Application life-cycle management is very expensive
    – Partition resource management and application life-cycle management
    – Application management is distributed
    – Hardware trends: currently run clusters of 4,000 machines
      • 6,000 machines (2012 hardware) > 12,000 machines (2009 hardware)
      • <8 cores, 16 GB, 4 TB> vs. <16+ cores, 48/96 GB, 24 TB>
• Improvements vis-à-vis current Map-Reduce
  • Availability
    – Application Master
      • Optional failover via application-specific checkpoint
      • Map-Reduce applications pick up where they left off
    – Resource Manager
      • No single point of failure: failover via ZooKeeper
      • Application Masters are restarted automatically
• Improvements vis-à-vis current Map-Reduce
  • Wire Compatibility
    – Protocols are wire-compatible
    – Old clients can talk to new servers
    – Rolling upgrades
• Improvements vis-à-vis current Map-Reduce
  • Agility / Evolution
    – Map-Reduce now becomes a user-land library
    – Multiple versions of Map-Reduce can run in the same cluster (à la Apache Pig)
      • Faster deployment cycles for improvements
    – Customers upgrade Map-Reduce versions on their own schedule (see the sketch below)
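
One way to read "multiple versions in the same cluster": a job can point at its own MapReduce framework build staged in HDFS. The sketch below uses the mapreduce.application.framework.path and mapreduce.application.classpath properties from later Hadoop 2.x releases (an assumption relative to the 0.23 timeframe), and the HDFS path is hypothetical.

    import org.apache.hadoop.conf.Configuration;

    public class PerJobMrVersion {
      public static Configuration withOwnFramework() {
        Configuration conf = new Configuration();
        // Run on YARN rather than the classic JobTracker runtime.
        conf.set("mapreduce.framework.name", "yarn");
        // Ship a specific MapReduce framework tarball with this job, so other jobs
        // on the same cluster can use a different MR version on their own schedule.
        conf.set("mapreduce.application.framework.path",
            "hdfs:///frameworks/mapreduce-2.x.tar.gz#mrframework");
        conf.set("mapreduce.application.classpath",
            "$PWD/mrframework/share/hadoop/mapreduce/*,"
            + "$PWD/mrframework/share/hadoop/mapreduce/lib/*");
        return conf;
      }
    }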
• Improvements vis-à-vis current Map-Reduce
  • Utilization
    – Generic resource model (see the sketch below)
      • Memory
      • CPU
      • Disk bandwidth
      • Network bandwidth
    – Removes the fixed partition of map and reduce slots
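
A sketch of a container ask against the generic resource model, using the AMRMClient ContainerRequest from the later Hadoop 2.x client libraries (the initial model was essentially memory-only; virtual cores and other resources came later). No map or reduce slots appear anywhere: the scheduler just sees a capability and a priority.

    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.util.Records;

    public class GenericResourceAsk {
      public static ContainerRequest build() {
        // Capability is just a bag of resources, not a "map slot" or "reduce slot".
        Resource capability = Records.newRecord(Resource.class);
        capability.setMemory(2048);     // MB of RAM for the container
        capability.setVirtualCores(2);  // CPU share (added after the memory-only model)

        Priority priority = Records.newRecord(Priority.class);
        priority.setPriority(1);

        // No node or rack constraints: any machine with the capacity will do.
        return new ContainerRequest(capability, null, null, priority);
      }
    }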
• Improvements vis-à-vis current Map-Reduce
  • Support for programming paradigms other than Map-Reduce
    – MPI
    – Master-Worker
    – Machine Learning
    – Iterative processing
    – Enabled by allowing the use of a paradigm-specific Application Master (see the sketch below)
    – All run on the same Hadoop cluster
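
A rough sketch of what a paradigm-specific Application Master might look like, again assuming the later Hadoop 2.x AMRMClient API: register with the Resource Manager, ask for worker containers with the same ContainerRequest pattern as above, and pick up grants from the allocate heartbeat. The container count and sizes are placeholders, and launching the workers (via an NMClient) is omitted.

    import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;
    import org.apache.hadoop.yarn.util.Records;

    public class MasterWorkerAppMaster {
      public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(new YarnConfiguration());
        rmClient.start();

        // Register with the Resource Manager; this process now owns scheduling
        // for its own application, whatever the programming paradigm is.
        rmClient.registerApplicationMaster("", 0, "");

        // Ask for a handful of worker containers (master-worker, not map/reduce slots).
        int wanted = 4;
        Resource workerSize = Records.newRecord(Resource.class);
        workerSize.setMemory(1024);
        Priority priority = Records.newRecord(Priority.class);
        priority.setPriority(0);
        for (int i = 0; i < wanted; i++) {
          rmClient.addContainerRequest(new ContainerRequest(workerSize, null, null, priority));
        }

        // Heartbeat loop: allocate() reports progress and returns newly granted containers.
        int granted = 0;
        while (granted < wanted) {
          AllocateResponse response = rmClient.allocate((float) granted / wanted);
          for (Container c : response.getAllocatedContainers()) {
            granted++;
            // A real AM would start its worker process in this container via an NMClient.
            System.out.println("Got container " + c.getId() + " on " + c.getNodeId());
          }
          Thread.sleep(1000);
        }

        rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "done", "");
      }
    }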
• Summary
  • The next generation of Map-Reduce takes Hadoop to the next level
    – Scale out even further
    – High availability
    – Cluster utilization
    – Support for paradigms other than Map-Reduce
• Status
  • Apache Hadoop 0.23 release is out
    – HDFS Federation
    – MRv2
  • Currently undergoing tests at small scale (~500 nodes)
  • Alpha
    – 2,000 nodes
    – Q1 2012
  • Beta/Production
    – Variety of applications and loads
    – 4,000+ nodes
    – Q2 2012
• Questions?
  Follow me on Twitter: @sharad_ag