Improving Hadoop Cluster Performance via Linux Configuration

DevIgnition 2014, Dulles, Virginia

Alex Moundalexis // @technmsg

© Cloudera, Inc. All rights reserved.

Tips from a former system administrator

Been there, done that. (Photo: CC BY 2.0 / Richard Bumgardner)

Tips from a former system administrator turned field guy

Home sweet home. (Photo: CC BY 2.0 / Alex Moundalexis)

Tips

Easy steps to take… that most people don’t.

What this talk isn’t about

•  Deploying
    • Puppet, Chef, Ansible, homegrown scripts, intern labor
•  Sizing & Tuning
    • Depends heavily on data and workload
•  Coding
    • Unless you count STDOUT redirection
•  Algorithms
    • I suck at math, but we’ll try some multiplication later

“The answer to most Hadoop questions is… it depends.”

(helpful, right?)

So what ARE we talking about?

•  Seven simple things
    • Quick
    • Safe
    • Viable for most environments and use cases
•  Identify issue, then offer solution
•  Note: Commands run as root or sudo

1. Swapping

Bad news, best not to.

Swapping

•  A form of memory management
•  When OS runs low on memory…
    • write blocks to disk
    • use now-free memory for other things
    • read blocks back into memory from disk when needed
•  Also known as paging

Swapping

•  Problem: Disks are slow, especially to seek
•  Hadoop is about maximizing IO
    • spend less time acquiring data
    • operate on data in place
    • large streaming reads/writes from disk
•  Memory usage is somewhat limited within JVM
    • we should be able to manage our memory
    • account for JVM overhead

Limit swapping in kernel

•  Well, as much as possible.
•  Immediate:
    # echo 1 > /proc/sys/vm/swappiness
•  Persist after reboot:
    # echo "vm.swappiness = 1" >> /etc/sysctl.conf
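
Running the append command more than once leaves duplicate entries in /etc/sysctl.conf. A minimal sketch of an idempotent variant follows; the helper name persist_sysctl is hypothetical (not from the talk), and it is demonstrated on a scratch file rather than the real config.

```shell
#!/bin/sh
# persist_sysctl FILE KEY VALUE
# Hypothetical helper: drop any existing "KEY = ..." line in FILE, then
# append the new setting, so repeated runs never duplicate the entry.
persist_sysctl() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        grep -v "^[[:space:]]*$key[[:space:]]*=" "$file" > "$tmp"
    fi
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Show the value the running kernel is using, if readable.
if [ -r /proc/sys/vm/swappiness ]; then
    echo "current vm.swappiness: $(cat /proc/sys/vm/swappiness)"
fi

# Demonstrate on a scratch copy instead of /etc/sysctl.conf.
scratch=$(mktemp)
printf 'vm.swappiness = 60\n' > "$scratch"
persist_sysctl "$scratch" vm.swappiness 1
persist_sysctl "$scratch" vm.swappiness 1
cat "$scratch"
rm -f "$scratch"
```

On a real node this would be run as root against /etc/sysctl.conf; `sysctl -p` reloads that file without a reboot.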

Swapping peculiarities

•  Behavior of vm.swappiness = 0 varies based on Linux kernel
    • CentOS 6.4+ / Ubuntu 10.10+
    • For you kernel gurus, that’s Linux 2.6.32-303+
•  Prior
    • We don’t swap, except to avoid OOM condition.
•  After
    • We don’t swap, ever.
•  Details: http://tiny.cloudera.com/noswap

2. File Access Time

Disable this too.

File access time

•  Linux tracks access time
    • writes to disk even if all you did was read
•  Problem
    • more disk seeks
    • HDFS is write-once, read-many
    • NameNode tracks access information for HDFS

Don’t track access time

•  Mount volumes with noatime option
    • In /etc/fstab:
      /dev/sdc /data01 ext3 defaults,noatime 0
    • Note: noatime implies nodiratime as well
•  What about relatime?
    • Faster than atime but slower than noatime
•  No reboot required
    • # mount -o remount /data01
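
After remounting, it is worth confirming which access-time mode a volume actually ended up with. The sketch below parses /proc/mounts-style lines; the function name atime_mode is hypothetical, and /data01 is simply the mount point from the fstab example.

```shell
#!/bin/sh
# atime_mode MOUNTPOINT
# Hypothetical helper: reads /proc/mounts-style lines on stdin and prints
# "noatime", "relatime", or "atime" for the given mount point.
atime_mode() {
    awk -v mp="$1" '
        $2 == mp {
            mode = "atime"
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "noatime" || opts[i] == "relatime")
                    mode = opts[i]
            print mode
        }
    '
}

# Prints nothing if /data01 is not mounted on this machine.
if [ -r /proc/mounts ]; then
    atime_mode /data01 < /proc/mounts
fi
```

A data disk that still reports relatime or atime most likely has not been remounted since /etc/fstab was changed.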

3. Root Reserved Space

Reclaim it, impress your bosses!

Root reserved space

•  EXT3/4 reserve 5% of disk for root-owned files
    • On an OS disk, sure
    • System logs, kernel panics, etc.

Disks used to be much smaller, right? (Photo: CC BY 2.0 / Alex Moundalexis)

Do the math

•  Conservative
    • 5% of 1 TB disk = 46 GB
    • 5 data disks per server = 230 GB
    • 5 servers per rack = 1.15 TB
•  Quasi-Aggressive
    • 5% of 4 TB disk = 186 GB
    • 12 data disks per server = 2.23 TB
    • 18 servers per rack = 40.1 TB
•  That’s a LOT of unused storage!
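
The multiplication can be reproduced with shell integer arithmetic. This sketch assumes the slide treats a "1 TB" disk as 10^12 bytes and reports the reserve in binary gigabytes, truncated; that reading is an assumption, but it matches the figures above. The function names are hypothetical.

```shell
#!/bin/sh
# reserved_gib SIZE_TB PERCENT
# Hypothetical helper: reserved space on one disk, in GiB (truncated).
reserved_gib() {
    echo $(( $1 * 1000000000000 * $2 / 100 / 1073741824 ))
}

# rack_reserved_gib PER_DISK_GIB DISKS_PER_SERVER SERVERS_PER_RACK
rack_reserved_gib() {
    echo $(( $1 * $2 * $3 ))
}

per_disk=$(reserved_gib 1 5)
echo "conservative: $per_disk GB per disk, $(rack_reserved_gib "$per_disk" 5 5) GB per rack"

per_disk=$(reserved_gib 4 5)
echo "quasi-aggressive: $per_disk GB per disk, $(rack_reserved_gib "$per_disk" 12 18) GB per rack"
```

The per-rack totals come out to 1150 GB and 40176 GB, which are the 1.15 TB and 40.1 TB quoted above.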

Root reserved space

•  On a Hadoop data disk, no root-owned files
•  When creating a partition
    # mkfs.ext3 -m 0 /dev/sdc
•  On existing partitions
    # tune2fs -m 0 /dev/sdc
    • 0 is safe, 1 is for the ultra-paranoid
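
To see how much is currently reserved on a disk, the "Block count" and "Reserved block count" fields printed by `tune2fs -l` can be compared. A minimal sketch, with a hypothetical helper name and made-up sample numbers standing in for real output:

```shell
#!/bin/sh
# reserved_pct
# Hypothetical helper: reads `tune2fs -l` output on stdin and prints the
# reserved share of the filesystem as a percentage.
reserved_pct() {
    awk -F: '
        /^Block count:/          { total = $2 + 0 }
        /^Reserved block count:/ { reserved = $2 + 0 }
        END { if (total > 0) printf "%.1f\n", reserved * 100 / total }
    '
}

# Illustrative input only; on a real node, run as root:
#   tune2fs -l /dev/sdc | reserved_pct
printf 'Block count:              244190000\nReserved block count:     12209500\n' | reserved_pct
```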

4. Name Service Cache

Turn it on, already!

Name Service Cache Daemon

•  Daemon that caches name service requests
    • Passwords
    • Groups
    • Hosts
•  Helps weather network hiccups
•  Helps more with high latency LDAP, NIS, NIS+
•  Small footprint
•  Zero configuration required

Name Service Cache Daemon

•  Hadoop nodes
    • largely a network-based application
    • on the network constantly
    • issue lots of name lookups, especially HBase & distcp
    • can thrash name servers
•  Reducing latency of service requests? Smart.
•  Reducing impact on shared infrastructure? Smart.

Name Service Cache Daemon

•  Turn it on, let it work, leave it alone:
    # chkconfig --level 345 nscd on
    # service nscd start
•  Check on it later:
    # nscd -g
•  Unless using Red Hat SSSD; modify nscd config first!
    • Don’t use nscd to cache passwd, group, or netgroup
    • Red Hat, Using NSCD with SSSD. http://goo.gl/68HTMQ

5. File Handle Limits

Not a problem, until they are.

File handle limits

•  Kernel refers to files via a handle
    • Also called descriptors
•  Linux is a multi-user system
•  File handle limits protect the system from
    • Poor coding
    • Malicious users
    • Poor coding of malicious users
    • Pictures of cats on the Internet

java.io.FileNotFoundException: (Too many open files)

(Image: Microsoft Office EULA. Really.)

File handle limits

•  Linux defaults usually not enough
•  Increase maximum open files (default 1024)
    # echo hdfs - nofile 32768 >> /etc/security/limits.conf
    # echo mapred - nofile 32768 >>
    # echo hbase - nofile 32768 >>
•  Bonus: Increase maximum processes too
    # echo hdfs - nproc 32768 >>
    # echo mapred - nproc 32768 >>
    # echo hbase - nproc 32768 >>
•  Note: Cloudera Manager will do this for you.
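
The six entries above differ only in user and limit type, so they can be generated in a loop. A minimal sketch: the helper name limits_lines is hypothetical, while the user names and the 32768 value come from the slide.

```shell
#!/bin/sh
# limits_lines LIMIT USER...
# Hypothetical helper: prints limits.conf entries that raise both the
# open-file (nofile) and process (nproc) limits for each user. The "-"
# type sets the soft and hard limit together.
limits_lines() {
    limit=$1
    shift
    for user in "$@"; do
        printf '%s - nofile %s\n' "$user" "$limit"
        printf '%s - nproc %s\n' "$user" "$limit"
    done
}

# Open-file limit of the current shell, for comparison.
echo "current nofile limit: $(ulimit -n)"

# Preview the entries; nothing is written to disk here.
limits_lines 32768 hdfs mapred hbase
```

As root, the output could be appended to /etc/security/limits.conf. The new limits take effect in sessions started after the change, so the Hadoop daemons need a restart to pick them up.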

6. Dedicated Disks

Don’t be tempted to share, even with monster disks.

The Situation

1.  Your new server has a dozen 1 TB disks
2.  Eleven disks are used to store data
3.  One disk is used for the OS
    • 20 GB for the OS
    • 980 GB sits unused
4.  Someone asks “can we store data there too?”
5.  Seems reasonable, lots of space… “OK, why not.”

Sound familiar?

“I don’t understand it, there’s no consistency to these run times!”

(Image: Microsoft Office EULA. Really.)

No love for shared disk

•  Our quest for data gets interrupted a lot:
    • OS operations
    • OS logs
    • Hadoop logging, quite chatty
    • Hadoop execution
    • userspace execution
•  Disk seeks are slow, remember?

Dedicated disk for OS and logs

•  At install time
    • Disk 0, OS & logs
    • Disk 1-n, Hadoop data
•  After install, more complicated effort, requires manual HDFS block rebalancing:
    1.  Take down HDFS
        •  If you can do it in under 10 minutes, just the DataNode
    2.  Move or distribute blocks from disk0/dir to disk[1-n]/dir
    3.  Remove dir from HDFS config (dfs.data.dir)
    4.  Start HDFS

7. Name Resolution

Sane, both forward and reverse.

Name resolution options

1.  Hosts file, if you must
2.  DNS, much preferred

Name resolution with hosts file

•  Set canonical names properly
•  Right
    10.1.1.1 r01m01.cluster.org r01m01 master1
    10.1.1.2 r01w01.cluster.org r01w01 worker1
•  Wrong
    10.1.1.1 r01m01 r01m01.cluster.org master1
    10.1.1.2 r01w01 r01w01.cluster.org worker1
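
The "wrong" ordering is easy to miss by eye on a long hosts file. The sketch below flags entries whose first name is not fully qualified, using "contains a dot" as a rough test for an FQDN. The helper name noncanonical_hosts is hypothetical, and loopback and IPv6 lines are skipped to keep the check simple.

```shell
#!/bin/sh
# noncanonical_hosts FILE
# Hypothetical helper: prints "IP NAME" for every hosts-file entry whose
# first (canonical) name contains no dot, i.e. is probably a short name.
noncanonical_hosts() {
    awk '
        NF == 0 || $1 ~ /^#/    { next }   # blank lines and comments
        $1 ~ /^127\./ || $1 ~ /:/ { next } # loopback and IPv6 entries
        $2 !~ /\./              { print $1, $2 }
    ' "$1"
}

# Demonstrate on a scratch file built from the examples above.
scratch=$(mktemp)
printf '%s\n' \
    '127.0.0.1 localhost' \
    '10.1.1.1 r01m01.cluster.org r01m01 master1' \
    '10.1.1.2 r01w01 r01w01.cluster.org worker1' > "$scratch"
noncanonical_hosts "$scratch"
rm -f "$scratch"
```

Only the 10.1.1.2 line is reported here, since its short name comes before the FQDN. On a node, the same function could be pointed at /etc/hosts.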

Name resolution with hosts file

•  Set loopback address properly
    • Ensure 127.0.0.1 resolves to “localhost,” NOT hostname
•  Right
    127.0.0.1 localhost
•  Wrong
    127.0.0.1 r01m01

Name resolution errata

•  Mismatches? Expect odd results.
    • Problems starting DataNodes
    • Non-FQDN in Web UI links
    • Security features are extra sensitive to FQDN
•  Errors so common that link to FAQ is included in logs!
    • http://wiki.apache.org/hadoop/UnknownHost
•  Get name resolution working BEFORE enabling nscd!

Summary

1.  limit swapping (vm.swappiness = 1)
2.  data disks: mount with noatime option
3.  data disks: disable root reserved space
4.  enable nscd
5.  increase file handle limits
6.  use dedicated OS/logging disk
7.  sane name resolution

http://tiny.cloudera.com/7steps

Other things to check

•  Disk IO
    • hdparm
        •  # hdparm -Tt /dev/sdc
        •  Looking for at least 70 MB/s from 7200 RPM disks
        •  Slower could indicate a failing drive, disk controller, array, etc.
    • dd
        •  http://romanrm.ru/en/dd-benchmark

Other things to check

•  Disable Red Hat Transparent Huge Pages (RH6+ until 6.5)
    • Can reduce elevated CPU usage
    • In rc.local:
      echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
      echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
    • Reference: Linux 6 Transparent Huge Pages and Hadoop Workloads, http://goo.gl/WSF2qC
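
These sysfs files list every available mode and mark the active one with square brackets, for example "always madvise [never]". A minimal sketch for reading the active mode; the helper name thp_setting is hypothetical, and the second path is the one used by kernels without the Red Hat prefix.

```shell
#!/bin/sh
# thp_setting
# Hypothetical helper: reads a sysfs-style mode line on stdin and prints
# the bracketed (currently active) value.
thp_setting() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report whichever of the two known locations exists on this machine.
for f in /sys/kernel/mm/redhat_transparent_hugepage/enabled \
         /sys/kernel/mm/transparent_hugepage/enabled; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$f" "$(thp_setting < "$f")"
    fi
done
```

After the rc.local change has taken effect, the reported value should be never.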

Other things to check

•  Enable Jumbo Frames
    • Only if your network infrastructure supports it!
    • Can easily (and arguably) boost throughput by 10-20%
•  Monitor and Chart Everything
    • How else will you know what’s happening?
    • Nagios
    • Ganglia
