Hadoop Operations at LinkedIn

Grid Operations

Allen Wittenauer
Grid Computing Architect

©2013 LinkedIn Corporation. All Rights Reserved.

Wednesday, March 20, 2013

“Hadoop is not a developer problem;
it’s an operations problem.”
-- Hadoop vendor ex-employee



§ August 2009
– 20 Nodes in 1 grid
– Apache Hadoop 0.20.0
– No configuration management
– No monitoring
– No security
– Free for all, including random mafia hits on running jobs
– FIFO Scheduling
– ~20 users
– 20 tasks per node
– Solaris

– No operational support



How We Fixed This
(In Chronological Order)



Year One



§ Dropped task count
– 10 mappers => 7 mappers
– 10 reducers => 5 reducers

§ Reworked ETL
– hourlies => dailies
– Re-ordered to take advantage of compression
   § 10x storage improvement
– Sample impact on one job (not workflow!):
   § 80,000 map tasks => 2,000 map tasks
   § Run time cut in half
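
MapReduce schedules one map task per input split, so many small hourly files tend to produce many short tasks. The sketch below illustrates that effect under simplifying assumptions: one split per block-sized chunk of each file, a 128 MB block size, and made-up file sizes. None of these figures come from the slides, and `estimate_map_tasks` is a hypothetical helper.

```python
import math


def estimate_map_tasks(file_sizes_mb, block_size_mb=128):
    """Estimate map tasks as one task per block-sized split of each file.

    Assumption: splittable input and one split per block. Real input
    formats and compression codecs can behave differently.
    """
    return sum(max(1, math.ceil(size / block_size_mb)) for size in file_sizes_mb)


# Hypothetical data: 24 hourly files of 10 MB each vs. one 240 MB daily file.
hourly_files = [10] * 24
daily_file = [240]

print("hourly layout:", estimate_map_tasks(hourly_files), "map tasks")
print("daily layout: ", estimate_map_tasks(daily_file), "map tasks")
```

Fewer, larger inputs mean each task spends more of its time on real work and less on startup overhead.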

§ Optimized workflows/culture shift
– More time per task, fewer tasks
– Production review to reinforce good behavio(u)r



§ Switched to Capacity Scheduler
– FIFO is terrible
– Fair Share only viable for small tasks
– Enforced SLAs via custom patch

Queues:
5% ETL Tasks
15% Fast Queue:
- Task Time < 15 Minutes
- Job Time < 1 Hour
- Slot stealing from "Slow" Queue
80% Slow Queue:
- Job Time < 24 Hours
- Up to 80% of slots

§ Submitted Jar Size Limit
– Encourage distributed cache usage
– Enforced limit via custom patch
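
The queue limits can be read as a simple routing rule. The sketch below assumes the figures mean 5% of slots for ETL, 15% for the fast queue, and 80% for the slow queue, and that a job is routed by its expected task and job times. The actual enforcement was a custom scheduler patch that the slides do not show, so `pick_queue` and `guaranteed_slots` are hypothetical names used for illustration only.

```python
# Slot shares in whole percent, as read from the slide (assumed interpretation).
QUEUE_SHARE_PCT = {"etl": 5, "fast": 15, "slow": 80}

FAST_TASK_LIMIT_MIN = 15       # Fast Queue: task time < 15 minutes
FAST_JOB_LIMIT_MIN = 60        # Fast Queue: job time < 1 hour
SLOW_JOB_LIMIT_MIN = 24 * 60   # Slow Queue: job time < 24 hours


def pick_queue(expected_task_min, expected_job_min):
    """Return the queue a non-ETL job would fit, or None if it fits neither."""
    if expected_task_min < FAST_TASK_LIMIT_MIN and expected_job_min < FAST_JOB_LIMIT_MIN:
        return "fast"
    if expected_job_min < SLOW_JOB_LIMIT_MIN:
        return "slow"
    return None


def guaranteed_slots(total_slots):
    """Split a slot count across queues using integer percent shares."""
    return {queue: total_slots * pct // 100 for queue, pct in QUEUE_SHARE_PCT.items()}


print(pick_queue(expected_task_min=10, expected_job_min=30))
print(pick_queue(expected_task_min=20, expected_job_min=30))
print(guaranteed_slots(1000))
```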



§ Benchmarking
– Use production code, not TeraSort!

Old Node:            New Node:
- 2 Rack Units       - 1 Rack Unit
- 2 CPUs             - 2 CPUs
- 16 GB              - 24 or 32 GB
- 8 x 1 TB SATA      - 6 x 2 TB SATA
- 1 x 2 Gb NIC       - 1 x 1 Gb NIC

§ Cut cost per unit in half
§ 2x nodes per rack
§ Extra RAM
– buffering
– bus speed
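
The density gain can be checked with a little arithmetic on the node specs. This is a minimal sketch that uses only the rack-unit and disk figures from the comparison above; the `Node` class and its property names are illustrative, not part of any LinkedIn tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    rack_units: int
    disks: int
    disk_tb: int

    @property
    def raw_tb(self):
        """Raw (pre-replication) disk capacity of one node in TB."""
        return self.disks * self.disk_tb

    @property
    def raw_tb_per_rack_unit(self):
        """Raw capacity per rack unit of space consumed."""
        return self.raw_tb / self.rack_units


old_node = Node(rack_units=2, disks=8, disk_tb=1)
new_node = Node(rack_units=1, disks=6, disk_tb=2)

for name, node in (("old", old_node), ("new", new_node)):
    print(f"{name}: {node.raw_tb} TB raw, {node.raw_tb_per_rack_unit} TB per rack unit")
```

Halving the rack units per node is what yields twice the nodes per rack, and the larger disks raise raw capacity per node at the same time.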



Year Two



§ DataNode disk partitioning
– Separate file systems for different purposes

Disk 1:  [ 20 GB: /, ... ]  [ 200 GB: MR ]  [ HDFS ]
...
Disk N:  [ 5 GB: Swap ]     [ 200 GB: MR ]  [ HDFS ]

– Mount options: noatime, commit=30, data=writeback

§ NN, JT, etc
– No “special hardware” == use SW RAID



[Diagram: two LDAP Master + KDC Master pairs linked by multi-master replication. Each pair feeds LDAP/KDC slaves, which serve username/uid, group name/gid, netgroup, and sudoers data to client nodes through nscd.]



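[Diagram: bcfg2 workflow]
Host (bcfg2 client) reports its groups (Group1, Group2, ...) to the bcfg2 server.
Server maps groups to services:
   Group1 -> Svc1, Svc2, ...
   Group2 -> Svc1, Svc3, ...
   Group3 -> Svc4, Svc5, ...
Server returns Svc1 + Svc2 + Svc3 content to the client.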

§ Service Bundle
– RPMs, config files, etc
– Conflict resolution
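
The group-to-service mapping can be sketched as a lookup that merges the services of every group a host belongs to. This is an illustration of the idea only: it is not bcfg2's API or file format, the mapping lists only the services named on the slide (the elided entries are left out), and `services_for_host` is a hypothetical helper.

```python
# Group -> services, limited to the entries named on the slide.
GROUP_SERVICES = {
    "Group1": ["Svc1", "Svc2"],
    "Group2": ["Svc1", "Svc3"],
    "Group3": ["Svc4", "Svc5"],
}


def services_for_host(groups):
    """Return the de-duplicated services for a host, in first-seen order."""
    resolved = []
    for group in groups:
        for service in GROUP_SERVICES.get(group, []):
            if service not in resolved:
                resolved.append(service)
    return resolved


# A host in Group1 and Group2 receives Svc1 only once.
print(services_for_host(["Group1", "Group2"]))
```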



§ Different RPM names + different install locations = pre-deploy-ability:

Object                      RPM Name                              File Path
Hadoop 1.0.4-p3 Binaries    hadoop-1043-bin-1.0.4-3               /dir/hadoop-1.0.4-p3
Grid Config for 1.0.4-p3    gridname-1043-hadoopconf-1.0.4.3-1    /dir/grid-conf-1.0.4-p3
Hadoop 1.1.2-p1 Binaries    hadoop-1121-bin-1.1.2.1-1             /dir/hadoop-1.1.2-p1
Grid Config for 1.1.2-p1    gridname-1043-hadoopconf-1.0.4.3-1    /dir/grid-conf-1.1.2-p1
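
Because each release installs into its own versioned directory, a new release can sit on disk next to the running one without overwriting any of its files. The sketch below derives the install paths from a version and patch level, following the path pattern in the table. `/dir` is the slide's own placeholder root, and the helper names are hypothetical.

```python
def hadoop_dir(version, patch, root="/dir"):
    """Install path for Hadoop binaries, e.g. /dir/hadoop-1.0.4-p3."""
    return f"{root}/hadoop-{version}-p{patch}"


def grid_conf_dir(version, patch, root="/dir"):
    """Install path for the matching grid configuration."""
    return f"{root}/grid-conf-{version}-p{patch}"


running = ("1.0.4", 3)   # release currently in use
staged = ("1.1.2", 1)    # release pre-deployed ahead of the upgrade

paths = [
    hadoop_dir(*running),
    grid_conf_dir(*running),
    hadoop_dir(*staged),
    grid_conf_dir(*staged),
]

# No two packages share a directory, so installing one never clobbers another.
assert len(set(paths)) == len(paths)
for path in paths:
    print(path)
```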



Year Three+



[Diagram: Kerberos setup spanning two realms, Corp IT's Active Directory (@CORP) and the Grid Realm (@GRID), with Hadoop services in the Grid Realm. Labels shown: Password, krbtgt/user@CORP, krbtgt/GRID@CORP, krbtgt/host@GRID, krbtgt/service@GRID.]



Many months moving to secure Apache Hadoop...



§ March 2013
– 5000 Nodes in ~10 grids
– Apache Hadoop 1.0.4 + custom patches
– Full configuration management
– Full monitoring
– Security
– Capacity scheduler with SLA
– ~700 users
– 12 tasks per node
– Linux

– Five dedicated operations staff members



Future Work



Is ‘pure Hadoop’ the right
tool for all of our workloads?



[Diagram: YARN and PBS alongside HDFS and CEPH]



§ More on LinkedIn Hadoop Performance:
– http://www.slideshare.net/allenwittenauer/2012-lihadoopperf

§ LinkedIn Data Analytics:
– http://data.linkedin.com/


