2. About us
Joep Rottinghuis
• Engineering Manager, Hadoop/HBase team @ Twitter
• Follow me: @joep
Jay Shenoy
• Software Engineer @ Twitter
• Hardware Engineer @ Twitter
• Engineering Manager, HW @ Twitter
• Follow me: @eecraft
Thanks to the HW & Hadoop teams @ Twitter, and many others
#HadoopSummit2013
@Twitter
3. Agenda
• Scale of Hadoop clusters
• Single versus multiple clusters
• Twitter Hadoop architecture
• Hardware investigations
• Results
4. Scale
• Scaling limits beyond raw node count:
• JobTracker: 10's of thousands of jobs per day; 10's of thousands of concurrent slots
• Namenode: 250-300 M objects in a single namespace
• Namenode @ ~100 GB heap -> full GC pauses
• Shipping job jars to 1,000's of nodes
• JobHistory server: a few 100's K job history/conf files
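As a back-of-the-envelope check on the namenode numbers above (the per-object heap cost is a common rule-of-thumb figure, not from the talk), the object count and the ~100 GB heap are consistent:

```python
# Rough namenode heap sizing. Each file/block object costs on the
# order of a few hundred bytes of heap; the exact figure and the
# overhead factor below are assumptions, not numbers from the talk.

def estimate_heap_gb(num_objects, bytes_per_object=300, overhead=1.2):
    """Rough namenode heap estimate in GB for a given object count."""
    return num_objects * bytes_per_object * overhead / 1e9

# 250-300 M objects lands in the ~100 GB region cited on the slide.
low = estimate_heap_gb(250e6)
high = estimate_heap_gb(300e6)
print(f"{low:.0f}-{high:.0f} GB")  # 90-108 GB
```

At heaps this size, a stop-the-world full GC can pause the namenode for long enough to matter, which is the failure mode the slide is pointing at.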
5. When / why to split clusters ?
• In principle, a preference for a single cluster:
• Common logs, shared free space, reduced admin burden, more rack diversity
• Varying SLA's
• Workload diversity:
• Storage intensive
• Processing (CPU / disk IO) intensive
• Network intensive
• Data access patterns:
• Hot, warm, cold
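The split criteria above can be sketched as a simple routing rule. The cluster names come from the recap slide (RT, PROC, DW, COLD, BACKUPS); the routing logic itself is illustrative, not Twitter's actual policy:

```python
# Illustrative mapping from coarse workload traits to a cluster,
# using the cluster names from the recap slide. A sketch only.

def pick_cluster(latency_sla, data_temperature, workload):
    """Choose a cluster from SLA, data temperature, and workload type."""
    if latency_sla == "realtime":
        return "RT"
    if data_temperature == "cold":
        return "BACKUPS" if workload == "backup" else "COLD"
    if workload == "processing":
        return "PROC"
    return "DW"  # default: the general data-warehouse cluster

print(pick_cluster("batch", "hot", "processing"))  # PROC
print(pick_cluster("batch", "cold", "storage"))    # COLD
```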
8. Service criteria for hardware
• Hadoop does not need live HDD swap
• Twitter DC: no SLA on data nodes
• Rack SLA: only 1 rack down at any time in a cluster
9. Baseline Hadoop Server (~ early 2012)
[Block diagram: dual E56xx CPUs with DIMMs, PCH, 1 GbE NIC, HBA + SAS expander]
Characteristics:
• Standard 2U server
• 20 servers / rack
• E5645 CPU, dual 6-core
• 72 GB memory
• 12 x 2 TB HDD
• 2 x 1 GbE
Works for the general cluster, but...
• Need more density for storage
• Potential IO bottlenecks
10. Hadoop Server: Possible evolution
[Block diagram: dual E5-26xx or E5-24xx CPUs with DIMMs, GbE NIC (10GbE?), HBA + SAS expander]
Characteristics:
+ CPU performance
? 20 servers / rack
? 16 x 2TB, 16 x 3TB, or 24 x 3TB HDD
• Candidate for DW
Can deploy into the general DW cluster, but...
• Too much CPU for storage-intensive apps
• Server failure domain too large if we scale up disks
11. Rethinking hardware evolution
• Debunking myths:
• Bigger is always better
• One size fits all
• Back to Hadoop hardware roots:
• Scale horizontally, not vertically
• Twitter Hadoop Server - “THS”
12. THS for backups
[Block diagram: single E3-12xx CPU with DIMMs, PCH, GbE NIC, SAS HBA]
Characteristics:
• E3-1230 V2 CPU
• 16 GB memory
• 12 x 3 TB HDD
• SSD boot
• 2 x 1 GbE
Storage focus:
• Cost efficient (single socket, 3 TB drives)
• Few fast cores
• Less memory needed
+ IO performance
13. THS variant for Hadoop-Proc and HBase
[Block diagram: single E3-12xx CPU with DIMMs, PCH, 10GbE NIC, SAS HBA]
Characteristics:
• E3-1230 V2 CPU
• 32 GB memory
• 12 x 1 TB HDD
• SSD boot
• 1 x 10 GbE
Processing / throughput focus:
• Cost efficient (single socket, 1 TB drives)
• Few fast cores
• More disk and network IO per socket
+ IO performance
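The "more disk and network IO per socket" claim follows directly from the specs on this slide and the baseline slide (baseline: dual socket, 12 HDDs, 2 x 1 GbE; THS proc: single socket, 12 HDDs, 1 x 10 GbE):

```python
# Per-socket IO comparison between the baseline server (slide 9)
# and the THS proc variant (slide 13), using only the slide specs.

baseline = {"sockets": 2, "hdds": 12, "net_gbps": 2 * 1}   # 2 x 1 GbE
ths_proc = {"sockets": 1, "hdds": 12, "net_gbps": 1 * 10}  # 1 x 10 GbE

def per_socket(server, key):
    """Normalize a server resource by its CPU socket count."""
    return server[key] / server["sockets"]

# Spindles per socket: 6 (baseline) vs 12 (THS) -> 2x the disk IO.
print(per_socket(baseline, "hdds"), per_socket(ths_proc, "hdds"))
# Network per socket: 1 Gbps vs 10 Gbps -> 10x the network bandwidth.
print(per_socket(baseline, "net_gbps"), per_socket(ths_proc, "net_gbps"))
```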
14. THS for cold cluster
[Block diagram: single E3-12xx CPU with DIMMs, PCH, GbE NIC, SAS HBA]
Characteristics:
• E3-1230 V2 CPU
• 32 GB memory
• 12 x 3 TB HDD
• 2 x 1 GbE
Combination of the previous 2 use cases:
• Disk efficiency
• Some compute
• Space & power efficient
• Storage dense with some processing capabilities
15. Rack-level view
                      Baseline        THS Backups     THS Proc        THS Cold
TOR switch            1G TOR          1G TOR          10G TOR         1G TOR
Power                 ~8 kW           ~8 kW           ~8 kW           ~8 kW
CPU sockets; DRAM     40; 1440 GB     40; 640 GB      40; 1280 GB     40; 1280 GB
Spindles; TB raw      240; 480 TB     480; 1,440 TB   480; 480 TB     480; 1,440 TB
Uplink; internal BW   20; 40 Gbps     20; 80 Gbps     40; 400 Gbps    20; 80 Gbps
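The rack-level numbers follow directly from the per-server specs on slides 9-14, assuming 20 baseline servers or 40 THS servers per rack (which is what the constant 40-socket count per rack implies):

```python
# Derive the rack-level table from the per-server specs on slides 9-14.
# Servers-per-rack (20 baseline / 40 THS) is inferred from every
# column showing 40 CPU sockets; it is not stated on this slide.

racks = {
    "baseline": {"servers": 20, "sockets": 2, "dram_gb": 72, "hdds": 12, "tb_per_hdd": 2},
    "backups":  {"servers": 40, "sockets": 1, "dram_gb": 16, "hdds": 12, "tb_per_hdd": 3},
    "proc":     {"servers": 40, "sockets": 1, "dram_gb": 32, "hdds": 12, "tb_per_hdd": 1},
    "cold":     {"servers": 40, "sockets": 1, "dram_gb": 32, "hdds": 12, "tb_per_hdd": 3},
}

def rack_totals(r):
    """Aggregate one server spec up to the rack level."""
    n = r["servers"]
    return {
        "sockets": n * r["sockets"],
        "dram_gb": n * r["dram_gb"],
        "spindles": n * r["hdds"],
        "raw_tb": n * r["hdds"] * r["tb_per_hdd"],
    }

for name, spec in racks.items():
    print(name, rack_totals(spec))
# baseline -> 40 sockets, 1440 GB, 240 spindles, 480 TB raw,
# matching the table column for column.
```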
19. Recap
• At a certain scale it makes sense to split into multiple clusters
• For us: RT, PROC, DW, COLD, BACKUPS, TST, EXP
• For large enough clusters, depending on the use case, it may be worth choosing different HW configurations