Hadoop Hardware @Twitter:
Size does matter.
@joep and @eecraft
Hadoop Summit 2013

v2.3
Transcript of "Hadoop Hardware @Twitter: Size does matter."

  1. Hadoop Hardware @Twitter: Size does matter.
     @joep and @eecraft
     Hadoop Summit 2013
     v2.3
  2. About us
     Joep Rottinghuis
     • Engineering Manager, Hadoop/HBase team @ Twitter
     • Follow me @joep
     Jay Shenoy
     • Software Engineer @ Twitter
     • Hardware Engineer @ Twitter
     • Engineering Manager, HW @ Twitter
     • Follow me @eecraft
     HW & Hadoop teams @ Twitter, many others
     #HadoopSummit2013 @Twitter
  3. Agenda
     • Scale of Hadoop Clusters
     • Single versus multiple clusters
     • Twitter Hadoop Architecture
     • Hardware investigations
     • Results
  4. Scale
     • Scaling limits
       • # Nodes
       • JobTracker: 10s of thousands of jobs per day; 10s of thousands of concurrent slots
       • Namenode: 250-300 M objects in a single namespace
       • Namenode at ~100 GB heap -> full GC pauses
       • Shipping job jars to 1,000s of nodes
       • JobHistory server at a few 100s of thousands of job history/conf files
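The Namenode figures above support a quick back-of-envelope estimate of per-object heap cost. The object count and heap size come from the slide; the derived per-object figure is our own arithmetic, not a Twitter-published number:

```python
# Back-of-envelope: Namenode heap consumed per namespace object,
# using the slide's figures (250-300 M objects in ~100 GB of heap).
objects = 275e6                  # midpoint of the 250-300 M range
heap_bytes = 100 * 1024**3       # ~100 GB Namenode heap

bytes_per_object = heap_bytes / objects
print(f"~{bytes_per_object:.0f} bytes of heap per namespace object")

# Growth implication: each additional 100 M objects costs roughly
# another 36 GB of heap -- and correspondingly longer full-GC pauses.
extra_heap_gb = 100e6 * bytes_per_object / 1024**3
print(f"+100 M objects -> +{extra_heap_gb:.0f} GB of heap")
```

This is why object count, not just node count, shows up as a hard scaling limit: the namespace must fit in one JVM heap.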
  5. When / why to split clusters?
     • In principle, preference for a single cluster
       • Common logs, shared free space, reduced admin burden, more rack diversity
     • Varying SLAs
     • Workload diversity
       • Storage intensive
       • Processing (CPU / disk IO) intensive
       • Network intensive
     • Data access
       • Hot, Warm, Cold
  6. Cluster Architecture
  7. Hardware investigations
  8. Service criteria for hardware
     • Hadoop does not need live HDD swap
     • Twitter DC: no SLA on data nodes
     • Rack SLA: only 1 rack down at any time in a cluster
  9. Baseline Hadoop Server (~early 2012)
     [Block diagram: dual E56xx sockets, DIMMs, PCH, GbE NIC, HBA, expander]
     Characteristics:
     • Standard 2U server
     • 20 servers / rack
     • E5645 CPU, dual 6-core
     • 72 GB memory
     • 12 x 2 TB HDD
     • 2 x 1 GbE
     Works for the general cluster, but...
     • Need more density for storage
     • Potential IO bottlenecks
 10. Hadoop Server: Possible evolution
     [Block diagram: dual E5-26xx or E5-24xx sockets, DIMMs, GbE NIC (10GbE?), HBA, expander]
     Characteristics:
     • + CPU performance?
     • 20 servers / rack
     • 16 x 2 TB? 16 x 3 TB? 24 x 3 TB?
     • Candidate for DW
     Can deploy into the general DW cluster, but...
     • Too much CPU for storage-intensive apps
     • Server failure domain too large if we scale up disks
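The failure-domain objection can be made concrete with a hypothetical re-replication estimate. The drive configurations are from the slides; the fill level and aggregate rebuild bandwidth are illustrative assumptions, not measured Twitter numbers:

```python
# Illustrative sketch: how much data HDFS must re-replicate when one
# server dies, and how long that takes at an assumed cluster-wide rate.
def rereplication_hours(drives, tb_per_drive, fill=0.7, rebuild_gbps=10):
    """Hours of reduced-replication exposure after losing one node.

    fill          -- assumed average disk utilization (hypothetical)
    rebuild_gbps  -- assumed aggregate rebuild bandwidth (hypothetical)
    """
    lost_tb = drives * tb_per_drive * fill
    return lost_tb * 8 * 1000 / (rebuild_gbps * 3600)  # TB -> Gb, / (Gb/s)

for label, drives, size in [("12 x 2 TB (baseline)", 12, 2),
                            ("24 x 3 TB (scaled-up candidate)", 24, 3)]:
    print(f"{label}: ~{rereplication_hours(drives, size):.1f} h exposed")
```

Tripling per-node raw capacity triples the re-replication window under these assumptions, which is one way to read the slide's failure-domain objection.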
 11. Rethinking hardware evolution
     • Debunking myths
       • Bigger is always better
       • One size fits all
     • Back to Hadoop hardware roots:
       • Scale horizontally, not vertically
     Twitter Hadoop Server - "THS"
 12. THS for backups
     [Block diagram: single E3-12xx socket, DIMMs, PCH, GbE NIC, SAS HBA]
     Characteristics:
     • E3-1230 V2 CPU
     • 16 GB memory
     • 12 x 3 TB HDD
     • SSD boot
     • 2 x 1 GbE
     Storage focus:
     • Cost efficient (single socket, 3 TB drives)
     • Less memory needed
     • + IO performance
     • Few fast cores
 13. THS variant for Hadoop-Proc and HBase
     [Block diagram: single E3-12xx socket, DIMMs, PCH, 10GbE NIC, SAS HBA]
     Characteristics:
     • E3-1230 V2 CPU
     • 32 GB memory
     • 12 x 1 TB HDD
     • SSD boot
     • 1 x 10 GbE
     Processing / throughput focus:
     • Cost efficient (single socket, 1 TB drives)
     • More disk and network IO per socket
     • + IO performance
     • Few fast cores
 14. THS for cold cluster
     [Block diagram: single E3-12xx socket, DIMMs, PCH, GbE NIC, SAS HBA]
     Characteristics:
     • E3-1230 V2 CPU
     • 32 GB memory
     • 12 x 3 TB HDD
     • 2 x 1 GbE
     Combination of the previous 2 use cases:
     • Disk efficiency, some compute
     • Space & power efficient
     • Storage dense with some processing capability
 15. Rack-level view
     [Rack diagrams: 1G TORs for Baseline, Backups, and Cold racks; 10G TOR for the Proc rack]

                            Baseline      THS Backups    THS Proc      THS Cold
     Power                  ~8 kW         ~8 kW          ~8 kW         ~8 kW
     CPU sockets; DRAM      40; 1440 GB   40; 640 GB     40; 1280 GB   40; 1280 GB
     Spindles; TB raw       240; 480 TB   480; 1,440 TB  480; 480 TB   480; 1,440 TB
     Uplink; Internal BW    20; 40 Gbps   20; 80 Gbps    40; 400 Gbps  20; 80 Gbps
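The rack totals above follow arithmetically from the per-server configurations on the preceding slides; a short sketch reproduces them. The per-rack server counts are inferred from the socket column (20 dual-socket baseline servers vs. 40 single-socket THS servers per rack), not stated directly on the slide:

```python
# Reproduce the rack-level totals from the per-server configs.
racks = {
    # name: (servers per rack, sockets/server, drives/server, TB per drive)
    "Baseline":    (20, 2, 12, 2),
    "THS Backups": (40, 1, 12, 3),
    "THS Proc":    (40, 1, 12, 1),
    "THS Cold":    (40, 1, 12, 3),
}
for name, (servers, sockets, drives, tb) in racks.items():
    print(f"{name}: {servers * sockets} sockets, "
          f"{servers * drives} spindles, {servers * drives * tb} TB raw")
```

Note the trade the THS racks make in the same ~8 kW envelope: double the spindles per rack, with drive size (1 TB vs. 3 TB) steering each variant toward throughput or capacity.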
 16. Processing performance comparison

     Benchmark                                         Baseline Server   THS (-Cold)
     TestDFSIO (write, replication = 1)                360 MB/s/node     780 MB/s/node
     TeraGen (30 TB, replication = 3)                  1:36 hrs          1:35 hrs
     TeraSort (30 TB, replication = 3)                 6:11 hrs          4:22 hrs
     2 parallel TeraSorts (30 TB each, replication = 3)  10:36 hrs       6:21 hrs
     Application #1                                    4:37 min          3:09 min
     Application set #2                                13:03 hrs         10:57 hrs

     Performance benchmark setup:
     • Each cluster: 102 nodes of the respective type
     • Efficient server = 3 racks; Baseline = 5+ racks
     • "Dated" stack: CentOS 5.5, Sun 1.6 JRE, Hadoop 2.0.3
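Expressed as speedups, the benchmark table works out as follows (wall-clock times converted from h:mm; the "Application set #2" baseline entry, garbled as "13:3 hrs" in the source, is taken here as 13 h 03 m):

```python
# Speedup of THS (-Cold) over the baseline cluster, from the
# benchmark table (102 nodes of each server type per cluster).
def minutes(h, m):
    return 60 * h + m

# (baseline, THS) wall-clock minutes
times = {
    "TeraSort (30 TB)":             (minutes(6, 11),  minutes(4, 22)),
    "2 parallel TeraSorts (30 TB)": (minutes(10, 36), minutes(6, 21)),
    "Application set #2":           (minutes(13, 3),  minutes(10, 57)),
}
for name, (base, ths) in times.items():
    print(f"{name}: {base / ths:.2f}x faster on THS")

# TestDFSIO is a throughput figure, so the ratio flips direction:
print(f"TestDFSIO write: {780 / 360:.2f}x the per-node throughput")
```

The parallel-TeraSort case shows the largest win, consistent with the THS design goal of more disk and network IO per socket.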
 17. Results
 18. LZO performance comparison
 19. Recap
     • At a certain scale it makes sense to split into multiple clusters
       • For us: RT, PROC, DW, COLD, BACKUPS, TST, EXP
     • For large enough clusters, depending on the use case, it may be worth choosing different HW configurations
 20. Conclusion
     @Twitter our "Twitter Hadoop Server" not only saves many $$$, it is also faster!
 21. #ThankYou
     @joep and @eecraft
     Come talk to us at booth 26