Hadoop Networking at DataSift
Published in: Technology
Transcript

  • 1. Hadoop Networking at DataSift
      "How I learned to stop worrying and love Arista Switches"
  • 2. About Me
      Gareth Llewellyn
      Senior Operations Engineer at DataSift
      @NetworkString
      about.me/GarethLlewellyn
      blog.NetworksAreMadeOfString.co.uk
  • 3. Overview
      ● What is DataSift
      ● The DataSift platform in numbers
      ● Our initial network design
      ● Issues with the initial design
      ● Considered designs
      ● Implementation
      ● Questions
  • 4. What is DataSift
      ● Real-time and historical curation and filtering of many sources, e.g. Facebook, Twitter, YouTube
      ● Augmentation of data, e.g. demographics, link resolution
      ● Real-time streams via WebSockets, HTTP POST / PUT, SFTP etc.
      ● Historical queries against data from as long ago as 2010
  • 5. Platform in Numbers: Servers
      ● ~7k cores at 2.13 - 2.8 GHz
      ● ~8 TB RAM
      ● ~2 PB storage
      ● ~380 amps peak draw
      ● Heterogeneous mix of chassis: Intel SR2600URLXR, Dell R710s & HP DL380 Gen7 / Gen8s
  • 6. Platform in Numbers: Traffic
      ● Writes
          ○ ~300 Mb/s inbound streams
      ● Replication
          ○ Peaks of 24 Gb/s
      ● Map Reduce
          ○ Peaks of 70 Gb/s
      ● Exports
  • 7. Initial Network Design
  • 8. Buffers & Discards
  • 9. Moving through the Cisco portfolio
      ● 2960
          ○ 2.7 Mpps
          ○ 32 Gb/s
      ● 3560
          ○ 13.1 Mpps
          ○ 32 Gb/s
      ● 3750
          ○ 38.7 Mpps
          ○ 32 Gb/s
      ● 4948
          ○ 72 Mpps
          ○ 96 Gb/s
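
The Mpps and Gb/s figures on this slide are easier to compare once both are in the same unit. The sketch below is not from the talk. It assumes minimum-size 64-byte Ethernet frames plus the 20 bytes of preamble and inter-frame gap each frame occupies on the wire, and the function name `line_rate_mpps` is hypothetical. Vendor throughput figures often count both directions of every port, so treat the comparison as approximate.

```python
# Bytes each minimum-size frame occupies on the wire:
# 64-byte frame + 8-byte preamble/SFD + 12-byte inter-frame gap.
WIRE_BYTES_PER_MIN_FRAME = 64 + 8 + 12


def line_rate_mpps(gbits_per_second: float) -> float:
    """Packet rate (in Mpps) needed to fill a link with minimum-size frames."""
    bits_per_frame = WIRE_BYTES_PER_MIN_FRAME * 8
    packets_per_second = gbits_per_second * 1e9 / bits_per_frame
    return packets_per_second / 1e6


if __name__ == "__main__":
    # Illustrative link speeds only; not measurements from the DataSift network.
    for gbps in (1, 10, 32, 96):
        print(f"{gbps:>3} Gb/s needs about {line_rate_mpps(gbps):.1f} Mpps at 64-byte frames")
```

A switch whose quoted Mpps figure is well below the number this prints for its quoted bandwidth cannot forward small packets at line rate, which is one way to read the progression through the portfolio.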
  • 10. Redesigning the Network
      ● Uplink oversubscription
          ○ Servers per cab
          ○ Gbit uplinks per server
          ○ Extensibility / redundancy of uplinks
      ● Redundancy of TOR / Core / Distribution
          ○ Power
          ○ Chassis
          ○ Management controllers
      ● Performance
          ○ Buffers
          ○ Head-of-line blocking
      ● Extensibility / Scalability
          ○ Number of hosts / cabs supported
          ○ Backplane
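
The uplink oversubscription question on this slide comes down to one ratio: total host-facing bandwidth in a cab divided by total uplink bandwidth. A minimal sketch follows, with a hypothetical function name and made-up example numbers rather than DataSift's actual cab layout.

```python
def oversubscription_ratio(servers_per_cab: int, gbit_per_server: float,
                           uplinks: int, gbit_per_uplink: float) -> float:
    """Ratio of host-facing bandwidth to uplink bandwidth for one cab."""
    host_capacity = servers_per_cab * gbit_per_server
    uplink_capacity = uplinks * gbit_per_uplink
    if uplink_capacity <= 0:
        raise ValueError("a cab needs at least one working uplink")
    return host_capacity / uplink_capacity


if __name__ == "__main__":
    # Assumed example: 40 servers with 1 Gbit each behind 2x 10 Gbit uplinks.
    print(oversubscription_ratio(40, 1, 2, 10))  # both uplinks healthy
    print(oversubscription_ratio(40, 1, 1, 10))  # one uplink lost
```

Running the second case shows why uplink redundancy and oversubscription are listed together: losing one of two uplinks doubles the ratio.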
  • 11. Redundancy: Mesh
      Pros:
      ● Inter-cab transit is truly cab to cab
      ● OSPF reduces admin overhead
      ● Cisco IOS
      Cons:
      ● OSPF licence adds cost and increases complexity
      ● Uplink oversubscription
      ● More ports for routing than hosts
  • 12. Uplink Oversubscription: Chassis
      Pros:
      ● 720,000,000 pps
      ● 80 Gb/s of inter-blade transit
      ● Cisco IOS
      ● Dual Supervisors / PSU
      Cons:
      ● Still suffers head-of-line blocking
      ● Only 2 PSUs
      ● Overpopulated line cards increase failure impact
      ● Chassis failure (unlikely) = disaster
      ● And....
  • 13. Cables!
  • 14. Where next?
      "If I have seen further it is by standing on the shoulders of Giants" - Isaac Newton
      Benoit Sigoure's presentation at a Hadoop user group in 2011
  • 15. Leaf and Spine
      Arista 7050s & 7048s
      ● 2x 52-port 10Gbit 7050 core switches
      ● 12x 48-port 1Gb / 4-port 10Gb TOR switches
      ● /27 public subnet per rack
      ● ECMP routes to all racks
      ● Dual PSU with disparate PDU / Dist Board / UPS / Generator
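
The "/27 public subnet per rack" scheme can be sketched with Python's standard `ipaddress` module. The supernet `198.18.0.0/23` is a placeholder from the benchmarking range, not DataSift's real address space, and the function name `rack_subnets` is hypothetical.

```python
import ipaddress


def rack_subnets(supernet: str, racks: int, prefix_len: int = 27) -> dict:
    """Assign one subnet of the given prefix length to each rack, numbered from 1."""
    network = ipaddress.ip_network(supernet)
    subnets = list(network.subnets(new_prefix=prefix_len))
    if racks > len(subnets):
        raise ValueError(f"{supernet} only holds {len(subnets)} /{prefix_len} subnets")
    return {rack: subnets[rack - 1] for rack in range(1, racks + 1)}


if __name__ == "__main__":
    # 12 racks to match the 12 TOR switches on the slide; the supernet is assumed.
    for rack, subnet in rack_subnets("198.18.0.0/23", racks=12).items():
        print(f"rack {rack:>2}: {subnet}")
```

Fixed-size subnets per rack keep the routing table predictable: one prefix per rack, so every other switch needs only one route entry per rack per path.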
  • 16. The Core
      Arista 7050
      ● Each 7050 is a separate layer 2 network
      ● SVIs on VLAN for Internet routing
      ● Static routes
      ● 1.2 Tb/s throughput / 960 Mpps forwarding
  • 17. Top of Rack
      Arista 7048
      ● VLAN number = Cab Number
      ● SVI consumes 1 IP from /27
      ● Static ECMP routes to all other cabs
      ● Minimum of 2x 10Gbit uplinks
      ● 176 Gb/s throughput
      ● 132 Mpps forwarding
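
The per-cab conventions on this slide (VLAN number equals cab number, one SVI address taken from the cab's /27, static ECMP routes to every other cab) lend themselves to generated configuration. The sketch below only builds the route list as plain data. The subnets and core next-hop addresses are placeholders, the function names are hypothetical, and using the first usable address for the SVI is an assumption the slides do not confirm.

```python
import ipaddress
from itertools import product


def cab_plan(cab: int, subnet: str) -> dict:
    """Describe one cab: its VLAN number, SVI address and /27 subnet."""
    net = ipaddress.ip_network(subnet)
    # Assumption: the SVI takes the first usable address in the cab's subnet.
    svi = net.network_address + 1
    return {"vlan": cab, "svi": str(svi), "subnet": str(net)}


def static_ecmp_routes(cab_subnets: dict, core_next_hops: list, local_cab: int) -> list:
    """Return (destination, next_hop) pairs: every remote cab via every core switch."""
    remote = [str(ipaddress.ip_network(net))
              for cab, net in sorted(cab_subnets.items()) if cab != local_cab]
    # Equal-cost paths come from installing the same prefix once per next hop.
    return list(product(remote, core_next_hops))


if __name__ == "__main__":
    cabs = {1: "198.18.0.0/27", 2: "198.18.0.32/27", 3: "198.18.0.64/27"}  # placeholders
    cores = ["10.0.0.1", "10.0.1.1"]  # one assumed next hop per core switch
    print(cab_plan(1, cabs[1]))
    for destination, next_hop in static_ecmp_routes(cabs, cores, local_cab=1):
        print(f"cab 1: {destination} via {next_hop}")
```

With N cabs and two cores, each TOR switch carries 2 x (N - 1) static routes, which is small enough to template rather than run a routing protocol for.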
  • 18. Questions
      Yes, we're hiring ;)
