1. Hadoop Networking at DataSift
"How I learned to stop worrying and love Arista Switches"
2. About Me
Gareth Llewellyn
Senior Operations Engineer at DataSift
@NetworkString
about.me/GarethLlewellyn
blog.NetworksAreMadeOfString.co.uk
3. Overview
● What is DataSift
● The DataSift platform in numbers
● Our initial network design
● Issues with the initial design
● Considered designs
● Implementation
● Questions
4. What is DataSift
● Real-time and historical curation and filtering of many sources, e.g. Facebook, Twitter, YouTube etc.
● Augmentation of data, e.g. demographic, link resolution
● Real-time streams via Web Sockets, HTTP POST / PUT, SFTP etc.
● Historical queries against data from as long ago as 2010
5. Platform in Numbers; Servers
● ~7k 2.13 - 2.8 GHz cores
● ~8 TB RAM
● ~2 PB storage
● ~380 amps peak draw
● Heterogeneous mix of chassis: Intel SR2600URLXR, Dell R710s & HP DL380 Gen7 / Gen8s
6. Platform in Numbers; Traffic
● Writes
  ○ ~300Mb/s inbound streams
● Replication
  ○ Peaks of 24Gb/s
● Map Reduce
  ○ Peaks of 70Gb/s
● Exports
10. Redesigning the Network
● Uplink oversubscription
  ○ Servers per cab
  ○ Gbit uplinks per server
  ○ Extensibility / Redundancy of uplinks
● Redundancy of TOR / Core / Distribution
  ○ Power
  ○ Chassis
  ○ Management Controllers
● Performance
  ○ Buffers
  ○ Head of line blocking
● Extensibility / Scalability
  ○ Number of Hosts / Cabs supported
  ○ Backplane
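The uplink oversubscription question on this slide comes down to simple arithmetic: total server-facing bandwidth in a cab divided by total uplink bandwidth out of it. A minimal sketch follows; the function name and the example figures (24 servers, 2x 1Gbit per server, 2x 10Gbit uplinks) are hypothetical and are not taken from the talk.

```python
def oversubscription_ratio(servers_per_cab, gbit_per_server, uplinks, gbit_per_uplink):
    """Return the ratio of server-facing bandwidth to uplink bandwidth for one cab.

    A result of 1.0 means the uplinks can carry everything the servers can send;
    anything above 1.0 means the uplinks are oversubscribed.
    """
    if uplinks <= 0 or gbit_per_uplink <= 0:
        raise ValueError("a cab needs at least one uplink with positive bandwidth")
    downstream_gbit = servers_per_cab * gbit_per_server
    upstream_gbit = uplinks * gbit_per_uplink
    return downstream_gbit / upstream_gbit


# Hypothetical cab: 24 servers with 2x 1Gbit each, 2x 10Gbit uplinks.
ratio = oversubscription_ratio(servers_per_cab=24, gbit_per_server=2,
                               uplinks=2, gbit_per_uplink=10)
print(f"{ratio:.1f}:1 oversubscribed")
```

Adding uplinks or reducing servers per cab both lower the ratio, which is why the slide lists them side by side.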
11. Redundancy: Mesh
Pros:
● Inter cab transit is truly cab to cab
● OSPF reduces admin overhead
● Cisco IOS
Cons:
● OSPF licence adds cost and increases complexity
● Uplink oversubscription
● More ports for routing than hosts
12. Uplink Oversubscription: Chassis
Pros:
● 720,000,000 pps
● 80Gb/s of inter blade transit
● Cisco IOS
● Dual Supervisors / PSU
Cons:
● Still suffers Head of Line blocking
● Only 2 PSUs
● Overpopulated line cards increase failure impact
● Chassis failure (unlikely) = disaster
● And....
14. Where next?
"If I have seen further it is by standing on the shoulders of Giants" - Isaac Newton
Benoit Sigoure's presentation at a Hadoop user group in 2011
15. Leaf and Spine
Arista 7050s & 7048s
● 2x 52-port 10Gbit 7050 core switches
● 12x TOR switches, each with 48x 1Gb / 4x 10Gb ports
● /27 public subnet per rack
● ECMP routes to all racks
● Dual PSU with disparate PDU / Dist Board / UPS / Generator
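The "/27 per rack" scheme can be illustrated with Python's standard `ipaddress` module. This is only a sketch: the supernet `10.20.0.0/23` is a private placeholder standing in for the real public allocation, which the slides do not give, and `allocate_rack_subnets` is a hypothetical helper name.

```python
import ipaddress


def allocate_rack_subnets(supernet, rack_count, prefix_len=27):
    """Carve one subnet of the given prefix length per rack out of a supernet.

    Returns a dict mapping rack number (starting at 1) to its subnet.
    """
    network = ipaddress.ip_network(supernet)
    subnets = list(network.subnets(new_prefix=prefix_len))
    if rack_count > len(subnets):
        raise ValueError(f"{supernet} only holds {len(subnets)} /{prefix_len} subnets")
    return {rack: subnets[rack - 1] for rack in range(1, rack_count + 1)}


# Placeholder supernet (assumption); 12 racks to match the 12 TOR switches.
racks = allocate_rack_subnets("10.20.0.0/23", rack_count=12)
for rack, subnet in racks.items():
    usable = subnet.num_addresses - 2  # minus network and broadcast addresses
    print(f"rack {rack:2d}: {subnet} ({usable} usable addresses)")
```

A /27 holds 32 addresses, 30 of them usable, so a fixed per-rack prefix caps the number of addressable hosts in each cab.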
16. The Core
Arista 7050
● Each 7050 is a separate layer 2 network
● SVIs on VLAN for Internet routing
● Static routes
● 1.2 Tb/s throughput / 960 Mpps forwarding
17. Top of Rack
Arista 7048
● VLAN number = Cab Number
● SVI consumes 1 IP from /27
● Static ECMP routes to all other cabs
● Minimum of 2x 10Gbit uplinks
● 176Gb/s throughput
● 132 Mpps forwarding
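The static ECMP idea on this slide can be sketched as a small route generator: for a given cab, emit one route to every other cab's /27 via each core switch, so that each destination has as many equal-cost paths as there are cores. Everything concrete below is an assumption for illustration: the subnets, the two core next-hop addresses, the choice of the first host address for the SVI, and the `StaticRoute`, `ecmp_routes_for_cab` and `svi_address` names. It does not reproduce DataSift's actual configuration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticRoute:
    destination: ipaddress.IPv4Network
    next_hop: ipaddress.IPv4Address


def svi_address(subnet):
    """Assume the SVI takes the first usable address of the cab's subnet."""
    return next(subnet.hosts())


def ecmp_routes_for_cab(cab, cab_subnets, core_next_hops):
    """Build one static route per (other cab, core next hop) pair."""
    routes = []
    for other_cab, subnet in sorted(cab_subnets.items()):
        if other_cab == cab:
            continue  # the local subnet is directly connected via the SVI
        for hop in core_next_hops:
            routes.append(StaticRoute(subnet, hop))
    return routes


# Hypothetical addressing: three cabs, VLAN number equal to cab number.
cab_subnets = {cab: ipaddress.ip_network(f"10.20.0.{(cab - 1) * 32}/27")
               for cab in (1, 2, 3)}
# One next hop per core switch (addresses are placeholders).
core_next_hops = [ipaddress.ip_address("10.255.1.1"),
                  ipaddress.ip_address("10.255.2.1")]

routes = ecmp_routes_for_cab(1, cab_subnets, core_next_hops)
for route in routes:
    print(f"vlan/cab 1: {route.destination} via {route.next_hop}")

# With the SVI taking one address, 29 of the 30 usable addresses remain for hosts.
hosts_left = cab_subnets[1].num_addresses - 2 - 1
print(f"SVI {svi_address(cab_subnets[1])}, {hosts_left} addresses left for servers")
```

The route count per TOR grows as (cabs - 1) x cores, which stays small enough to manage statically at the scale described in the deck.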