TCP Issues in DataCenter Networks
Talk given for the Advanced Networking Protocols course, covering two selected papers on TCP issues in data-center networks.

Published in: Technology

  • 1. TCP Issues in Virtualized Datacenter Networks. Hemanth Kumar Mantri, Department of Computer Science
  • 2. Selected Papers
    – The TCP Outcast Problem: Exposing Unfairness in Data Center Networks. NSDI '12
    – vSnoop: Improving TCP Throughput in Virtualized Environments via Ack Offload. ACM/IEEE SC 2010
  • 3. Background and Motivation
    – A data center is a shared environment (multi-tenancy)
    – Virtualization is a key enabler of cloud computing (e.g., Amazon EC2)
    – Resource sharing: CPU and memory are strictly partitioned, while network sharing is largely laissez-faire
  • 4. Data Center Networks
    – Flows compete via TCP
    – Ideally, TCP would achieve true fairness: all flows get an equal share of the link capacity
    – In practice, TCP exhibits RTT bias: throughput is inversely proportional to RTT
    – Two major issues: unfairness (in general) and low throughput (in virtualized environments)
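The RTT bias named on this slide can be made concrete with the standard Mathis steady-state approximation T ≈ MSS / (RTT · sqrt(2p/3)) (my illustration, not from the deck): at equal loss rates, throughput scales as 1/RTT, so the shorter-RTT flow normally wins. The outcast result later in the talk is striking precisely because it inverts this.

```python
import math

def mathis_throughput(mss_bytes, rtt_s, loss_rate):
    # Mathis et al. approximation: T ~ MSS / (RTT * sqrt(2p/3)), in bytes/s.
    return mss_bytes / (rtt_s * math.sqrt(2 * loss_rate / 3))

# Two flows with the same loss rate but different RTTs:
intra_rack = mathis_throughput(1460, 100e-6, 0.01)  # 100 us RTT
cross_core = mathis_throughput(1460, 400e-6, 0.01)  # 400 us RTT

# Classic RTT bias: the shorter-RTT flow gets ~4x the bandwidth share.
print(intra_rack / cross_core)  # -> 4.0
```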
  • 5. Datacenter Topology (Hierarchical)
  • 6. Traffic Pattern: Many-to-One
  • 7. Key Finding: Unfairness
    – An inverse RTT bias? Flows with low RTT get low throughput
  • 8. Further Investigation (instantaneous vs. average throughput)
    – The 2-hop flow is consistently starved!
    – The TCP Outcast Problem: some flows are "outcast" and receive very low throughput compared to others, almost an order of magnitude lower in some cases
  • 9. Experiments
    – Same RTTs, same hop length, unsynchronized flows
    – Introduce background traffic
    – Vary the switch buffer size
    – Vary the TCP variant: Reno, MP-TCP, BIC, CUBIC (+ SACK)
    – Unfairness persists!
  • 10. Observation: the flow differential at the input ports is the culprit!
  • 11. Vary the number of flows competing at the bottleneck switch
  • 12. Reason: Port Blackout
    – Packets are roughly the same size
    – Similar inter-arrival rates (predictable timing)
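Port blackout can be illustrated with a toy tail-drop queue (a sketch, not the paper's testbed): when packets from two input ports have similar sizes and lock-step timing, a burst from one port can fill the shared output queue just before the other port's packets arrive, so every tail-drop in that interval lands on the same input port.

```python
from collections import deque

def tail_drop(arrivals, queue_cap, drain_per_slot):
    """Toy shared output queue with tail-drop.
    arrivals: time-ordered list of (slot, input_port) packets.
    Returns the input ports whose packets were dropped."""
    q, drops, slot = deque(), [], 0
    for t, port in arrivals:
        while slot < t:                      # drain queue for elapsed slots
            for _ in range(drain_per_slot):
                if q:
                    q.popleft()
            slot += 1
        if len(q) < queue_cap:
            q.append(port)
        else:
            drops.append(port)               # tail-drop: arriving packet is lost
    return drops

# Port B's burst fills the 4-packet queue just before port A's packets
# arrive; every drop in the interval hits port A (a "blackout" of that port).
print(tail_drop([(0, "B")] * 4 + [(0, "A")] * 3, queue_cap=4, drain_per_slot=1))
# -> ['A', 'A', 'A']
```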
  • 13. Port Blackout
    – Can occur on any input port
    – Happens for small intervals of time
    – Has a more catastrophic effect on the throughput of fewer flows: experiments showed that the same number of packet drops hurts the throughput of a few flows much more than if there were many concurrent flows
  • 14. Conditions for TCP Outcast
  • 15. Solutions?
    – Stochastic Fair Queueing (SFQ): explicitly enforces fairness among flows, but is expensive for commodity switches
    – Equal-Length Routing: all flows are forced through the core, giving better interleaving of packets and alleviating port blackout
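The SFQ option above can be sketched as hashing each flow into one of a small number of FIFO queues that are served round-robin (a minimal illustration; a real implementation hashes the 5-tuple with a periodically re-salted keyed hash, and collisions are why the fairness is only "stochastic"):

```python
from collections import deque

def sfq_index(flow_key, n_queues=8):
    # Deterministic stand-in for the keyed 5-tuple hash a real SFQ uses.
    return sum(map(ord, flow_key)) % n_queues

# Flow f1 sends 6 packets, f2 only 2; each lands in its hashed queue.
queues = [deque() for _ in range(8)]
for pkt_id, flow in enumerate(["f1"] * 6 + ["f2"] * 2):
    queues[sfq_index(flow)].append((flow, pkt_id))

# Round-robin service over the queues: f2 is no longer crowded out by
# f1's larger packet count at the shared output port.
served = []
while any(queues):
    for q in queues:
        if q:
            served.append(q.popleft()[0])
print(served[:4])  # f1 and f2 alternate while both still have packets
```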
  • 16. VM Consolidation
    – Multiple VMs hosted by one physical host; multiple VMs sharing the same core
    – Flexibility, scalability, and economy
    – Observation: VM consolidation negatively impacts network performance!
  • 17. Investigating the Problem: a client connects to one of three VMs (VM 1, VM 2, VM 3) consolidated above the virtualization layer on a single physical server
  • 18. Effect of CPU Sharing
    – Figure: RTT (ms) vs. number of VMs (2 to 5)
    – RTT increases in proportion to the VM scheduling slice (30 ms)
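The trend in that figure admits a back-of-envelope model (my sketch, not the paper's analysis): with N VMs round-robin sharing one core on a 30 ms quantum, a packet arriving while its VM is descheduled can wait up to (N - 1) full slices before the VM runs and replies, so RTT grows by roughly one slice per additional co-located VM.

```python
def worst_case_wait_ms(n_vms, slice_ms=30):
    # A descheduled VM waits behind the other (n_vms - 1) VMs' slices.
    return (n_vms - 1) * slice_ms

for n in (2, 3, 4, 5):
    # Grows by one 30 ms slice per extra VM, matching the figure's trend.
    print(n, worst_case_wait_ms(n))
```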
  • 19. Exact Culprit
    – Diagram: packets from the sender pass through the device driver in the driver domain (dom0) into per-VM buffers for VM 1, VM 2, and VM 3
  • 20. Impact on TCP Throughput
    – A connection to the VM is much slower than one to dom0!
  • 21. Solution: vSnoop
    – Alleviates the negative effect of VM scheduling on TCP throughput
    – Implemented within the driver domain to accelerate TCP connections
    – Does not require any modifications to the VM
    – Does not violate end-to-end TCP semantics
    – Applicable across a wide range of VMMs: Xen, VMware, KVM, etc.
  • 22. TCP Connection to a VM
    – The sender establishes a TCP connection to VM1; the SYN sits in VM1's buffer until VM1's turn in the scheduling rotation (VM1, VM2, VM3), and only then does the SYN,ACK go back
    – The observed RTT therefore includes the VM scheduling latency
  • 23. Key Idea: Acknowledgement Offload
    – With vSnoop, the driver domain buffers the packet in a shared buffer and returns the SYN,ACK on VM1's behalf, so the connection makes faster progress during TCP slow start
  • 24. Challenges
    – Challenge 1: out-of-order and special packets (SYN, FIN). Solution: let the VM handle these packets
    – Challenge 2: packet loss after vSnoop. Solution: let vSnoop acknowledge only if there is room in the buffer
    – Challenge 3: ACKs generated by the VM. Solution: suppress/rewrite ACKs already generated by vSnoop
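The first two challenges reduce to one per-packet decision in the driver domain; a hedged sketch (the function name and return labels are mine, not the paper's):

```python
def vsnoop_decision(seq, expected_seq, buffer_free, is_special):
    """Sketch of vSnoop's per-packet logic for one flow.
    is_special covers SYN/FIN and other control packets."""
    if is_special or seq != expected_seq:
        return "pass_to_vm"       # challenge 1: the VM handles these
    if buffer_free > 0:
        return "buffer_and_ack"   # early ack only once the packet is
                                  # safely buffered (challenge 2)
    return "pass_to_vm"           # no room: no early ack

# Challenge 3 is the mirror image: when the VM later emits its own ACK
# for a packet vSnoop already acked, the driver domain suppresses or
# rewrites it so the sender never sees duplicate acknowledgements.
```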
  • 25. vSnoop Implementation in Xen
    – vSnoop sits in the driver domain (dom0) between the bridge and each VM's netback, with per-VM buffers and netfront tuning
  • 26. TCP Throughput Improvement
    – 3 VMs consolidated; 1,000 transfers of a 100 KB file
    – Median throughput: vanilla Xen 0.192 MB/s, Xen + tuning 0.778 MB/s, Xen + tuning + vSnoop 6.003 MB/s
    – About a 30x improvement over vanilla Xen
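The headline number follows directly from the medians on the slide:

```python
# Median TCP throughput (MB/s) from the slide's three configurations.
medians_mb_s = {"vanilla": 0.192, "tuning": 0.778, "tuning+vsnoop": 6.003}

speedup = medians_mb_s["tuning+vsnoop"] / medians_mb_s["vanilla"]
print(round(speedup, 1))  # ~31x over vanilla Xen, i.e. the "30x" claim
```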
  • 27. Thank You!
    – References: the two selected papers on slide 2
    – Most animations and pictures are taken from the authors' original slides and the NSDI '12 conference talk
  • 29. Conditions for Outcast
    – Switches use the tail-drop queue management discipline
    – A large set of flows and a small set of flows, arriving at two different input ports, compete for a bottleneck output port at a switch
  • 30. Why Does Unfairness Matter?
    – Multi-tenant clouds: some tenants get better performance than others
    – MapReduce apps: straggler problems; one delayed flow affects overall job completion
  • 31. State Machine Maintained Per Flow
    – States: Start, Active (online), No buffer (offline), Unexpected sequence
    – In-order packets with buffer space available keep the flow Active and receive early acknowledgements
    – An out-of-order packet moves the flow to Unexpected sequence; such packets are passed to the VM unacknowledged
    – When no buffer space remains, the flow goes offline (No buffer, don't acknowledge) until space becomes available again
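The per-flow state machine can be sketched as a transition function (the state names mirror the slide's labels; the code itself is my reconstruction of the diagram):

```python
def next_state(state, in_order, buffer_free):
    """Per-flow vSnoop state transition on packet receipt.
    active: early-ack in-order packets; no_buffer: offline, don't ack;
    unexpected: out-of-order seen, pass packets to the VM unacked."""
    if not in_order:
        return "unexpected"        # out-of-order: stop early acks
    if buffer_free == 0:
        return "no_buffer"         # queue full: go offline
    return "active"                # in-order + room: early acknowledge

# Walk a flow through (in_order, buffer_free) events from the Start state.
state = "start"
for pkt in [(True, 4), (False, 4), (True, 4), (True, 0)]:
    state = next_state(state, *pkt)
print(state)  # -> 'no_buffer'
```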
  • 32. vSnoop's Impact on TCP Flows
    – Slow start: early acknowledgements help connections progress faster; the most significant benefit is for short transfers, which are prevalent in data centers
    – Congestion avoidance and fast retransmit: large flows in the steady state also benefit from vSnoop, though not as much as during slow start