Network and TCP performance relationship workshop
Slides from the TWNIC 14th OPM TWNOG Workshop. Date: July 2, 2010.

Published in: Technology

Transcript

  • 1. TWNOG WORKSHOP 2010/7/2, Taipei. Common network operations problems: causes and troubleshooting techniques. Exploring the relationship between network and TCP performance. 智匯亞洲有限公司, 許至凱 CCIE/JNCIE, kaeatforum [at] gmail.com
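  • 2. Objectives
    • Audience: network equipment operators and operations staff
    • Understand which network environment factors affect TCP performance, so that network operations can be tied to application performance and used as a reference for improving the network environment.
      • Understand how TCP works
      • Which network events affect TCP performance?
      • Countermeasures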
  • 3. Agenda
    • TCP Briefing
    • TCP Performance Factors
    • Network Event Impact
    • Improvement – Network approach
    • Improvement – Appliance approach
    • Reference
  • 4. TCP Briefing
    • TCP/IP stack in a computer system
      • Linux
    • Diagram labels: Application; Socket Layer (net/socket.c); Inet Layer (net/ipv4/af_inet.c); IP Layer (various ip files in net/ipv4); TCP Layer (net/ipv4/tcp.c); UDP Layer (net/ipv4/udp.c); Ethernet Device Driver; Ethernet Card; Other Drivers; Parallel/Serial/Other Interface Drivers
  • 5. TCP Briefing
    • TCP/IP stack in a computer system
      • Windows
    • Diagram labels: Windows Sockets Applications; Windows Sockets; AFD; WSK Clients; WSK; NetBT and other TDI clients; TDI; TDX; TCP/IP Stack (Tcpip.sys) containing TCP, UDP, RAW, IPv6, IPv4; 802.3, PPP, 802.11, Loopback, IPv4 Tunnel; NDIS; User/Kernel boundary
  • 6. TCP Briefing
    • TCP/IP position in computer and network environment
  • 7. TCP Briefing
    • TCP header format (RFC793)
  • 8. TCP Briefing
    • TCP header format (updated by RFC3168)
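The header diagrams for these two slides did not survive in the transcript. As a rough illustration of the fixed 20-byte layout from RFC793, with the ECE and CWR flag bits that RFC3168 added, here is a minimal Python sketch. The function name `parse_tcp_header`, the `FLAG_BITS` table and the sample segment are illustrative assumptions, not part of the original slides.

```python
import struct

# Bit masks for the flag bits in the 16-bit "data offset / flags" word.
# FIN through URG come from RFC793; ECE and CWR were added by RFC3168.
FLAG_BITS = {
    "FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08,
    "ACK": 0x10, "URG": 0x20, "ECE": 0x40, "CWR": 0x80,
}


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    src, dst, seq, ack, off_flags, window, checksum, urgent = struct.unpack(
        "!HHIIHHHH", raw[:20]
    )
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "data_offset": off_flags >> 12,  # header length in 32-bit words
        "flags": sorted(n for n, bit in FLAG_BITS.items() if off_flags & bit),
        "window": window,  # receiver's advertised window
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# Hypothetical SYN segment with ECE and CWR also set (values are made up).
sample = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0xC2, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header["flags"], header["window"])
```

The `window` field decoded here is the same receiver window that the flow control slides refer to.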
  • 9. TCP Performance Factors
    • TCP Performance Factors
      • Monitoring Tools
      • Flow control
      • Congestion control
  • 10. TCP Performance Factors
      • Measurement tools
        • Monitoring tools
          • tcpdump
            • On Windows platform - Wireshark
          • tcpstat
        • Benchmarking tools
          • ttcp
          • Netperf
          • NetPIPE
          • DBS (Distributed Benchmark System)
  • 11. TCP Performance Factors
      • Flow control
        • Sliding Window (window size = 6 in the example)
    • Figure: the window at Step 1 through Step 4 over time, for segments 0 to 13. Legend: ACK received, awaiting ACK, sendable range, non-sendable range.
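The sliding window example (window size 6, segments 0 to 13) can be sketched in Python as follows. This is a simplified model that assumes cumulative ACKs and no loss; the class `SlidingWindowSender` and its method names are hypothetical.

```python
class SlidingWindowSender:
    """Sender side of a fixed-size sliding window, counted in segments."""

    def __init__(self, window_size: int, total_segments: int):
        self.window_size = window_size
        self.total = total_segments
        self.base = 0       # oldest segment not yet acknowledged
        self.next_seq = 0   # next segment that has not been sent

    def send_all_allowed(self) -> list:
        """Send every unsent segment that fits in the window."""
        limit = min(self.base + self.window_size, self.total)
        sent = list(range(self.next_seq, limit))
        self.next_seq = max(self.next_seq, limit)
        return sent

    def receive_ack(self, ack_no: int) -> None:
        """Cumulative ACK: all segments below ack_no are acknowledged."""
        if ack_no > self.base:
            self.base = min(ack_no, self.next_seq)


sender = SlidingWindowSender(window_size=6, total_segments=14)
first = sender.send_all_allowed()    # the window allows segments 0..5
blocked = sender.send_all_allowed()  # window is full, nothing can be sent
sender.receive_ack(3)                # segments 0, 1, 2 acknowledged
second = sender.send_all_allowed()   # window slides, segments 6..8 go out
print(first, blocked, second)
```

The sender stalls as soon as six segments are unacknowledged, and each ACK opens exactly as much room as it acknowledges.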
  • 12. TCP Performance Factors
      • Flow control
        • Window Size Adjustment
          • “Receiver window size” field in the TCP header
  • 13. TCP Performance Factors
      • Congestion Control
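        • Flow control lets the receiver limit incoming traffic so that its buffer does not overflow
          • The sender's window size is adjusted through AdvertisedWindow
          • It does not reflect the state of the network path
            • It cannot prevent similar buffer overflows inside the networks along the path
        • To detect possible network congestion, TCP uses congestion control.
          • The sending rate is adjusted through CongestionWindow (cwnd)
        • Congestion control consists of four main algorithms (RFC5681):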
          • Slow start
          • Congestion avoidance
          • Fast retransmit
          • Fast recovery
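The two windows on this slide combine in a simple way: per RFC5681, a sender may have no more data outstanding than the smaller of cwnd and the receiver's advertised window. A minimal sketch, counting in segments rather than bytes (the function name `usable_window` is an assumption made for illustration):

```python
def usable_window(cwnd: int, rwnd: int, in_flight: int) -> int:
    """Segments the sender may still transmit right now.

    cwnd: congestion window, rwnd: receiver's advertised window,
    in_flight: segments sent but not yet acknowledged.
    """
    return max(0, min(cwnd, rwnd) - in_flight)


# Receiver-limited: the advertised window is the smaller of the two.
print(usable_window(cwnd=10, rwnd=4, in_flight=1))
# Network-limited: cwnd is the smaller one and is already used up.
print(usable_window(cwnd=2, rwnd=64, in_flight=2))
```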
  • 14. TCP Performance Factors
        • Slow start
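          • When a TCP connection is first established, it uses a small window size and increases it gradually as ACKs arrive.
            • cwnd starts at 1
            • The aim is to probe the available network bandwidth
          • cwnd + 1 for each ACK received
            • As a result, after each round-trip time (RTT), cwnd is twice its value in the previous RTT
            • Exponential growth
          • To keep cwnd from growing too fast, once cwnd exceeds the “slow start threshold” (ssthresh), it grows by only 1 per RTT
            • Linear growth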
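Why does adding 1 per ACK double cwnd every RTT? A short Python sketch makes the arithmetic visible. It assumes one ACK per segment, no loss and no delayed ACKs; `slow_start_rtt` is a hypothetical helper name.

```python
def slow_start_rtt(cwnd: int) -> int:
    """One RTT of slow start: cwnd segments are sent, each ACK adds 1."""
    acks = cwnd  # assumption: every segment is acknowledged individually
    for _ in range(acks):
        cwnd += 1
    return cwnd


cwnd = 1
growth = [cwnd]
for _ in range(5):
    cwnd = slow_start_rtt(cwnd)
    growth.append(cwnd)
print(growth)  # [1, 2, 4, 8, 16, 32]
```

Because the number of ACKs in an RTT equals the current cwnd, the per-ACK increment adds cwnd to itself once per RTT.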
  • 15. TCP Performance Factors
        • Congestion avoidance
          • In this phase:
            • cwnd > ssthresh
            • cwnd + 1 for each RTT
          • When packet loss occurs:
            • ssthresh -> cwnd/2
            • cwnd -> 1
            • packet retransmission
          • Once packet loss occurs, TCP performance is severely affected.
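The cost of a single loss is easier to see in a per-RTT trace. The sketch below is a simplified model, not a real TCP stack: it tracks cwnd in whole segments, clamps slow start at ssthresh, and uses a floor of 2 segments for ssthresh (the floor follows RFC5681; the function name `simulate_cwnd` and the sample values are assumptions).

```python
def simulate_cwnd(rtts: int, loss_at: set, ssthresh: int = 16) -> list:
    """Return cwnd (in segments) at the start of each RTT."""
    cwnd = 1
    history = []
    for rtt in range(rtts):
        history.append(cwnd)
        if rtt in loss_at:
            ssthresh = max(cwnd // 2, 2)    # ssthresh -> cwnd/2
            cwnd = 1                        # cwnd -> 1, slow start again
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: doubles per RTT
        else:
            cwnd += 1                       # congestion avoidance: +1 per RTT
    return history


trace = simulate_cwnd(12, loss_at={6})
print(trace)  # [1, 2, 4, 8, 16, 17, 18, 1, 2, 4, 8, 9]
```

In this trace one loss at RTT 6 drops the window from 18 segments to 1, and five RTTs later it has only recovered to 9.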
  • 16. TCP Performance Factors
        • Slow start & Congestion avoidance characteristic
  • 17. TCP Performance Factors
        • Fast retransmit (Tahoe)
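          • Slow start + congestion avoidance still apply
          • The sender retransmits the packet as soon as it has received 3 duplicate ACKs
            • This avoids the severe drop in TCP performance caused by having to adjust ssthresh/cwnd after a sender timeout
        • Fast recovery (Reno)
          • First apply fast retransmit
            • On receiving duplicate ACKs, enter congestion avoidance
          • Then perform fast recovery
            • ssthresh -> cwnd/2
            • Retransmit the packet
            • cwnd -> ssthresh + 3
        • NewReno, SACK, Vegas…
          • All of these improve performance at the TCP endpoints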
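The fast retransmit and fast recovery steps above can be sketched as a small state machine. This is a deliberately reduced model: it counts in segments, omits the per-duplicate-ACK window inflation of full Reno, and uses a floor of 2 for ssthresh. The class name `RenoSender` and the sample ACK numbers are illustrative assumptions.

```python
DUP_ACK_THRESHOLD = 3


class RenoSender:
    """Tracks cwnd/ssthresh reactions to duplicate ACKs (simplified Reno)."""

    def __init__(self, cwnd: int, ssthresh: int):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.last_ack = 0
        self.dup_acks = 0
        self.retransmitted = []

    def on_ack(self, ack_no: int) -> None:
        if ack_no == self.last_ack:
            self.dup_acks += 1
            if self.dup_acks == DUP_ACK_THRESHOLD:
                # Fast retransmit: resend the segment the receiver is asking for.
                self.ssthresh = max(self.cwnd // 2, 2)
                self.retransmitted.append(ack_no)
                # Fast recovery: cwnd -> ssthresh + 3 instead of falling to 1.
                self.cwnd = self.ssthresh + 3
        elif ack_no > self.last_ack:
            if self.dup_acks >= DUP_ACK_THRESHOLD:
                # New data acknowledged: deflate and continue in
                # congestion avoidance.
                self.cwnd = self.ssthresh
            self.last_ack = ack_no
            self.dup_acks = 0


sender = RenoSender(cwnd=16, ssthresh=32)
for ack in [5, 5, 5, 5]:  # one new ACK followed by three duplicates
    sender.on_ack(ack)
print(sender.retransmitted, sender.ssthresh, sender.cwnd)
sender.on_ack(9)          # the retransmission filled the hole
print(sender.cwnd)
```

Compared with the timeout case, the window only halves here instead of restarting from 1.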
  • 18. Network Event Impact
    • Packet loss
      • Under TCP congestion control, packet loss triggers TCP retransmission
        • However good TCP congestion control is, packet loss still degrades TCP performance
  • 19. Network Event Impact
    • Packet out-of-order
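      • When packets arrive out of order, TCP can put them back in order, but if TCP fast recovery is triggered, resources may be wasted
        • Reno starts retransmitting once it receives duplicate ACKs and stops only after it receives a Partial ACK.
          • If a packet is merely late rather than lost, the sender will retransmit packets that did not need retransmission, wasting resources.
        • NewReno, designed to improve on Reno's efficiency, stops retransmitting lost packets only after it receives the Final ACK.
          • NewReno may therefore send even more duplicate packets than Reno.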
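To see how reordering alone can trigger a retransmission, the sketch below replays an arrival order at a receiver that sends cumulative ACKs, then counts duplicate ACKs at the sender. No segment is lost in any of the examples. The function names and arrival orders are hypothetical.

```python
def cumulative_acks(arrival_order: list) -> list:
    """ACK (next expected segment) the receiver sends after each arrival."""
    received = set()
    expected = 0
    acks = []
    for seg in arrival_order:
        received.add(seg)
        while expected in received:
            expected += 1
        acks.append(expected)
    return acks


def spurious_retransmits(arrival_order: list, threshold: int = 3) -> list:
    """Segments a sender would fast-retransmit although nothing was lost."""
    retransmitted = []
    last_ack = None
    dup_count = 0
    for ack in cumulative_acks(arrival_order):
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                retransmitted.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmitted


print(spurious_retransmits([0, 1, 2, 3, 4, 5]))  # in order: []
print(spurious_retransmits([0, 2, 1, 3, 4, 5]))  # mild reordering: []
print(spurious_retransmits([0, 2, 3, 4, 1, 5]))  # segment 1 is late: [1]
```

In the last case segment 1 arrives three packets late, which is enough to produce three duplicate ACKs and an unnecessary retransmission.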
  • 20. Improvement – Network approach
    • Reduce packet loss
      • Packet loss has a large impact on TCP performance; every source of packet loss in the network should be eliminated as far as possible.
      • Layer 1, layer 2 error
        • Substandard physical media
          • CRC, P3 errors, etc.
      • Layer 3
        • Router/Switch hardware or software error
      • Congestion
        • Reduce the impact of congestion by deploying QoS
        • Avoid packet drops for highly sensitive TCP applications
  • 21. Improvement – Network approach
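      • Packet forwarding process without QoS
        • Tail-drop
          • When a network device's hardware queue fills up because of link congestion and cannot hold any more outgoing packets, the device simply drops the packets waiting to be sent.
          • The hardware queue cannot judge packet priority; once the queue is full, packets are dropped indiscriminately.
            • This situation is Tail-drop
          • Tail-drop should be avoided as far as possible.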
  • 22. Improvement – Network approach
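      • Packet forwarding process with QoS
        • Packets of different priority are first placed in separate logical queues and then moved into the H/W queue. Before the H/W queue fills up, some packets held in low priority queues are dropped proactively, which prevents Tail-drop.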
          • RED – Random Early Detection
          • WRED – Weighted Random Early Detection
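The early-drop idea behind RED and WRED can be sketched with the classic RED drop curve: no drops below a minimum threshold, a linearly rising drop probability up to a maximum threshold, and certain drop above it. The threshold values and the two priority profiles below are made-up illustrations, not settings from any particular device.

```python
def red_drop_probability(avg_queue: float, min_th: float,
                         max_th: float, max_p: float) -> float:
    """Drop probability for a given average queue depth (classic RED curve)."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# WRED: one (min_th, max_th, max_p) profile per priority (hypothetical values).
# Low priority traffic starts being dropped at a shallower queue depth.
WRED_PROFILES = {
    "low": (20, 40, 0.10),
    "high": (30, 40, 0.10),
}


def wred_drop_probability(avg_queue: float, priority: str) -> float:
    min_th, max_th, max_p = WRED_PROFILES[priority]
    return red_drop_probability(avg_queue, min_th, max_th, max_p)


for depth in (10, 30, 40):
    print(depth,
          wred_drop_probability(depth, "low"),
          wred_drop_probability(depth, "high"))
```

At an average depth of 30 packets, low priority traffic is already being dropped with some probability while high priority traffic is still untouched.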
  • 23. Improvement – Network approach
    • Reduce out-of-order packets
      • Avoid sending packets of the same TCP session over different paths
        • Per-packet load-sharing
          • Load-sharing by destination IP only
        • Per-flow load-sharing
          • Load-sharing by IP packet hash value. Hash index includes:
            • Source IP, Destination IP
            • Protocol
            • Source Port, Destination Port
          • Packets with the same hash value use the same next-hop interface, which avoids out-of-order packets.
      • Implement Selective Acknowledgements in TCP
        • RFC2018
        • RFC2883
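Per-flow load-sharing can be sketched as a hash over the five fields listed above. Real routers use their own, usually much cheaper, hash functions; SHA-256 from Python's standard library is only a stand-in here, and the interface names and addresses are made up.

```python
import hashlib


def pick_next_hop(src_ip: str, dst_ip: str, protocol: int,
                  src_port: int, dst_port: int, next_hops: list) -> str:
    """Map a flow's 5-tuple to one of the available next-hop interfaces."""
    key = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


next_hops = ["link-A", "link-B"]  # hypothetical equal-cost interfaces
flow = ("10.0.0.1", "192.0.2.10", 6, 40000, 80)  # protocol 6 = TCP

# Every packet of the same flow hashes to the same interface,
# so the packets cannot overtake each other on parallel links.
chosen = [pick_next_hop(*flow, next_hops) for _ in range(5)]
print(chosen)
```

Different flows may still land on different links, so the load is shared per flow rather than per packet.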
  • 24. Improvement – Appliance approach
    • The operating system has to handle routine TCP session processing
      • It’s CPU/Memory dependent
    • A large number of TCP sessions consumes system resources such as CPU cycles and memory, leaving less CPU/memory for the actual service processes
    • Reduce system resource consumption in TCP session handling
      • TCP Offload
      • TCP Optimization
  • 25. Improvement – Appliance approach
    • TCP Offload
      • Move TCP handling out of the kernel
        • Use dedicated hardware to handle TCP
        • Save system resources for the actual service processes
      • TOE (TCP Offload Engine) NIC
        • Handle TCP/IP on NIC
  • 26. Improvement – Appliance approach
    • TCP Offload
      • NIC w/o TOE and NIC w/ TOE comparison
  • 27. Improvement – Appliance approach
    • TCP Offload
      • TOE is widely deployed in iSCSI environments
        • iSCSI:
  • 28. Improvement – Appliance approach
    • TCP Optimization
      • Move the bulk of TCP session handling off the server
      • Every TCP session requires a 3-way handshake and a 4-way handshake
        • 3-way handshake for TCP connection establishment
        • 4-way handshake for TCP connection termination
      • Reducing the number of TCP connections reduces connection “overhead”
        • Deploy dedicated hardware in front of the servers
  • 29. Improvement – Appliance approach
    • TCP Optimization
      • Regular TCP connection
    • Sequence diagram between Client and Server. Messages shown: SYN, SYN+ACK, ACK, GET, Data, Data, Data, FIN, ACK, FIN, ACK
  • 30. Improvement – Appliance approach
    • TCP Optimization
      • Reduce the number of TCP connections on the server
        • Only ONE 3-way handshake is needed, at the initial stage
    • Sequence diagram between Client, TCP Proxy and Server. Messages shown: SYN, SYN+ACK, ACK, GET, Data, Data, Data, GET, Data, Data, Data, FIN, ACK, FIN, ACK
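A back-of-the-envelope calculation shows what the proxy saves on the server side. It assumes each connection costs 3 handshake segments and 4 teardown segments, and that the proxy reuses a single server-side connection for all requests. The request count is an arbitrary example.

```python
HANDSHAKE_SEGMENTS = 3  # SYN, SYN+ACK, ACK
TEARDOWN_SEGMENTS = 4   # FIN, ACK, FIN, ACK


def setup_teardown_segments(connections: int) -> int:
    """Control segments spent only on opening and closing TCP connections."""
    return connections * (HANDSHAKE_SEGMENTS + TEARDOWN_SEGMENTS)


requests = 1000  # hypothetical number of client requests
direct = setup_teardown_segments(requests)  # one server connection per request
proxied = setup_teardown_segments(1)        # proxy reuses one server connection
print(direct, proxied)  # 7000 7
```

The client-side handshakes still happen, but they terminate on the proxy instead of consuming server CPU and memory.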
  • 31. Improvement – Appliance approach
    • TCP Optimization
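      • In practice it is rarely deployed solely to improve TCP performance
        • It is usually combined with other functions
        • L4~L7 load balancing
      • Because the client's end-to-end TCP connection terminates on the TCP Proxy, further functions can be added
        • SSL acceleration
        • Reverse cache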
  • 32. Reference
    • Books
      • High-Speed Networks and Internets – Performance and Quality of Service, 2nd Ed.
        • By William Stallings; Prentice Hall
      • High Performance TCP/IP Networking – Concepts, Issues and Solutions
        • By Mahbub Hassan and Raj Jain; Pearson Prentice Hall
      • TCP/IP Illustrated, Volume 1
        • By W. Richard Stevens; Addison Wesley
    • Articles
      • TCP Performance
        • By Geoff Huston; The Internet Protocol Journal - Volume 3, No. 2
      • A very good “sliding window” description
        • http://www.it.uu.se/edu/course/homepage/datakom/civinght04/schema/sliding_window.pps
  • 33. Q & A