7 tcp-congestion

More details on the TCP protocol, including some security issues with TCP and an introduction to congestion control

Published in: Engineering
  • Urgent pointer is rarely used and will not be described.

    The THL (TCP header length) is expressed in blocks of 32 bits. The TCP header may contain options; these will be discussed later.
  • MSL in IP networks : 120 seconds
  • The computation of TCP’s retransmission timer is described in

    RFC2988 Computing TCP's Retransmission Timer. V. Paxson, M. Allman. November 2000.

    Usual values for alpha and beta are 1/8 and 1/4.
  • See

    P. Karn, C. Partridge, Improving round-trip time estimates in reliable transport protocols, Proc. ACM SIGCOMM87, August 1987
  • TCP timestamps were introduced in:

    RFC1323 TCP Extensions for High Performance. V. Jacobson, R. Braden, D. Borman. May 1992.

    The use of these timestamps is negotiated during TCP connection establishment. Most current TCP implementations support these extensions.
  • See e.g.

    RFC2001 TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms. W. Stevens. January 1997.

  • RFC2018 TCP Selective Acknowledgement Options. M. Mathis, J. Mahdavi, S. Floyd, A. Romanow. October 1996.
  • Some heavily loaded web servers use abrupt release to close their connections, which avoids maintaining state for 2*MSL seconds.
  • Most TCP implementations today have fixes for those problems. We will discuss them later.
  • Using a hash function to compute the value of the initial sequence number in this way is usually called a SYN cookie.

    In practice, the computation of the SYN cookie is slightly more complex than a simple hash function because the server must also encode the following information inside the cookie:
    - the MSS value advertised by the client
    - the optional use by the sender of TCP options such as RFC1323 large windows, timestamps or SACK

    The original discussions that led to the development of the SYN cookie solution may be found in:
    http://cr.yp.to/syncookies/archive
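The basic idea from the SYN Cookies slide (y = Hash(IPClient, PortClient, Secret), then a check that the returned ack equals y+1) can be sketched as follows. This is only a minimal illustration: the use of HMAC-SHA256, the truncation to 32 bits, the `SECRET` value and the function names are assumptions made for the example, and the sketch deliberately omits the MSS and option encoding mentioned in the note above.

```python
import hashlib
import hmac

# Hypothetical server secret; a real server would use random bytes.
SECRET = b"server-secret"


def syn_cookie(client_ip: str, client_port: int, secret: bytes = SECRET) -> int:
    """Derive a 32-bit initial sequence number y from the client address."""
    msg = f"{client_ip}:{client_port}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(client_ip: str, client_port: int, ack: int,
                 secret: bytes = SECRET) -> bool:
    """Check the third segment of the handshake without any stored state."""
    expected = (syn_cookie(client_ip, client_port, secret) + 1) % 2**32
    return ack == expected
```

The server keeps no per-connection state after sending the SYN+ACK; it only recomputes the hash when the ACK arrives and creates state if the check succeeds.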
  • 7 tcp-congestion

    1. 1. Week 7 UDP and TCP SCTP and Internet Congestion control
    2. 2. Agenda • TCP • Connection establishment • Reliable data transfer • Connection release • SCTP • Congestion control
    3. 3. TCP segment • Header fields (20 bytes, in rows of 32 bits) : Source port, Destination port • Sequence number • Acknowledgement number • THL (segment header length), Reserved, Flags, Window • Checksum (computed over the entire segment and part of the IP header), Urgent pointer • Optional header extension • Payload • Flags : used to indicate the function of a segment • SYN : used during establishment • FIN : used during connection release • RST : used in case of problems • ACK : if true, means that the Acknowledgement number inside the segment is valid
    4. 4. Three-way handshake ACK(seq=x+1, ack=y+1) CONNECT.req CONNECT.ind SYN+ACK(ack=x+1,seq=y) CONNECT.resp Initial sequence number (x) CONNECT.conf Initial sequence number (y) SYN(seq=x) Connection established Connection established The sequence numbers of all segments A->B will start at x+1 The sequence numbers of all segments B->A will start at y+1
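The sequence and acknowledgement numbers exchanged in the three-way handshake can be traced with a small sketch. The `Segment` class and the `three_way_handshake` function are hypothetical names introduced for illustration; the arithmetic is done modulo 2^32 because TCP sequence numbers are 32-bit values.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MOD = 2**32  # sequence numbers wrap around at 32 bits


@dataclass
class Segment:
    flags: Tuple[str, ...]
    seq: int
    ack: Optional[int] = None


def three_way_handshake(x: int, y: int):
    """Return the three segments exchanged for initial sequence numbers x and y."""
    syn = Segment(("SYN",), seq=x)
    syn_ack = Segment(("SYN", "ACK"), seq=y, ack=(syn.seq + 1) % MOD)
    ack = Segment(("ACK",), seq=(x + 1) % MOD, ack=(syn_ack.seq + 1) % MOD)
    return [syn, syn_ack, ack]
```

For example, `three_way_handshake(100, 300)` yields SYN(seq=100), SYN+ACK(seq=300, ack=101) and ACK(seq=101, ack=301).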
    5. 5. TCP FSM Init ?SYN / !SYN+ACK !SYN ?SYN / !SYN+ACK SYN RCVD SYN Sent Established ?SYN+ACK / !ACK ?ACK
    6. 6. Simultaneous open CONNECT.conf SYN(seq=y) CONNECT.req CONNECT.req SYN(seq=x) Connection established Connection established CONNECT.conf SYN+ACK(seq=y, ack=x+1) SYN+ACK(seq=x, ack=y+1)
    7. 7. Negotiating options ACK(seq=x+1, ack=y+1) CONNECT.req CONNECT.ind SYN+ACK(ack=x+1,seq=y) Option CONNECT.resp Initial sequence number (x) Option proposed CONNECT.conf Initial sequence number (y) Option accepted SYN(seq=x),Option Connection established Option accepted Connection established The sequence numbers of all segments A->B will start at x+1 The sequence numbers of all segments B->A will start at y+1
    8. 8. TCP options • MSS • Selective acknowledgements • Timestamps • Window Scale • Multipath TCP • ...
    9. 9. Agenda • TCP • Connection establishment • Reliable data transfer • Connection release • SCTP • Congestion control
    10. 10. Reliable data transfer (seq=123,"abcd") (seq=127,"ef") (seq=123,"abcd") (seq=127,"ef") (ack=123) Retransmission timer (ack=129) (ack=129) "abcdef" unnecessary retransmission Retransmission of all unacked segments “ef” placed in buffer
    11. 11. Retransmission timer • How to compute it ? • round-trip-time may change frequently during the lifetime of a TCP connection
    12. 12. Retransmission timer • Algorithm • timer = mean(rtt) + 4*std_dev(rtt) • est_mean(rtt) = (1-α)*est_mean(rtt) + α*rtt_measured • est_std_dev = (1-β)*est_std_dev + β*|rtt_measured - est_mean(rtt)|
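The estimator on this slide can be sketched in a few lines, using the usual values of 1/8 and 1/4 for the two smoothing gains given in the notes. The class name is hypothetical. The initialisation on the first sample (mean = rtt, deviation = rtt/2) and the update order (deviation first, using the previous mean) follow RFC2988; the minimum timer value and clock granularity from that RFC are left out of this sketch.

```python
class RtoEstimator:
    """Smoothed RTT estimator producing a retransmission timer value."""

    def __init__(self, alpha: float = 1 / 8, beta: float = 1 / 4):
        self.alpha = alpha
        self.beta = beta
        self.est_mean = None     # smoothed rtt
        self.est_std_dev = None  # smoothed deviation of the rtt

    def update(self, rtt_measured: float) -> float:
        """Feed one rtt sample (in seconds) and return the new timer value."""
        if self.est_mean is None:
            # First measurement: initialise both estimates.
            self.est_mean = rtt_measured
            self.est_std_dev = rtt_measured / 2
        else:
            # The deviation is updated first, with the previous mean.
            self.est_std_dev = ((1 - self.beta) * self.est_std_dev
                                + self.beta * abs(rtt_measured - self.est_mean))
            self.est_mean = ((1 - self.alpha) * self.est_mean
                             + self.alpha * rtt_measured)
        return self.est_mean + 4 * self.est_std_dev
```

Because the timer adds four deviations to the mean, a connection with a stable rtt gets a timer close to the rtt, while a connection with a variable rtt gets a more conservative one.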
    13. 13. RTT measurements (seq=120,"xyz") (ack=123) (seq=123,"abcd") Timer (seq=123,"abcd") (ack=127) • Which measured rtt is the right one : the one starting at the original transmission or the one starting at the retransmission ? • Solution (Karn/Partridge) • Do not measure rtt of retransmitted segments
    14. 14. With Timestamp option (seq=120,TS=1, TS echo=7, "xyz") (ack=123, TS=12, TS echo=1) (seq=123,TS=3, TS echo=12, "abcd") (ack=127, TS=17, TS echo=3) measured rtt timer measured rtt (seq=123,TS=5, TS echo=12, "abcd")
    15. 15. Fast retransmit (seq=123,"abcd") (ack=123) (ack=123) (ack=123) (ack=123) (ack=133) (seq=123,"abcd") "abcdefghij" (seq=127,"ef") Out of sequence, in buffer (seq=129,"gh") Out of sequence, in buffer (seq=131,"ij") Out of sequence, in buffer
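The sender side of fast retransmit amounts to counting duplicate acknowledgements. The sketch below assumes the usual threshold of three duplicates (the same "three duplicate acks" used as a congestion signal later in the deck); the class and method names are hypothetical.

```python
class FastRetransmitSender:
    """Tracks duplicate acks and reports when a segment should be resent."""

    DUP_THRESHOLD = 3  # number of duplicate acks that triggers a retransmission

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack: int):
        """Return the sequence number to retransmit, or None."""
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == self.DUP_THRESHOLD:
                return ack  # the missing segment starts at the acked number
        else:
            self.last_ack = ack
            self.dup_count = 0
        return None
```

Feeding it the acknowledgements from the slide (123, 123, 123, 123, 133) returns 123 on the fourth ack, that is, on the third duplicate, without waiting for the retransmission timer.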
    16. 16. Selective Acks • Receiver reports SACK blocks • Negotiated during establishment (seq=123,"abcd") (ack=123) (seq=127,"ef") (ack=123,sack:127-128) (seq=129,"gh") (ack=123, sack:127-130) (seq=131,"ij") (ack=123, sack:127-132) Lost (seq=123,"abcd") (ack=133) "abcdefghij" only 123-126 must be retransmitted
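A receiver that produces the acknowledgements shown on this slide could look like the sketch below. It is a simplified illustration with hypothetical names: it assumes that segments never partially overlap, ignores sequence number wrap-around, and reports each SACK block as (first byte, last byte) to match the slide's notation, whereas the RFC2018 option carries the right edge as the sequence number following the last byte.

```python
class SackReceiver:
    """Buffers out-of-sequence segments and reports SACK blocks."""

    def __init__(self, next_expected: int):
        self.next_expected = next_expected
        self.buffer = {}  # seq -> payload of segments not yet delivered

    def receive(self, seq: int, payload: str):
        """Process one segment and return (cumulative ack, list of SACK blocks)."""
        if seq >= self.next_expected:
            self.buffer[seq] = payload
        # Deliver everything that is now in sequence.
        while self.next_expected in self.buffer:
            data = self.buffer.pop(self.next_expected)
            self.next_expected += len(data)
        return self.next_expected, self._sack_blocks()

    def _sack_blocks(self):
        blocks = []
        for seq in sorted(self.buffer):
            last = seq + len(self.buffer[seq]) - 1
            if blocks and blocks[-1][1] + 1 == seq:
                blocks[-1] = (blocks[-1][0], last)  # extend a contiguous block
            else:
                blocks.append((seq, last))
        return blocks
```

With a receiver expecting byte 123, receiving "ef", "gh" and "ij" first gives ack=123 with a growing block starting at 127; once "abcd" arrives the cumulative ack jumps to 133 and no block remains.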
    17. 17. Delayed acks • Sending an ack per segment is costly • Tradeoff • In sequence data segment • no ack waiting, delay by up to 50msec • one ack waiting, send immediately • Out-of-sequence data segment • send ack immediately
    18. 18. When to send data ? • When should a segment be sent ? • After each write system call • When there is a full segment of data
    19. 19. Nagle algorithm • A new data segment can be sent if • This is a full segment (MSS bytes) • There are no unacknowledged bytes
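The two conditions of the Nagle algorithm translate directly into a predicate. The function and parameter names below are hypothetical, and the sketch ignores the receiver's window, which a real sender would also have to check.

```python
def nagle_can_send(pending_bytes: int, mss: int, unacked_bytes: int) -> bool:
    """Return True if a new data segment may be sent now."""
    if pending_bytes <= 0:
        return False  # nothing to send
    full_segment = pending_bytes >= mss
    nothing_in_flight = unacked_bytes == 0
    return full_segment or nothing_in_flight
```

A small write is therefore sent immediately on an idle connection, but is held back while earlier data is still unacknowledged, which lets several small writes be grouped into one segment.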
    20. 20. Observed IP packets http://www.caida.org/research/traffic-analysis/pkt_size_distribution/graphs.xml
    21. 21. Flow control (seq=122,"abcd") (ack=126,rwin=0) Last_ack=122, swin=100, rwin=4 To transmit : abcdefghijklm Last_ack=122, swin=96, rwin=0 Last_ack=126, swin=100, rwin=0 (ack=126,rwin=2) (seq=126,"ef") (ack=128,rwin=20) Last_ack=126, swin=100, rwin=2 Last_ack=126, swin=98, rwin=0 Last_ack=128, swin=100, rwin=20 Last_ack=128, swin=93, rwin=13 (seq=128,"ghijklm") (ack=135,rwin=20) Last_ack=135, swin=100, rwin=20
    22. 22. TCP flow control • Performance is a function of the window size • Throughput ~= window/rtt • TCP window : 16 bits field • RFC1323 Window scale extension

        Window       rtt = 1 msec    rtt = 10 msec   rtt = 100 msec
        8 Kbytes     65.6 Mbps       6.5 Mbps        0.66 Mbps
        64 Kbytes    524.3 Mbps      52.4 Mbps       5.2 Mbps
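The figures on this slide follow from the rule Throughput ~= window/rtt. A minimal sketch, with a hypothetical function name and assuming 1 Kbyte = 1024 bytes:

```python
def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when one window is sent per round-trip-time."""
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    for window in (8 * 1024, 64 * 1024):
        for rtt in (0.001, 0.010, 0.100):
            mbps = max_throughput_bps(window, rtt) / 1e6
            print(f"window={window} bytes rtt={rtt * 1000:.0f} msec -> {mbps:.2f} Mbps")
```

Since the window field is only 16 bits wide, the window cannot exceed about 64 Kbytes without the window scale extension, which caps throughput on paths with a long rtt.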
    23. 23. Agenda • TCP • Connection establishment • Reliable data transfer • Connection release • SCTP • Congestion control
    24. 24. Connection release FIN(seq=x) DISCONNECT.req (A-B) DISCONNECT.ind(A-B) ACK(ack=x+1) DISCONNECT.conf(A-B) ACK(ack=y+1) DISCONNECT.req(B-A) DISCONNECT.conf(A-B) outgoing connection closed DISCONNECT.ind(B-A) FIN(seq=y) Time WAIT Maintain state for this connection during twice MSL to be able to retransmit ACK if a segment is received from the other entity incoming connection closed incoming connection closed outgoing connection closed State can be removed Last sent data : x-1 Last sent data : y-1
    25. 25. Abrupt release RST(seq=x) DISCONNECT.req (abrupt) DISCONNECT.ind(abrupt) Connection closed Connection closed State can be removed State can be removed Last sent data : x • Data segments can be lost during such an abrupt release • No entity needs to wait in TIME_WAIT state after such a release • anyway, any segment received when there is no state causes the transmission of a RST segment
    26. 26. TCP connection release SYN RCVD FIN Wait1 ?FIN/!ACK CLOSE Wait Established FIN Wait2 !FIN LAST-ACK Closing TIME Wait ?ACK Closed Timeout[2MSL] ?FIN/!ACK ?ACK !FIN ?ACK ?FIN/!ACK !FIN
    27. 27. Agenda • TCP • Connection establishment • Reliable data transfer • Connection release • SCTP • Congestion control
    28. 28. TCP limitations • Service • Only supports bytestream service • Extensibility • Limited space for options • Security • Various issues like Denial of Service attacks
    29. 29. TCP establishment SYN(Src=C,seq=x) CONNECT.ind SYN+ACK(Dest=C,ack=x+1,seq=y) ACK(Src=A,seq=x) CONNECT.req
    30. 30. DoS attack • Attacker sends 1000s of SYNs SYN(Src=A,seq=x) CONNECT.ind CONNECT.ind SYN+ACK(Dest=A,ack=x+1,seq=y) SYN(Src=B,seq=x) SYN+ACK(Dest=B,ack=x+1,seq=z)
    31. 31. TCP Security • 20th century security • Server trusts Alice but not Bob • Server accepts all TCP connections from Alice's IP address without asking for a password • Server always asks for a password from Bob's IP address
    32. 32. TCP Security • Can Bob create a fake TCP connection by spoofing Alice's IP when she is away ? SYN(seq=x) SYN+ACK(ack=x+1,seq=y) ACK(seq=x+1, ack=y+1) CONNECT.req CONNECT.ind CONNECT.resp CONNECT.conf
    33. 33. TCP Security • Bob's view of the transfer SYN(Src=A,seq=x) SYN+ACK(Dst=A,ack=x+1,seq=y) ACK(seq=x+1, ack=y+1) Data(Src=A,seq=x+1)
    34. 34. SYN Cookies SYN(seq=x) SYN+ACK(ack=x+1,seq=y) ACK(seq=x+1, ack=y+1) CONNECT.req CONNECT.ind CONNECT.conf No state created y=Hash(IPClient,PortClient,Secret) Verify that ack=1+Hash(IPClient,PortClient,Secret) State is created • Stateless passive opener
    35. 35. SCTP • Segment format
    36. 36. SCTP connection establishment
    37. 37. Agenda • TCP • Connection establishment • Reliable data transfer • Connection release • SCTP • Congestion control
    38. 38. TCP Congestion Control • Congestion detection • Packet loss • Explicit Congestion Notification • Congestion control • Additive Increase Multiplicative Decrease
    39. 39. Additive Increase • No congestion ? • All acks move window • Additive increase • Increment cwnd by one MSS every rtt (figure: cwnd versus time)
    40. 40. Faster increase • How to speed up the growth of the congestion window at connection startup ? • Slow-start • Double cwnd every rtt (figure: cwnd versus time, slow-start exponential increase of cwnd up to the max window)
    41. 41. Multiplicative decrease • How to detect congestion ? • Three duplicate acks • mild congestion for TCP • cwnd/2 and restart additive increase • Expiration of retransmission timer • severe congestion • Reset cwnd to 1 MSS • Perform slow-start until half previous cwnd and then continue with congestion avoidance
    42. 42. Mild congestion (figure: cwnd versus time; slow-start with exponential increase of cwnd up to the threshold, then congestion avoidance with linear increase of cwnd; each fast retransmit sets a new threshold)
    43. 43. Severe congestion (figure: cwnd versus time; slow-start with exponential increase of cwnd up to the threshold, then congestion avoidance with linear increase of cwnd; each timer expiration sets a new threshold)
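The behaviour summarised on the last slides (slow-start, additive increase, and the two reactions to congestion) can be put together in a small per-rtt simulation. This is a sketch under simplifying assumptions: cwnd is counted in whole MSS units, exactly one event is processed per rtt, slow-start is capped at the threshold, and the event names and function names are hypothetical.

```python
def step(cwnd: int, ssthresh: int, event: str):
    """Return (cwnd, ssthresh) after one rtt, both expressed in MSS units."""
    if event == "timeout":
        # Severe congestion: restart from 1 MSS, slow-start up to half the old cwnd.
        return 1, max(cwnd // 2, 1)
    if event == "3dupacks":
        # Mild congestion: halve cwnd and continue with additive increase.
        half = max(cwnd // 2, 1)
        return half, half
    if event != "ack":
        raise ValueError(f"unknown event: {event}")
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh), ssthresh  # slow-start: double every rtt
    return cwnd + 1, ssthresh                     # congestion avoidance: +1 MSS per rtt


def simulate(events, cwnd: int = 1, ssthresh: int = 16):
    """Apply a list of per-rtt events and return the successive cwnd values."""
    trace = []
    for event in events:
        cwnd, ssthresh = step(cwnd, ssthresh, event)
        trace.append(cwnd)
    return trace
```

Plotting the returned list against the rtt index should give the sawtooth shape of the two figures: an exponential ramp up to the threshold, a linear climb, then a drop to half (three duplicate acks) or to 1 MSS (timer expiration).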