SlideShare a Scribd company logo
1 of 42
Download to read offline
2002/11/12 Roberto Innocente 1
Transport protocols
and
the Web
Roberto Innocente
inno@sissa.it
for the Abdus Salam ICTP workshop on
Web enabling technologies
http://www.ictp.trieste.it/~its/2002/index.html
2002/11/12 Roberto Innocente 2
Intro
• Transport protocols are usually devised to
optimize movements of large quantities of
data and persistent connections
• The most important transport protocol in use
today, TCP, is no exception
• Unfortunately or fortunately the Web in the
last years has changed traffic patterns. Most
of the traffic is now constituted by many small
(some KB or tens of KBs) transfers performed
in bursts.
2002/11/12 Roberto Innocente 3
TCP flow control
• It is necessary to avoid that that the receiver is
flooded by the sender
• TCP as many other protocols uses a sliding window
mechanism
• in every TCP packet a receiver advertised window
(rwnd) is communicated and that is the maximum
amount of data the sender can send without waiting
for an ACK
• originally this was a 16 bit quantity (up to 64k) now an
initial option can specify a multiplier up to 2**14
2002/11/12 Roberto Innocente 4
Congestive collapses
As reported by Van Jacobson in his famous paper, in
october 1986 the Internet had one of its first
congestive collapses.
Suddenly after a congestion of a 32 kb/s link, while the
traffic remained high because of retransmissions, the
valuable data that traversed the link (goodput) was
reduced by almost a factor of 1/1000 to something
like 40 bps.
The study of what happened at that time lead to the
development of the tcp congestion control algorithms
devised by Van Jacobson at the end of the ’80.
2002/11/12 Roberto Innocente 5
TCP congestion control
• It had an importance
similar to that the
Watt’s Flyball
regulator had for the
steam engine
• To adapt the
protocol to the
changing loads of
real world
2002/11/12 Roberto Innocente 6
TCP Congestion control
• RFC2581 TCP congestion control,
Allman,Paxson, Stevens 1999
• it comprises the following algorithms :
– Slow start
– Congestion avoidance
– Fast retransmit
– Fast recovery
2002/11/12 Roberto Innocente 7
RTT (Round Trip Time)
RTO(Retransmission timeout)
• M = measured RTT
• R = smoothed RTT
estimator
• Original
specification:
– R = 0.9 R + 0.1 M,
RTO = 2 R
• Jacobson 1988
– A = average RTT
– Err = M – A
– A = A + 0.125 Err
– D = D + 0.25 (|Err|-D)
(mean deviation instead
of std deviation to save
time on computing sqrt)
– RTO = A + 4 D, at start
A=0 and D= 3 sec and
RTO= A+2 D (only for the
first SYN pkt)
2002/11/12 Roberto Innocente 8
Congestion control
parameters
• cwnd congestion window is the maximum number of tcp
segments a tcp sender can send in one RTT period, it is the
parameter that limits the sending rate
• ssthresh slow start threshold, it is a threshold for the cwnd
parameter below which the slow start algorithm is applied and
over which the congestion avoidance algorithm is instead used
• we have already seen that rwnd is instead the maximum amount
of unack data the sender is willing to accept
• at any time in an RTT period no more than
– min(rwnd,cwnd)
packets can be injected into the network
2002/11/12 Roberto Innocente 9
TCP Slow Start
• ssthresh is initialized to the
advertised window
• Initially cwnd is initialized to
a constant that RFC2581
suggests to be 1 or at most 2
• Then each RTT cwnd is
doubled, to obtain an
exponential increase of it
until a loss or ssthresh is
reached. Usually this is
obtained incrementing cwnd
by 1 for each ACK received
• When a loss is detected
– cwnd = cwnd / 2,
– ssthresh = cwnd
pkt
ack
t
cwnd1 2 4 8
2002/11/12 Roberto Innocente 10
TCP acks
• Acknowlegements in TCP are cumulative,
they acknowledge all previously received
data. Because of this the mechanism is very
robust.
• RFC2581 requires to send an ACK at least
every second full segment and to implement
the delayed ack algorithm to try to piggyback
the ACK to a data pkt in the other direction.
In any case the ACK must be sent within
500ms since the arrival of the data
2002/11/12 Roberto Innocente 11
TCP Congestion Avoidance
• The congestion avoidance algorithm is
applied when :
– cwnd >= ssthresh
the congestion window in this case is
increased linearly by 1 every RTT.
In practice this is done increasing cwnd by
1/cwnd every ACK.
If a RTO happens then cwnd = 1 and slow
start is performed
2002/11/12 Roberto Innocente 12
TCP Fast retransmit
• In the original specification only the expiry of the RTO (Retransmission
Time Out) timer induced the retransmission of a segment
• To avoid such a long delay a mechanism was devised to recover from
the loss of a single tcp segment as it occurs when probing for
congestion
• a TCP receiver should immediately send a duplicate ACK when it
receives an out of order segment (this can be due also to some
network reordering of segments)
• a TCP sender should detect the arrival of 3 duplicate ACKs (the original
+ 3 dups) and retransmit the segment immediately following
• the sender sets the
– ssthresh = cwnd/2
– cwnd = ssthresh + 3
( here there are some little differences with the RFC to simplify the
discussion, refer to RFC2581 for details)
2002/11/12 Roberto Innocente 13
TCP Fast Recovery
• This algorithm allows a TCP to continue
with the congestion avoidance algorithm
after detecting and repairing a single
loss with fast retransmit.
2002/11/12 Roberto Innocente 14
Congestion control in TCP
from [Balakrishnan 98]
2002/11/12 Roberto Innocente 15
Restart of Idle connections
• The suggested behaviour after an idle period (more
than a RTO without exchanges) is to reset the
congestion window to its initial value (usually
cwnd=1) and apply slowstart
• Solaris doesnt observe the idle time and doesnt apply
”slow start restart”
• Rate Based Pacing (RBP): Improving restart of idle
TCP connections ,Viesweswaraiah, Heidemann
(1997) suggests to reuse the previous cwnd , but to
smoothly pace out pkts, rather than burst them out,
using a Vegas like algorithms to estimate the rate
2002/11/12 Roberto Innocente 16
A frequently observed
TCP stall
• If a client needs to send more than a single
segment to complete a request then :
– it will send the 1st segment because cwnd=1
– the receiver will not ack immediately because of
the delayed ack argument, and will not reply
because the request is not complete. So it will wait
something up to 500 ms to send the ACK
– only then the client will send other segment and
updates is cwnd
2002/11/12 Roberto Innocente 17
Why the stall is not so
frequent ?
• A famous bug in Windows/NT and BSD
derived implementations consisted in
considering the ACK of the 3-way tcp
initial handshake as a signal of data
succesfully transferred
• Solaris doesnt use delayed ack during
slow start
2002/11/12 Roberto Innocente 18
TCP stacks
• Tahoe(4.3BSD,Net/1) : implements Van Jacobson
slow start/congestion avoidance and fast retransmit
algorithms
• Reno(1990,Net/2) : fast recovery, TCP header
prediction
• Net/3(4.4BSD 1994): multicasting, LFN (Long Fat
Networks) mods
• NewReno : fast retransmit even with multiple losses
(Hoe 1997)
• Vegas: experimental stack (Brakmo, Peterson 1995)
2002/11/12 Roberto Innocente 19
TCP Reno
• It is the most widely referenced
implementation, basic mechanisms are
the same also in Net/3 and NewReno
• It is biased against connections with
long delays (large RTT) : in this case
the increase of the cwnd happens
slowly and so they obtain less avg b/w
2002/11/12 Roberto Innocente 20
LFN Optimal Bandwidth
and TCP Reno
• LFNs are Long Fat Networks, networks for which the
bandwidth delay product is high
• Let’s consider a 1 Gb/s WAN connection with a RTT
of 100 ms (BW*rtt ~12 MB)
• Optimal congestion window would be about 8.000
segments (1.5k)
• When there will be a loss cwnd will be decreased to
4.000 segments
• It will take 400 seconds (7minutes!) to recover to the
optimal b/w. This is awful if the network is not
congested !
2002/11/12 Roberto Innocente 21
TCP Vegas
• Approaches congestion but tries to avoid it. Tries to estimate the
queueing at intermediate routers looking at b/w variations.
BaseRTT is the least RTT measured. Then
– Expected = CongestionWindow/BaseRTT
– Diff = Actual – Expected
– Then if Diff < α congestion window is increased else if Diff > β
congestion windows is decreased
• It has been shown that it is able to better utilize the available b/w
(30% better than Reno)
• It is fair with long delay connections
• Anyway, when mixed with TCP Reno it gets an unfair share
(50% less) of the bandwidth
2002/11/12 Roberto Innocente 22
TCP Connection establishment
Client
active open
Server
passive open 3-way handshake :
• SYN : client ->
server
• SYN-ACK: server ->
client
• ACK : client ->
server
1 RTT
SYN
SYN+ACK
ACK
2002/11/12 Roberto Innocente 23
TCP circuit termination
active close passive close
TCP allows half-open
connections.
Each direction is
independent.
FIN
1 RTT
FIN
ACK
ACK
2002/11/12 Roberto Innocente 24
T/TCP (Transaction TCP) /1
• Standard TCP provides a virtual circuit
service with connection establishment,
data transfer, circuit termination phases.
• Some application however are built
around a simpler transaction service :
a client makes a request and the server
answers.
2002/11/12 Roberto Innocente 25
T/TCP (Transaction TCP)/2
• For these applications the overhead of the 3-
way initial TCP handshake should be avoided
• Then T/TCP introduces a TCP accelerated
open (TAO)
• This TCP extension avoids 3-way handshake
• Request and reply is sent together with
connection messages
• Servers cache a CC (connection count) for
clients to detect duplicate requests.
• Latency = RTT + server processing time
2002/11/12 Roberto Innocente 26
T/TCP (Transaction TCP)/3
SYN,FIN,data,ACK
SYN,data,FIN
• There are many
concerns about the
security of this
protocol due to the
easy flooding of
servers that is
possible spoofing
the IP address
ACK
2002/11/12 Roberto Innocente 27
Mobile and Satellite Links
• As we have seen TCP interprets packet losses as a
signal of congestion and therefore halves its sending
rate.
• This is frequently inappropriate for lossy channels as
satellite links or wireless links.
• There are essentially 2 kinds of proposals to
overcome this problem :
– to use a link layer protocol as ARQ that hides the real
behaviour of the link making it as reliable as ordinary links :
snoop
– to split the tcp connection between the 2 different networks
(using on the wireless network a special implementation or
an ad hoc protocol) : Indirect TCP: I-TCP
2002/11/12 Roberto Innocente 28
Split TCP/
I-TCP (Indirect TCP)
• First proposed in the
mid ’90
• Based on the idea that
a connection can be
established over 2
completely different
networks : wired and
wireless
• ACKs are not end-to-
end and this breaks the
end-2-end property of
TCP
GW
std
TCP
Host
Non
std
protocolMobile
Host
2002/11/12 Roberto Innocente 29
HTTP
• ver. 1.0 opens a TCP conection for each URI
retrieved : retrieves first URI and then tries to
get all connected URI using simultaneous
separated TCP connections. Creates bursts
on the net, incurs startup overhead
• ver. 1.1 persistent connections and pipelining
: good because only one transport connection
and so it can be trained, but slow because it
serializes transfers
2002/11/12 Roberto Innocente 30
Session Control Protocol
(SCP)
• devised by Simon Spero UNC
• several common applications like ftp, http use
for every transaction a separate TCP
connection, while in fact many requests are
done to the same server. This is inefficient
and the applications incur all the startup costs
• SCP is a simple protocol that lets a server
and a client to have multiple conversations
over a single TCP connection.
• has a fixed overhead of 8 bytes per segment
2002/11/12 Roberto Innocente 31
Simple Multiplexing Protocol
• Frequently referred as MUX (Gettys,Nielsen 1997)
many ideas derived from SCP (session control
protocol)
• It separates the transport protocol as for example
TCP from the upper level application protocol
inserting a lightweight session multiplexing protocol
(can manage up to 2**8 sessions) by which an
application can have multiple ongoing (e.g. HTTP)
transactions over a single transport level connection
• Differently from SCP can have only 4 bytes of
overhead, and can multiplex different protocols
2002/11/12 Roberto Innocente 32
TCB (TCP control blocks)
interdependence
• proposal to share TCP control block
between the different connections to the
same host, such that for example
congestion windows, RTTs, ssthresh
dont need to be recalculated and
startup times can be avoided
2002/11/12 Roberto Innocente 33
Connection managers
• There have been proposals to share
TCP parameters (RTT, b/w available, ..)
between more hosts (e.g. a company)
setting up a server that caches those
values for the different networks
2002/11/12 Roberto Innocente 34
AIMD / AIPD
Congestion control
• AIMD (Additive Increase
Multiplicative Decrease) :
– W = W + a (no loss)
– W = W * (1 – b)
(L > 0 losses)
it achieves a b/w ∝1/sqrt(L)
• AIPD (Additive Increase
Proportional Decrease):
– W = W + a (no loss)
– W = W * (1 –b *L)
(L > 0 losses)
it achieves a b/w ∝1/L
(ns2 plot of 3 flows,source Lee 2001)
2002/11/12 Roberto Innocente 35
TCP buffer tuning
• The optimal TCP flow control window size is the bandwidth delay
product for the link. In these years as network speeds have increased,
o.s. have adjusted the default buffer size from 8KB to 64KB. This is far
too small for hpn. An OC12 link with 50 ms rtt delay requires at least
3.75 MB of buffers ! There is a lot of reseach on mechanisms to
automatically set an optimal flow control window and buffer size, just
some examples :
– Auto-tuning (Semke,Mahdavi,Mathis 1998)
– Dynamic Right Sizing(DRS) :
• Weigl, Feng 2001 : Dynamic Right-Sizing: A Simulation study .
– Enable (Tierney et al LBNL 2001) : database of BW-delay products
– Linux 2.4 autotuning/connection caching: controlled by the new kernel
variables net.ipv4.tcp_wmem/tcp_rmem, the advertised receive window
starts small and grows with each segment from the transmitter; tcp control
info for a destination are cached for 10 minutes(cwnd,rtt,rttvar,sshthresh)
2002/11/12 Roberto Innocente 36
RED /1
(Random Early Detection)
• Floyd, V.Jacobson 1993
• Detects incipient congestion
and drops packets with a certain
probability depending on the
queue length. The drop
probability increases linearly
from 0 to maxp as the queue
length increases from minth to
maxth. When the queue length
goes above maxth (maximum
threshold) all pkts are dropped.
• Used together with ECN this
algorithm can just signal ECN
capable connections, instead of
dropping packets
2002/11/12 Roberto Innocente 37
RED /2
• It is now widely deployed
• There are very good performance reports
• Still an active area of research for possible
modifications
• Cisco implements its own WRED (Weighted
Random Early Detection) that discards
packets based also on precedence (provides
separate threshold and weights for different
IP precedences)
2002/11/12 Roberto Innocente 38
RED /3
• (from Van Jacobson, 1998) Traffic on a busy E1 (2
Mbit/s) Internet link. RED was turned on at 10.00 of
the 2nd day, and from then utilization rised up to
100% and remained there steady.
2002/11/12 Roberto Innocente 39
ECN (Explicit Congestion
Notification) /1
• RFC 3168 Ramakrishnan, Floyd, Black :
The addition of Explicit Congestion Notification (ECN) to IP (Sep
2001)
• Uses 2 bits of the IPv4 Tos (Now and in IPv6 reserved as a
DiffServ codepoint). These 2 bits encode the states :
– ECN-Capable Transport : ECT(0) and ECT(1)
– Congestion Experienced (CE)
• If the transport is TCP, uses 2 bits of the TCP header, next to
the Urgent flag :
– ECN-Echo(ECE) set in the first ACK after receiving a CE pkt
– Congestion Window Reduced (CWR) set in the first pkt after having
reduced cwnd in response to an ECE pkt
• With TCP the ECN is initialized sending a SYN pkt with ECE
and CWR on, and receiving a SYN+ACK pkt with ECE on
2002/11/12 Roberto Innocente 40
ECN /2
How it works:
• senders set ECT(0) or ECT(1) to indicate that the end-nodes of the
transport protocol are ECN capable
• a router experiencing congestion, sets the 2 bits of ECT pkts to the CE
state (instead of dropping the pkt)
• the receiver of the pkt signals back the condition to the other end
• the transports behave as in the case of a pkt drop (no more than once
in a RTT)
Linux adopted it on 2.4 ... many connection troubles (the default
now is to be off, can be turned on/off using :
/proc/sys/net/ipv4/tcp_ecn). Major sites are using firewalls or
load balancers that refuse connections from ECT (Cisco PIX
and Load Director, etc).
2002/11/12 Roberto Innocente 41
Bibliography /1
• Van Jacobson,M.Karel : Congestion
avoidance and control, 1988
• R.Stevens: TCP/IP Illustrated vol.1 The
protocols, 1994
• G.Wright, R.Stevens: TCP/IP Illustrated vol. 2
The Implementation, 1995
• RFC2581 TCP Congestion Control,Allman,
Paxson, Stevens (1999)
• RFC2018 TCP Selective Acknowledgements
Options, Mathis,Mahdavi, Floyd 1996
2002/11/12 Roberto Innocente 42
Bibliography /2
• Brakmo, Peterson: TCP Vegas End to End
congestion avoidance on a Global Internet, 1995
• M.Allman, On the generation and use of TCP
Acknowledgments, 1999
• L.Peterson,S.Davies : Computer networks,2000
• Balakrishnan,Seshan : Improving TCP/IP
performance over wireless networks, 1995
• Bakre,Badrinath: I-TCP: Indirect TCP for mobile host,
1994

More Related Content

What's hot

Mobile computing-tcp data flow control
Mobile computing-tcp data flow controlMobile computing-tcp data flow control
Mobile computing-tcp data flow controlSushant Kushwaha
 
XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...
XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...
XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...The Linux Foundation
 
3g umts-originating-call Call Flow
3g umts-originating-call Call Flow3g umts-originating-call Call Flow
3g umts-originating-call Call FlowEduard Lucena
 
Tcp congestion control
Tcp congestion controlTcp congestion control
Tcp congestion controlAbdo sayed
 
Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network
Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network
Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network partha pratim deb
 
Congestion control in tcp
Congestion control in tcpCongestion control in tcp
Congestion control in tcpsamarai_apoc
 
Network and TCP performance relationship workshop
Network and TCP performance relationship workshopNetwork and TCP performance relationship workshop
Network and TCP performance relationship workshopKae Hsu
 

What's hot (17)

Mobile computing-tcp data flow control
Mobile computing-tcp data flow controlMobile computing-tcp data flow control
Mobile computing-tcp data flow control
 
Transaction TCP
Transaction TCPTransaction TCP
Transaction TCP
 
Cubic
CubicCubic
Cubic
 
TCP Westwood
TCP WestwoodTCP Westwood
TCP Westwood
 
Tcp
TcpTcp
Tcp
 
Tcp (1)
Tcp (1)Tcp (1)
Tcp (1)
 
XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...
XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...
XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...
 
3g umts-originating-call Call Flow
3g umts-originating-call Call Flow3g umts-originating-call Call Flow
3g umts-originating-call Call Flow
 
Tcp congestion control
Tcp congestion controlTcp congestion control
Tcp congestion control
 
Ch11
Ch11Ch11
Ch11
 
Analysis of TCP variants
Analysis of TCP variantsAnalysis of TCP variants
Analysis of TCP variants
 
Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network
Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network
Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network
 
RTP.ppt
RTP.pptRTP.ppt
RTP.ppt
 
transport protocols
transport protocolstransport protocols
transport protocols
 
Congestion control in tcp
Congestion control in tcpCongestion control in tcp
Congestion control in tcp
 
Tcp congestion avoidance
Tcp congestion avoidanceTcp congestion avoidance
Tcp congestion avoidance
 
Network and TCP performance relationship workshop
Network and TCP performance relationship workshopNetwork and TCP performance relationship workshop
Network and TCP performance relationship workshop
 

Similar to features of tcp important for the web

Designing TCP-Friendly Window-based Congestion Control
Designing TCP-Friendly Window-based Congestion ControlDesigning TCP-Friendly Window-based Congestion Control
Designing TCP-Friendly Window-based Congestion Controlsoohyunc
 
TCP Over Wireless
TCP Over WirelessTCP Over Wireless
TCP Over WirelessFarooq Khan
 
presentationphysicallyer.pdf talked about computer networks
presentationphysicallyer.pdf talked about computer networkspresentationphysicallyer.pdf talked about computer networks
presentationphysicallyer.pdf talked about computer networksHetfieldLee
 
Mobile Transpot Layer
Mobile Transpot LayerMobile Transpot Layer
Mobile Transpot LayerMaulik Patel
 
Lecture 19 22. transport protocol for ad-hoc
Lecture 19 22. transport protocol for ad-hoc Lecture 19 22. transport protocol for ad-hoc
Lecture 19 22. transport protocol for ad-hoc Chandra Meena
 
Troubleshooting TCP/IP
Troubleshooting TCP/IPTroubleshooting TCP/IP
Troubleshooting TCP/IPvijai s
 
Computer network (16)
Computer network (16)Computer network (16)
Computer network (16)NYversity
 
Tcp congestion control (1)
Tcp congestion control (1)Tcp congestion control (1)
Tcp congestion control (1)Abdo sayed
 
Part 9 : Congestion control and IPv6
Part 9 : Congestion control and IPv6Part 9 : Congestion control and IPv6
Part 9 : Congestion control and IPv6Olivier Bonaventure
 
Computer network (5)
Computer network (5)Computer network (5)
Computer network (5)NYversity
 
Programming TCP for responsiveness
Programming TCP for responsivenessProgramming TCP for responsiveness
Programming TCP for responsivenessKazuho Oku
 
Ethernet as fabric
Ethernet as fabricEthernet as fabric
Ethernet as fabricMatthew Macy
 
Analytical Research of TCP Variants in Terms of Maximum Throughput
Analytical Research of TCP Variants in Terms of Maximum ThroughputAnalytical Research of TCP Variants in Terms of Maximum Throughput
Analytical Research of TCP Variants in Terms of Maximum ThroughputIJLT EMAS
 

Similar to features of tcp important for the web (20)

Designing TCP-Friendly Window-based Congestion Control
Designing TCP-Friendly Window-based Congestion ControlDesigning TCP-Friendly Window-based Congestion Control
Designing TCP-Friendly Window-based Congestion Control
 
Part9-congestion.pptx
Part9-congestion.pptxPart9-congestion.pptx
Part9-congestion.pptx
 
TCP Over Wireless
TCP Over WirelessTCP Over Wireless
TCP Over Wireless
 
6610-l14.pptx
6610-l14.pptx6610-l14.pptx
6610-l14.pptx
 
presentationphysicallyer.pdf talked about computer networks
presentationphysicallyer.pdf talked about computer networkspresentationphysicallyer.pdf talked about computer networks
presentationphysicallyer.pdf talked about computer networks
 
NE #1.pptx
NE #1.pptxNE #1.pptx
NE #1.pptx
 
Mobile Transpot Layer
Mobile Transpot LayerMobile Transpot Layer
Mobile Transpot Layer
 
Lecture 19 22. transport protocol for ad-hoc
Lecture 19 22. transport protocol for ad-hoc Lecture 19 22. transport protocol for ad-hoc
Lecture 19 22. transport protocol for ad-hoc
 
Congestion control avoidance
Congestion control avoidanceCongestion control avoidance
Congestion control avoidance
 
Computer Networks Unit 5
Computer Networks Unit 5Computer Networks Unit 5
Computer Networks Unit 5
 
Troubleshooting TCP/IP
Troubleshooting TCP/IPTroubleshooting TCP/IP
Troubleshooting TCP/IP
 
Mobile comn.pptx
Mobile comn.pptxMobile comn.pptx
Mobile comn.pptx
 
TCP_Congestion_Control.ppt
TCP_Congestion_Control.pptTCP_Congestion_Control.ppt
TCP_Congestion_Control.ppt
 
Computer network (16)
Computer network (16)Computer network (16)
Computer network (16)
 
Tcp congestion control (1)
Tcp congestion control (1)Tcp congestion control (1)
Tcp congestion control (1)
 
Part 9 : Congestion control and IPv6
Part 9 : Congestion control and IPv6Part 9 : Congestion control and IPv6
Part 9 : Congestion control and IPv6
 
Computer network (5)
Computer network (5)Computer network (5)
Computer network (5)
 
Programming TCP for responsiveness
Programming TCP for responsivenessProgramming TCP for responsiveness
Programming TCP for responsiveness
 
Ethernet as fabric
Ethernet as fabricEthernet as fabric
Ethernet as fabric
 
Analytical Research of TCP Variants in Terms of Maximum Throughput
Analytical Research of TCP Variants in Terms of Maximum ThroughputAnalytical Research of TCP Variants in Terms of Maximum Throughput
Analytical Research of TCP Variants in Terms of Maximum Throughput
 

More from rinnocente

Random Number Generators 2018
Random Number Generators 2018Random Number Generators 2018
Random Number Generators 2018rinnocente
 
Docker containers : introduction
Docker containers : introductionDocker containers : introduction
Docker containers : introductionrinnocente
 
An FPGA for high end Open Networking
An FPGA for high end Open NetworkingAn FPGA for high end Open Networking
An FPGA for high end Open Networkingrinnocente
 
WiFi placement, can we use Maxwell ?
WiFi placement, can we use Maxwell ?WiFi placement, can we use Maxwell ?
WiFi placement, can we use Maxwell ?rinnocente
 
TLS, SPF, DKIM, DMARC, authenticated email
TLS, SPF, DKIM, DMARC, authenticated emailTLS, SPF, DKIM, DMARC, authenticated email
TLS, SPF, DKIM, DMARC, authenticated emailrinnocente
 
Fpga computing
Fpga computingFpga computing
Fpga computingrinnocente
 
Refreshing computer-skills: markdown, mathjax, jupyter, docker, microkernels
Refreshing computer-skills: markdown, mathjax, jupyter, docker, microkernelsRefreshing computer-skills: markdown, mathjax, jupyter, docker, microkernels
Refreshing computer-skills: markdown, mathjax, jupyter, docker, microkernelsrinnocente
 
Nodes and Networks for HPC computing
Nodes and Networks for HPC computingNodes and Networks for HPC computing
Nodes and Networks for HPC computingrinnocente
 
Public key cryptography
Public key cryptography Public key cryptography
Public key cryptography rinnocente
 
End nodes in the Multigigabit era
End nodes in the Multigigabit eraEnd nodes in the Multigigabit era
End nodes in the Multigigabit erarinnocente
 
Mosix : automatic load balancing and migration
Mosix : automatic load balancing and migration Mosix : automatic load balancing and migration
Mosix : automatic load balancing and migration rinnocente
 
Comp architecture : branch prediction
Comp architecture : branch predictionComp architecture : branch prediction
Comp architecture : branch predictionrinnocente
 
Data mining : rule mining algorithms
Data mining : rule mining algorithmsData mining : rule mining algorithms
Data mining : rule mining algorithmsrinnocente
 
FPGA/Reconfigurable computing (HPRC)
FPGA/Reconfigurable computing (HPRC)FPGA/Reconfigurable computing (HPRC)
FPGA/Reconfigurable computing (HPRC)rinnocente
 
radius dhcp dot1.x (802.1x)
radius dhcp dot1.x (802.1x)radius dhcp dot1.x (802.1x)
radius dhcp dot1.x (802.1x)rinnocente
 

More from rinnocente (16)

Random Number Generators 2018
Random Number Generators 2018Random Number Generators 2018
Random Number Generators 2018
 
Docker containers : introduction
Docker containers : introductionDocker containers : introduction
Docker containers : introduction
 
An FPGA for high end Open Networking
An FPGA for high end Open NetworkingAn FPGA for high end Open Networking
An FPGA for high end Open Networking
 
WiFi placement, can we use Maxwell ?
WiFi placement, can we use Maxwell ?WiFi placement, can we use Maxwell ?
WiFi placement, can we use Maxwell ?
 
TLS, SPF, DKIM, DMARC, authenticated email
TLS, SPF, DKIM, DMARC, authenticated emailTLS, SPF, DKIM, DMARC, authenticated email
TLS, SPF, DKIM, DMARC, authenticated email
 
Fpga computing
Fpga computingFpga computing
Fpga computing
 
Refreshing computer-skills: markdown, mathjax, jupyter, docker, microkernels
Refreshing computer-skills: markdown, mathjax, jupyter, docker, microkernelsRefreshing computer-skills: markdown, mathjax, jupyter, docker, microkernels
Refreshing computer-skills: markdown, mathjax, jupyter, docker, microkernels
 
Nodes and Networks for HPC computing
Nodes and Networks for HPC computingNodes and Networks for HPC computing
Nodes and Networks for HPC computing
 
Public key cryptography
Public key cryptography Public key cryptography
Public key cryptography
 
End nodes in the Multigigabit era
End nodes in the Multigigabit eraEnd nodes in the Multigigabit era
End nodes in the Multigigabit era
 
Mosix : automatic load balancing and migration
Mosix : automatic load balancing and migration Mosix : automatic load balancing and migration
Mosix : automatic load balancing and migration
 
Comp architecture : branch prediction
Comp architecture : branch predictionComp architecture : branch prediction
Comp architecture : branch prediction
 
Data mining : rule mining algorithms
Data mining : rule mining algorithmsData mining : rule mining algorithms
Data mining : rule mining algorithms
 
Ipv6 course
Ipv6  courseIpv6  course
Ipv6 course
 
FPGA/Reconfigurable computing (HPRC)
FPGA/Reconfigurable computing (HPRC)FPGA/Reconfigurable computing (HPRC)
FPGA/Reconfigurable computing (HPRC)
 
radius dhcp dot1.x (802.1x)
radius dhcp dot1.x (802.1x)radius dhcp dot1.x (802.1x)
radius dhcp dot1.x (802.1x)
 

Recently uploaded

Washington Football Commanders Redskins Feathers Shirt
Washington Football Commanders Redskins Feathers ShirtWashington Football Commanders Redskins Feathers Shirt
Washington Football Commanders Redskins Feathers Shirtrahman018755
 
原版定制英国赫瑞瓦特大学毕业证原件一模一样
原版定制英国赫瑞瓦特大学毕业证原件一模一样原版定制英国赫瑞瓦特大学毕业证原件一模一样
原版定制英国赫瑞瓦特大学毕业证原件一模一样AS
 
一比一原版英国格林多大学毕业证如何办理
一比一原版英国格林多大学毕业证如何办理一比一原版英国格林多大学毕业证如何办理
一比一原版英国格林多大学毕业证如何办理AS
 
A LOOK INTO NETWORK TECHNOLOGIES MAINLY WAN.pptx
A LOOK INTO NETWORK TECHNOLOGIES MAINLY WAN.pptxA LOOK INTO NETWORK TECHNOLOGIES MAINLY WAN.pptx
A LOOK INTO NETWORK TECHNOLOGIES MAINLY WAN.pptxthinamazinyo
 
一比一原版(NYU毕业证书)美国纽约大学毕业证学位证书
一比一原版(NYU毕业证书)美国纽约大学毕业证学位证书一比一原版(NYU毕业证书)美国纽约大学毕业证学位证书
一比一原版(NYU毕业证书)美国纽约大学毕业证学位证书c6eb683559b3
 
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样ayvbos
 
20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdf20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdfMatthew Sinclair
 
Research Assignment - NIST SP800 [172 A] - Presentation.pptx
Research Assignment - NIST SP800 [172 A] - Presentation.pptxResearch Assignment - NIST SP800 [172 A] - Presentation.pptx
Research Assignment - NIST SP800 [172 A] - Presentation.pptxi191686
 
一比一原版(毕业证书)新西兰怀特克利夫艺术设计学院毕业证原件一模一样
一比一原版(毕业证书)新西兰怀特克利夫艺术设计学院毕业证原件一模一样一比一原版(毕业证书)新西兰怀特克利夫艺术设计学院毕业证原件一模一样
一比一原版(毕业证书)新西兰怀特克利夫艺术设计学院毕业证原件一模一样AS
 
一比一原版桑佛德大学毕业证成绩单申请学校Offer快速办理
一比一原版桑佛德大学毕业证成绩单申请学校Offer快速办理一比一原版桑佛德大学毕业证成绩单申请学校Offer快速办理
一比一原版桑佛德大学毕业证成绩单申请学校Offer快速办理apekaom
 
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrStory Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrHenryBriggs2
 
一比一原版奥兹学院毕业证如何办理
一比一原版奥兹学院毕业证如何办理一比一原版奥兹学院毕业证如何办理
一比一原版奥兹学院毕业证如何办理F
 
Sholinganallur (Chennai) Independent Escorts - 9632533318 100% genuine
Sholinganallur (Chennai) Independent Escorts - 9632533318 100% genuineSholinganallur (Chennai) Independent Escorts - 9632533318 100% genuine
Sholinganallur (Chennai) Independent Escorts - 9632533318 100% genuineruksarkahn825
 
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformonhackersuli
 
一比一原版帝国理工学院毕业证如何办理
一比一原版帝国理工学院毕业证如何办理一比一原版帝国理工学院毕业证如何办理
一比一原版帝国理工学院毕业证如何办理F
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdfMatthew Sinclair
 
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样AS
 
一比一原版美国北卡罗莱纳大学毕业证如何办理
一比一原版美国北卡罗莱纳大学毕业证如何办理一比一原版美国北卡罗莱纳大学毕业证如何办理
一比一原版美国北卡罗莱纳大学毕业证如何办理A
 
Abortion Clinic in Germiston +27791653574 WhatsApp Abortion Clinic Services i...
Abortion Clinic in Germiston +27791653574 WhatsApp Abortion Clinic Services i...Abortion Clinic in Germiston +27791653574 WhatsApp Abortion Clinic Services i...
Abortion Clinic in Germiston +27791653574 WhatsApp Abortion Clinic Services i...mikehavy0
 
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC
 

Recently uploaded (20)

Washington Football Commanders Redskins Feathers Shirt
Washington Football Commanders Redskins Feathers ShirtWashington Football Commanders Redskins Feathers Shirt
Washington Football Commanders Redskins Feathers Shirt
 
原版定制英国赫瑞瓦特大学毕业证原件一模一样
原版定制英国赫瑞瓦特大学毕业证原件一模一样原版定制英国赫瑞瓦特大学毕业证原件一模一样
原版定制英国赫瑞瓦特大学毕业证原件一模一样
 
一比一原版英国格林多大学毕业证如何办理
一比一原版英国格林多大学毕业证如何办理一比一原版英国格林多大学毕业证如何办理
一比一原版英国格林多大学毕业证如何办理
 
A LOOK INTO NETWORK TECHNOLOGIES MAINLY WAN.pptx
A LOOK INTO NETWORK TECHNOLOGIES MAINLY WAN.pptxA LOOK INTO NETWORK TECHNOLOGIES MAINLY WAN.pptx
A LOOK INTO NETWORK TECHNOLOGIES MAINLY WAN.pptx
 
一比一原版(NYU毕业证书)美国纽约大学毕业证学位证书
一比一原版(NYU毕业证书)美国纽约大学毕业证学位证书一比一原版(NYU毕业证书)美国纽约大学毕业证学位证书
一比一原版(NYU毕业证书)美国纽约大学毕业证学位证书
 
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
 
20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdf20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdf
 
Research Assignment - NIST SP800 [172 A] - Presentation.pptx
Research Assignment - NIST SP800 [172 A] - Presentation.pptxResearch Assignment - NIST SP800 [172 A] - Presentation.pptx
Research Assignment - NIST SP800 [172 A] - Presentation.pptx
 
一比一原版(毕业证书)新西兰怀特克利夫艺术设计学院毕业证原件一模一样
一比一原版(毕业证书)新西兰怀特克利夫艺术设计学院毕业证原件一模一样一比一原版(毕业证书)新西兰怀特克利夫艺术设计学院毕业证原件一模一样
一比一原版(毕业证书)新西兰怀特克利夫艺术设计学院毕业证原件一模一样
 
一比一原版桑佛德大学毕业证成绩单申请学校Offer快速办理
一比一原版桑佛德大学毕业证成绩单申请学校Offer快速办理一比一原版桑佛德大学毕业证成绩单申请学校Offer快速办理
一比一原版桑佛德大学毕业证成绩单申请学校Offer快速办理
 
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrStory Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
 
一比一原版奥兹学院毕业证如何办理
一比一原版奥兹学院毕业证如何办理一比一原版奥兹学院毕业证如何办理
一比一原版奥兹学院毕业证如何办理
 
Sholinganallur (Chennai) Independent Escorts - 9632533318 100% genuine
Sholinganallur (Chennai) Independent Escorts - 9632533318 100% genuineSholinganallur (Chennai) Independent Escorts - 9632533318 100% genuine
Sholinganallur (Chennai) Independent Escorts - 9632533318 100% genuine
 
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon
 
一比一原版帝国理工学院毕业证如何办理
一比一原版帝国理工学院毕业证如何办理一比一原版帝国理工学院毕业证如何办理
一比一原版帝国理工学院毕业证如何办理
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
 
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样
 
一比一原版美国北卡罗莱纳大学毕业证如何办理
一比一原版美国北卡罗莱纳大学毕业证如何办理一比一原版美国北卡罗莱纳大学毕业证如何办理
一比一原版美国北卡罗莱纳大学毕业证如何办理
 
Abortion Clinic in Germiston +27791653574 WhatsApp Abortion Clinic Services i...
Abortion Clinic in Germiston +27791653574 WhatsApp Abortion Clinic Services i...Abortion Clinic in Germiston +27791653574 WhatsApp Abortion Clinic Services i...
Abortion Clinic in Germiston +27791653574 WhatsApp Abortion Clinic Services i...
 
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
 

features of tcp important for the web

  • 1. 2002/11/12 Roberto Innocente 1 Transport protocols and the Web Roberto Innocente inno@sissa.it for the Abdus Salam ICTP workshop on Web enabling technologies http://www.ictp.trieste.it/~its/2002/index.html
  • 2. 2002/11/12 Roberto Innocente 2 Intro • Transport protocols are usually devised to optimize movements of large quantities of data and persistent connections • The most important transport protocol in use today, TCP, is no exception • Unfortunately or fortunately the Web in the last years has changed traffic patterns. Most of the traffic is now constituted by many small (some KB or tens of KBs) transfers performed in bursts.
  • 3. 2002/11/12 Roberto Innocente 3 TCP flow control • It is necessary to avoid that that the receiver is flooded by the sender • TCP as many other protocols uses a sliding window mechanism • in every TCP packet a receiver advertised window (rwnd) is communicated and that is the maximum amount of data the sender can send without waiting for an ACK • originally this was a 16 bit quantity (up to 64k) now an initial option can specify a multiplier up to 2**14
  • 4. 2002/11/12 Roberto Innocente 4 Congestive collapses As reported by Van Jacobson in his famous paper, in october 1986 the Internet had one of its first congestive collapses. Suddenly after a congestion of a 32 kb/s link, while the traffic remained high because of retransmissions, the valuable data that traversed the link (goodput) was reduced by almost a factor of 1/1000 to something like 40 bps. The study of what happened at that time lead to the development of the tcp congestion control algorithms devised by Van Jacobson at the end of the ’80.
  • 5. 2002/11/12 Roberto Innocente 5 TCP congestion control • It had an importance similar to that the Watt’s Flyball regulator had for the steam engine • To adapt the protocol to the changing loads of real world
  • 6. 2002/11/12 Roberto Innocente 6 TCP Congestion control • RFC2581 TCP congestion control, Allman,Paxson, Stevens 1999 • it comprises the following algorithms : – Slow start – Congestion avoidance – Fast retransmit – Fast recovery
  • 7. 2002/11/12 Roberto Innocente 7 RTT (Round Trip Time) RTO(Retransmission timeout) • M = measured RTT • R = smoothed RTT estimator • Original specification: – R = 0.9 R + 0.1 M, RTO = 2 R • Jacobson 1988 – A = average RTT – Err = M – A – A = A + 0.125 Err – D = D + 0.25 (|Err|-D) (mean deviation instead of std deviation to save time on computing sqrt) – RTO = A + 4 D, at start A=0 and D= 3 sec and RTO= A+2 D (only for the first SYN pkt)
  • 8. 2002/11/12 Roberto Innocente 8 Congestion control parameters • cwnd congestion window is the maximum number of tcp segments a tcp sender can send in one RTT period, it is the parameter that limits the sending rate • ssthresh slow start threshold, it is a threshold for the cwnd parameter below which the slow start algorithm is applied and over which the congestion avoidance algorithm is instead used • we have already seen that rwnd is instead the maximum amount of unack data the sender is willing to accept • at any time in an RTT period no more than – min(rwnd,cwnd) packets can be injected into the network
  • 9. 2002/11/12 Roberto Innocente 9 TCP Slow Start • ssthresh is initialized to the advertised window • Initially cwnd is initialized to a constant that RFC2581 suggests to be 1 or at most 2 • Then each RTT cwnd is doubled, to obtain an exponential increase of it until a loss or ssthresh is reached. Usually this is obtained incrementing cwnd by 1 for each ACK received • When a loss is detected – cwnd = cwnd / 2, – ssthresh = cwnd pkt ack t cwnd1 2 4 8
  • 10. 2002/11/12 Roberto Innocente 10 TCP acks • Acknowlegements in TCP are cumulative, they acknowledge all previously received data. Because of this the mechanism is very robust. • RFC2581 requires to send an ACK at least every second full segment and to implement the delayed ack algorithm to try to piggyback the ACK to a data pkt in the other direction. In any case the ACK must be sent within 500ms since the arrival of the data
  • 11. 2002/11/12 Roberto Innocente 11 TCP Congestion Avoidance • The congestion avoidance algorithm is applied when : – cwnd >= ssthresh the congestion window in this case is increased linearly by 1 every RTT. In practice this is done increasing cwnd by 1/cwnd every ACK. If a RTO happens then cwnd = 1 and slow start is performed
  • 12. 2002/11/12 Roberto Innocente 12 TCP Fast retransmit • In the original specification only the expiry of the RTO (Retransmission Time Out) timer induced the retransmission of a segment • To avoid such a long delay a mechanism was devised to recover from the loss of a single tcp segment as it occurs when probing for congestion • a TCP receiver should immediately send a duplicate ACK when it receives an out of order segment (this can be due also to some network reordering of segments) • a TCP sender should detect the arrival of 3 duplicate ACKs (the original + 3 dups) and retransmit the segment immediately following • the sender sets the – ssthresh = cwnd/2 – cwnd = ssthresh + 3 ( here there are some little differences with the RFC to simplify the discussion, refer to RFC2581 for details)
  • 13. 2002/11/12 Roberto Innocente 13 TCP Fast Recovery • This algorithm allows a TCP to continue with the congestion avoidance algorithm after detecting and repairing a single loss with fast retransmit.
  • 14. 2002/11/12 Roberto Innocente 14 Congestion control in TCP from [Balakrishnan 98]
  • 15. 2002/11/12 Roberto Innocente 15 Restart of Idle connections • The suggested behaviour after an idle period (more than a RTO without exchanges) is to reset the congestion window to its initial value (usually cwnd=1) and apply slowstart • Solaris doesnt observe the idle time and doesnt apply ”slow start restart” • Rate Based Pacing (RBP): Improving restart of idle TCP connections ,Viesweswaraiah, Heidemann (1997) suggests to reuse the previous cwnd , but to smoothly pace out pkts, rather than burst them out, using a Vegas like algorithms to estimate the rate
  • 16. 2002/11/12 Roberto Innocente 16 A frequently observed TCP stall • If a client needs to send more than a single segment to complete a request then : – it will send the 1st segment because cwnd=1 – the receiver will not ack immediately because of the delayed ack argument, and will not reply because the request is not complete. So it will wait something up to 500 ms to send the ACK – only then the client will send other segment and updates is cwnd
  • 17. 2002/11/12 Roberto Innocente 17 Why the stall is not so frequent ? • A famous bug in Windows/NT and BSD derived implementations consisted in considering the ACK of the 3-way tcp initial handshake as a signal of data succesfully transferred • Solaris doesnt use delayed ack during slow start
  • 18. 2002/11/12 Roberto Innocente 18 TCP stacks • Tahoe(4.3BSD,Net/1) : implements Van Jacobson slow start/congestion avoidance and fast retransmit algorithms • Reno(1990,Net/2) : fast recovery, TCP header prediction • Net/3(4.4BSD 1994): multicasting, LFN (Long Fat Networks) mods • NewReno : fast retransmit even with multiple losses (Hoe 1997) • Vegas: experimental stack (Brakmo, Peterson 1995)
  • 19. 2002/11/12 Roberto Innocente 19 TCP Reno • It is the most widely referenced implementation, basic mechanisms are the same also in Net/3 and NewReno • It is biased against connections with long delays (large RTT) : in this case the increase of the cwnd happens slowly and so they obtain less avg b/w
  • 20. 2002/11/12 Roberto Innocente 20 LFN Optimal Bandwidth and TCP Reno • LFNs are Long Fat Networks, networks for which the bandwidth delay product is high • Let’s consider a 1 Gb/s WAN connection with a RTT of 100 ms (BW*rtt ~12 MB) • Optimal congestion window would be about 8.000 segments (1.5k) • When there will be a loss cwnd will be decreased to 4.000 segments • It will take 400 seconds (7minutes!) to recover to the optimal b/w. This is awful if the network is not congested !
  • 21. 2002/11/12 Roberto Innocente 21 TCP Vegas • Approaches congestion but tries to avoid it. Tries to estimate the queueing at intermediate routers looking at b/w variations. BaseRTT is the least RTT measured. Then – Expected = CongestionWindow/BaseRTT – Diff = Actual – Expected – Then if Diff < α congestion window is increased else if Diff > β congestion windows is decreased • It has been shown that it is able to better utilize the available b/w (30% better than Reno) • It is fair with long delay connections • Anyway, when mixed with TCP Reno it gets an unfair share (50% less) of the bandwidth
  • 22. 2002/11/12 Roberto Innocente 22 TCP Connection establishment Client active open Server passive open 3-way handshake : • SYN : client -> server • SYN-ACK: server -> client • ACK : client -> server 1 RTT SYN SYN+ACK ACK
  • 23. 2002/11/12 Roberto Innocente 23 TCP circuit termination active close passive close TCP allows half-open connections. Each direction is independent. FIN 1 RTT FIN ACK ACK
  • 24. 2002/11/12 Roberto Innocente 24 T/TCP (Transaction TCP) /1 • Standard TCP provides a virtual circuit service with connection establishment, data transfer, circuit termination phases. • Some application however are built around a simpler transaction service : a client makes a request and the server answers.
  • 25. 2002/11/12 Roberto Innocente 25 T/TCP (Transaction TCP)/2 • For these applications the overhead of the 3- way initial TCP handshake should be avoided • Then T/TCP introduces a TCP accelerated open (TAO) • This TCP extension avoids 3-way handshake • Request and reply is sent together with connection messages • Servers cache a CC (connection count) for clients to detect duplicate requests. • Latency = RTT + server processing time
  • 26. 2002/11/12 Roberto Innocente 26 T/TCP (Transaction TCP)/3 SYN,FIN,data,ACK SYN,data,FIN • There are many concerns about the security of this protocol due to the easy flooding of servers that is possible spoofing the IP address ACK
  • 27. 2002/11/12 Roberto Innocente 27 Mobile and Satellite Links • As we have seen TCP interprets packet losses as a signal of congestion and therefore halves its sending rate. • This is frequently inappropriate for lossy channels as satellite links or wireless links. • There are essentially 2 kinds of proposals to overcome this problem : – to use a link layer protocol as ARQ that hides the real behaviour of the link making it as reliable as ordinary links : snoop – to split the tcp connection between the 2 different networks (using on the wireless network a special implementation or an ad hoc protocol) : Indirect TCP: I-TCP
  • 28. 2002/11/12 Roberto Innocente 28 Split TCP/ I-TCP (Indirect TCP) • First proposed in the mid ’90 • Based on the idea that a connection can be established over 2 completely different networks : wired and wireless • ACKs are not end-to- end and this breaks the end-2-end property of TCP GW std TCP Host Non std protocolMobile Host
  • 29. 2002/11/12 Roberto Innocente 29 HTTP • ver. 1.0 opens a TCP conection for each URI retrieved : retrieves first URI and then tries to get all connected URI using simultaneous separated TCP connections. Creates bursts on the net, incurs startup overhead • ver. 1.1 persistent connections and pipelining : good because only one transport connection and so it can be trained, but slow because it serializes transfers
  • 30. 2002/11/12 Roberto Innocente 30 Session Control Protocol (SCP) • devised by Simon Spero UNC • several common applications like ftp, http use for every transaction a separate TCP connection, while in fact many requests are done to the same server. This is inefficient and the applications incur all the startup costs • SCP is a simple protocol that lets a server and a client to have multiple conversations over a single TCP connection. • has a fixed overhead of 8 bytes per segment
  • 31. 2002/11/12 Roberto Innocente 31 Simple Multiplexing Protocol • Frequently referred as MUX (Gettys,Nielsen 1997) many ideas derived from SCP (session control protocol) • It separates the transport protocol as for example TCP from the upper level application protocol inserting a lightweight session multiplexing protocol (can manage up to 2**8 sessions) by which an application can have multiple ongoing (e.g. HTTP) transactions over a single transport level connection • Differently from SCP can have only 4 bytes of overhead, and can multiplex different protocols
  • 32. 2002/11/12 Roberto Innocente 32 TCB (TCP control blocks) interdependence • proposal to share TCP control block between the different connections to the same host, such that for example congestion windows, RTTs, ssthresh dont need to be recalculated and startup times can be avoided
  • 33. 2002/11/12 Roberto Innocente 33 Connection managers • There have been proposals to share TCP parameters (RTT, b/w available, ..) between more hosts (e.g. a company) setting up a server that caches those values for the different networks
  • 34. 2002/11/12 Roberto Innocente 34 AIMD / AIPD Congestion control • AIMD (Additive Increase Multiplicative Decrease) : – W = W + a (no loss) – W = W * (1 – b) (L > 0 losses) it achieves a b/w ∝1/sqrt(L) • AIPD (Additive Increase Proportional Decrease): – W = W + a (no loss) – W = W * (1 –b *L) (L > 0 losses) it achieves a b/w ∝1/L (ns2 plot of 3 flows,source Lee 2001)
  • 35. 2002/11/12 Roberto Innocente 35 TCP buffer tuning • The optimal TCP flow control window size is the bandwidth delay product for the link. In these years as network speeds have increased, o.s. have adjusted the default buffer size from 8KB to 64KB. This is far too small for hpn. An OC12 link with 50 ms rtt delay requires at least 3.75 MB of buffers ! There is a lot of reseach on mechanisms to automatically set an optimal flow control window and buffer size, just some examples : – Auto-tuning (Semke,Mahdavi,Mathis 1998) – Dynamic Right Sizing(DRS) : • Weigl, Feng 2001 : Dynamic Right-Sizing: A Simulation study . – Enable (Tierney et al LBNL 2001) : database of BW-delay products – Linux 2.4 autotuning/connection caching: controlled by the new kernel variables net.ipv4.tcp_wmem/tcp_rmem, the advertised receive window starts small and grows with each segment from the transmitter; tcp control info for a destination are cached for 10 minutes(cwnd,rtt,rttvar,sshthresh)
  • 36. 2002/11/12 Roberto Innocente 36 RED /1 (Random Early Detection) • Floyd, V.Jacobson 1993 • Detects incipient congestion and drops packets with a certain probability depending on the queue length. The drop probability increases linearly from 0 to maxp as the queue length increases from minth to maxth. When the queue length goes above maxth (maximum threshold) all pkts are dropped. • Used together with ECN this algorithm can just signal ECN capable connections, instead of dropping packets
  • 37. 2002/11/12 Roberto Innocente 37 RED /2 • It is now widely deployed • There are very good performance reports • Still an active area of research for possible modifications • Cisco implements its own WRED (Weighted Random Early Detection) that discards packets based also on precedence (provides separate threshold and weights for different IP precedences)
  • 38. 2002/11/12 Roberto Innocente 38 RED /3 • (from Van Jacobson, 1998) Traffic on a busy E1 (2 Mbit/s) Internet link. RED was turned on at 10.00 of the 2nd day, and from then utilization rised up to 100% and remained there steady.
  • 39. 2002/11/12 Roberto Innocente 39 ECN (Explicit Congestion Notification) /1 • RFC 3168 Ramakrishnan, Floyd, Black : The addition of Explicit Congestion Notification (ECN) to IP (Sep 2001) • Uses 2 bits of the IPv4 Tos (Now and in IPv6 reserved as a DiffServ codepoint). These 2 bits encode the states : – ECN-Capable Transport : ECT(0) and ECT(1) – Congestion Experienced (CE) • If the transport is TCP, uses 2 bits of the TCP header, next to the Urgent flag : – ECN-Echo(ECE) set in the first ACK after receiving a CE pkt – Congestion Window Reduced (CWR) set in the first pkt after having reduced cwnd in response to an ECE pkt • With TCP the ECN is initialized sending a SYN pkt with ECE and CWR on, and receiving a SYN+ACK pkt with ECE on
  • 40. 2002/11/12 Roberto Innocente 40 ECN /2 How it works: • senders set ECT(0) or ECT(1) to indicate that the end-nodes of the transport protocol are ECN capable • a router experiencing congestion, sets the 2 bits of ECT pkts to the CE state (instead of dropping the pkt) • the receiver of the pkt signals back the condition to the other end • the transports behave as in the case of a pkt drop (no more than once in a RTT) Linux adopted it on 2.4 ... many connection troubles (the default now is to be off, can be turned on/off using : /proc/sys/net/ipv4/tcp_ecn). Major sites are using firewalls or load balancers that refuse connections from ECT (Cisco PIX and Load Director, etc).
  • 41. 2002/11/12 Roberto Innocente 41 Bibliography /1 • Van Jacobson,M.Karel : Congestion avoidance and control, 1988 • R.Stevens: TCP/IP Illustrated vol.1 The protocols, 1994 • G.Wright, R.Stevens: TCP/IP Illustrated vol. 2 The Implementation, 1995 • RFC2581 TCP Congestion Control,Allman, Paxson, Stevens (1999) • RFC2018 TCP Selective Acknowledgements Options, Mathis,Mahdavi, Floyd 1996
  • 42. 2002/11/12 Roberto Innocente 42 Bibliography /2 • Brakmo, Peterson: TCP Vegas End to End congestion avoidance on a Global Internet, 1995 • M.Allman, On the generation and use of TCP Acknowledgments, 1999 • L.Peterson,S.Davies : Computer networks,2000 • Balakrishnan,Seshan : Improving TCP/IP performance over wireless networks, 1995 • Bakre,Badrinath: I-TCP: Indirect TCP for mobile host, 1994