2. Malathi Veeraraghavan
Originals by Jörg Liebeherr 2
Background on ARQ Error Control 1
• Two types of errors:
– Lost packets
– Damaged packets
• Error control schemes that involve error detection and
retransmission of lost or corrupted frames are referred to
as Automatic Repeat reQuest (ARQ) error control
• Most Error Control techniques are based on:
1. Error Detection Scheme (Parity checks, CRC).
2. Retransmission Scheme.
Background on ARQ Error Control 2
• All retransmission schemes use all or a subset of the
following procedures:
– Positive acknowledgments (ACK)
– Negative acknowledgments (NACK)
– Selective acknowledgments (SACK)
• All retransmission schemes (whether they use ACK, NACK, SACK, or a combination) rely on timers
• The most common ARQ retransmission schemes are:
Stop-and-Wait ARQ
Go-Back-N ARQ
Selective Repeat ARQ
Error Control in TCP
• TCP maintains multiple timers for each connection
• TCP couples error control and congestion control (i.e., it
assumes that errors are caused by congestion)
TCP Timers
• TCP maintains multiple timers:
– Retransmission Timer:
• The timer is started during a transmission. A timeout causes a
retransmission
– Persist Timer
• Ensures that window size information is transmitted even if no data
is transmitted
– Keepalive Timer
• Detects crashes on the other end of the connection
– Other timers
• Delayed ACK timer, timeout of connection setup, abort timeout
(total timeout - keeps retransmitting till this timeout, then it kills the
connection), 2MSL timeout (when closing connection)
TCP Retransmission Timer
• Retransmission Timer:
– The setting of the retransmission timer is crucial for
efficiency
– Timeout value too small -> results in unnecessary
retransmissions
– Timeout value too large -> long waiting time before a
retransmission can be issued
– A problem is that the delays in the network are not fixed
– Therefore, the retransmission timers must be adaptive
Measuring TCP Retransmission Timers
[Figure: FTP session from aida.poly.edu to rigoletto.poly.edu]
• Transfer a file from aida to rigoletto
• Unplug the Ethernet cable in the middle of the file transfer
Interpreting the Measurements
• The intervals between retransmission attempts, in seconds, are:
1.03, 3, 6, 12, 24, 48, 64, 64, 64, 64, 64, 64, 64.
• The time between retransmissions is doubled each time (Exponential Backoff Algorithm)
• The timer is not increased beyond 64 seconds
• TCP gives up after the 13th attempt, about 9 minutes in total. This total timeout is configurable on some systems: tcp_ip_abort_interval is 2 minutes in Solaris and can be changed by the administrator; 9 minutes is the commonly used older value.
[Figure: time in seconds (0 to 600) plotted against transmission attempts (0 to 12)]
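The measured schedule can be approximated with a short sketch. It assumes a nominal first timeout of 1.5 sec (three 500 ms ticks), doubling on every attempt, and a cap of 64 sec; the function name and parameters are illustrative and not taken from any TCP implementation.

```python
def backoff_intervals(initial_rto=1.5, cap=64.0, attempts=13):
    """Return the nominal wait (in seconds) before each retransmission attempt."""
    intervals = []
    rto = initial_rto
    for _ in range(attempts):
        intervals.append(rto)
        # Double the timeout for the next attempt, but never exceed the cap.
        rto = min(rto * 2, cap)
    return intervals


intervals = backoff_intervals()
total_seconds = sum(intervals)
print(intervals)
print(f"total: {total_seconds} s, about {total_seconds / 60:.1f} min")
```

Under these assumptions the 13 waits add up to 542.5 sec, which is consistent with TCP giving up after roughly 9 minutes.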
TCP timers
• The time of the first timeout depends on when the timer was initialized relative to the timer ticks.
• This explains why the first timeout occurs at 1.03 sec and not 1.5 sec.
• If the base timer clock is 500 ms, the first timeout occurs after 3 timer ticks. Here this
happens to fall 1.03 sec after the first segment was sent. Subsequent
retransmissions follow at intervals of 3 sec, 6 sec, 12 sec, etc.
[Figure: timeline of timer ticks, 500 ms per tick. TCP sends the first segment somewhere within a tick; the retransmission timer first expires after three ticks (<1.5 sec; in this case 1.03 sec) and then after six ticks (3 sec).]
Adaptive mechanism
• The retransmission mechanism of TCP is adaptive
• The retransmission timers are set based on round-trip time (RTT) measurements
that TCP performs
[Figure: timeline of Segments 1 to 5 with ACK for Segment 1, ACK for Segment 2 + 3, ACK for Segment 4, and ACK for Segment 5; three measurements are marked RTT #1, RTT #2, and RTT #3.]
• The RTT is based on time
difference between segment
transmission and ACK
• But:
– TCP does not ACK each
segment
– Can’t start a second RTT
measurement if timing on
one segment is in progress
– Each connection has only
one timer
Computation of RTO in adaptive scheme
• Retransmission timer is set to a Retransmission Timeout (RTO) value.
• RTO is calculated based on the RTT measurements.
• The RTT measurements M are smoothed by the following estimators A (smoothed mean RTT value) and
D (smoothed mean deviation of RTT):
$\mathrm{Err} = M - A$
$A \leftarrow A + g\,\mathrm{Err} = (1-g)A + gM$
$D \leftarrow D + h\,(|\mathrm{Err}| - D) = (1-h)D + h\,|\mathrm{Err}|$
$\mathrm{RTO} = A + 4D$ (latest formula)
(book also says A+2D for initial value; we’ll use A+4D)
• The gains are set to h = 1/4 and g = 1/8
– In the formula for computing the new smoothed mean RTT A, 0.125
times the newly measured value (M) is added to 0.875 times the old smoothed
value of A
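The update rules above can be written as a small function. This is a minimal sketch: the function name and argument order are illustrative, and the sample values (A = 1, D = 1, M = 2) are the ones used in the worked example in these slides.

```python
def update_rto(a, d, m, g=0.125, h=0.25):
    """Apply one RTT sample m to the smoothed mean a and mean deviation d.

    Returns the new (a, d, rto) with rto = a + 4 * d.
    """
    err = m - a
    a = a + g * err              # A <- A + g * Err
    d = d + h * (abs(err) - d)   # D <- D + h * (|Err| - D)
    return a, d, a + 4 * d


a, d, rto = update_rto(a=1.0, d=1.0, m=2.0)
print(a, d, rto)
```

Note that D uses the error computed against the old value of A, which is why Err is evaluated before A is updated.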
Example of RTO computation (adaptive)
• Assume A=1, D=1 (initial values)
• Err = 2 - 1 = 1 (since M, the measured RTT, is 2)
• A = 1 + 0.125 × 1 = 1.125; D = 1 + 0.25 × (1 - 1) = 1
• RTO = A + 4D = 1.125 + 4 = 5.125
• This is why, in the figure, when segment 2 is lost it is
retransmitted after 5.125 sec.
[Figure: Segment 1 is acknowledged (ACK for Segment 1) after RTT = 2; Segment 2 is lost (X) and retransmitted after RTO = 5.125; ACK for Segment 2 + 3 follows.]
Karn’s Algorithm
• If an ACK for a retransmitted
segment is received, the sender
cannot tell if the ACK belongs to
the original or the
retransmission.
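[Figure: a segment is sent, a timeout occurs, and the segment is retransmitted; when the ACK arrives, it is unclear whether the RTT should be measured from the original segment or from the retransmission.]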
• Karn’s Algorithm:
– Don’t update A or D on any segments that have been
retransmitted.
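A sketch of how Karn's rule might sit on top of the A and D estimators: samples from retransmitted segments are simply ignored. The class and method names are hypothetical, and the initial values A = 1, D = 1 are chosen only for illustration.

```python
class RttEstimator:
    """Smoothed RTT estimator that skips ambiguous samples (Karn's rule)."""

    def __init__(self, a=1.0, d=1.0, g=0.125, h=0.25):
        self.a, self.d, self.g, self.h = a, d, g, h

    @property
    def rto(self):
        return self.a + 4 * self.d

    def on_ack(self, measured_rtt, was_retransmitted):
        """Feed one RTT sample; return True if it was used."""
        if was_retransmitted:
            # The ACK could belong to either transmission, so A and D stay put.
            return False
        err = measured_rtt - self.a
        self.a += self.g * err
        self.d += self.h * (abs(err) - self.d)
        return True


est = RttEstimator()
used_ambiguous = est.on_ack(2.0, was_retransmitted=True)
rto_after_ambiguous = est.rto
used_clean = est.on_ack(2.0, was_retransmitted=False)
rto_after_clean = est.rto
print(rto_after_ambiguous, rto_after_clean)
```

The first sample leaves the RTO unchanged; only the second, unambiguous sample moves it.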
RTO Calculation: Example
• At t1: RTO = 6 sec
• At t2: RTO= 2 * 6 = 12 sec
(exponential backoff)
• At t3: RTO is not updated
(Due to Karn’s algorithm)
[Figure: timeline marked t1 to t9 showing a SYN, a time-out and retransmitted SYN, SYN + ACK, ACK, Segments 1 to 6, ACKs for Segments 1 to 4, and three RTT measurements (RTT #1, RTT #2, RTT #3).]
Congestion control (Second topic of this
lecture)
• Most often, a packet loss in a network is due to an overflow at
a congested router (rather than due to a transmission error)
• A sender can detect lost packets through a:
• Timeout of a retransmission timer
• Receipt of a duplicate ACK
• TCP assumes that a packet loss is caused by congestion and
reduces the size of the sending window (cwnd)
• Algorithms that reduce and then reopen the sending window
as packets are lost:
– Congestion Avoidance
– Fast retransmit and Fast recovery
Recall Slow Start / Congestion Avoidance
• Here we give a recap of the normal operation of Slow Start
and Congestion Avoidance
if cwnd <= ssthresh then
    /* Slow Start Phase */
    Each time an ACK is received:
        cwnd = cwnd + segsize
else /* cwnd > ssthresh */
    /* Congestion Avoidance Phase */
    Each time an ACK is received:
        cwnd = cwnd + segsize * segsize / cwnd + segsize / 8
endif
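The per-ACK rule can be sketched as a single function. The name `on_ack` is illustrative, the formula (including the segsize / 8 term) is copied from the pseudocode as given in these slides, and the sample values (cwnd = ssthresh = 2560, MSS = 512) come from the worked example later in the lecture.

```python
def on_ack(cwnd, ssthresh, segsize):
    """Return the new cwnd after one ACK for new data (all values in bytes)."""
    if cwnd <= ssthresh:
        # Slow start: grow by one segment per ACK.
        return cwnd + segsize
    # Congestion avoidance: much slower growth per ACK.
    return cwnd + segsize * segsize / cwnd + segsize / 8


cwnd_after_slow_start = on_ack(2560, 2560, 512)
cwnd_after_ca = on_ack(cwnd_after_slow_start, 2560, 512)
print(cwnd_after_slow_start, round(cwnd_after_ca, 2))
```

In floating point the second step gives about 3221.33, which the slides report as 3222. Slow start still applies when cwnd equals ssthresh, because the test is `<=`.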
Congestion Avoidance Algorithm
• When congestion occurs (indicated by a timeout or the receipt of
duplicate ACKs),
– ssthresh is set to half the current window size (the
minimum of the advertised window (AW) and cwnd):
ssthresh = min(cwnd, AW) / 2, but at least 2 segments
– cwnd is changed according to:
cwnd = 1 segsize = 1 MSS bytes (in case of timeout only)
• When new data is acknowledged, cwnd is increased according
to whether the connection is in slow start or congestion avoidance (CA)
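The reaction to congestion can be sketched as follows. The function name and the `round_to_segsize` flag are assumptions made for illustration; the flag mirrors the rounding down to a multiple of the MSS that the timeout example in these slides applies, and the sample values (cwnd = 3222, AW = 5120, MSS = 512) are taken from that example.

```python
def on_congestion(cwnd, advertised_window, segsize, timeout, round_to_segsize=False):
    """Return (cwnd, ssthresh) in bytes after a congestion indication."""
    ssthresh = min(cwnd, advertised_window) // 2
    if round_to_segsize:
        # Round down to a whole number of segments.
        ssthresh = (ssthresh // segsize) * segsize
    # Never let ssthresh fall below two segments.
    ssthresh = max(ssthresh, 2 * segsize)
    if timeout:
        # Only a timeout collapses the window to a single segment.
        cwnd = segsize
    return cwnd, ssthresh


after_timeout = on_congestion(3222, 5120, 512, timeout=True, round_to_segsize=True)
after_dup_ack = on_congestion(3222, 5120, 512, timeout=False, round_to_segsize=True)
print(after_timeout, after_dup_ack)
```

A timeout resets cwnd to one segment, while a duplicate ACK lowers only ssthresh at this point.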
Slow Start / Congestion Avoidance
• A typical plot of cwnd for a TCP connection (segsize = 1500 bytes):
Accelerated retransmissions (Fast retransmit)
• TCP allows accelerated retransmissions (Fast Retransmit)
– If the receiver gets a segment out of order, it sends an ACK with
the sequence number it expects next.
– If the sender receives only one or two duplicate ACKs, it assumes the
segments were merely reordered; once the expected segment reaches
the receiver, the receiver sends the correct ACK.
– If a third duplicate ACK reaches the sender, it assumes the segment
was lost and retransmits immediately, without waiting for the
retransmission timer to expire. Hence it is called fast retransmit.
Fast Retransmit and Fast Recovery
• After the third duplicate ACK
(meaning fourth ACK) is received
by the sender, it transmits a single
segment without waiting for a
timeout to expire.
[Figure: the sender receives repeated ACK 100 segments and retransmits Data (100:200) without waiting for a timeout.]
• If the 3rd duplicate ACK (i.e., the fourth ACK with the same ACK number) is received:
ssthresh = min(cwnd, receiver’s advertised window) / 2
cwnd = ssthresh + 3 × segsize; then retransmit the segment
Reason: the TCP receiver has to issue an ACK every time it receives a new segment.
Three duplicate ACKs therefore imply that three segments got through the
network successfully, so the sender inflates cwnd.
• For each additional duplicate ACK received:
cwnd = cwnd + segsize
and transmit a segment if allowed by the new value of cwnd
• When an ACK arrives that acknowledges new data (this should be the ACK for the
retransmission above), set cwnd = ssthresh. This ACK also acknowledges the intermediate
segments sent between the lost packet and the receipt of the third duplicate ACK, so set
cwnd = cwnd + segsize. The connection is now in the congestion avoidance (CA) phase.
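The steps above can be sketched as a small state machine. The class, its method names, and the retransmission counter are hypothetical; the rounding of ssthresh down to a multiple of the segment size and the two-segment floor are assumptions borrowed from the congestion avoidance slides and their examples, and ordinary window growth outside recovery is left out.

```python
class FastRecoverySender:
    """Tracks cwnd and ssthresh (bytes) through fast retransmit and fast recovery."""

    def __init__(self, cwnd, ssthresh, advertised_window, segsize):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.advertised_window = advertised_window
        self.segsize = segsize
        self.dup_acks = 0
        self.retransmissions = 0

    def on_duplicate_ack(self):
        self.dup_acks += 1
        if self.dup_acks == 3:
            half = min(self.cwnd, self.advertised_window) // 2
            rounded = (half // self.segsize) * self.segsize
            self.ssthresh = max(rounded, 2 * self.segsize)
            # Inflate by the three segments known to have left the network.
            self.cwnd = self.ssthresh + 3 * self.segsize
            self.retransmissions += 1  # stands in for resending the lost segment
        elif self.dup_acks > 3:
            self.cwnd += self.segsize

    def on_new_ack(self):
        if self.dup_acks >= 3:
            # Deflate to ssthresh, then add one segment as described above.
            self.cwnd = self.ssthresh + self.segsize
        self.dup_acks = 0


sender = FastRecoverySender(cwnd=3222, ssthresh=2560, advertised_window=5120, segsize=512)
for _ in range(3):
    sender.on_duplicate_ack()
state_after_third_dup = (sender.cwnd, sender.ssthresh, sender.retransmissions)
sender.on_new_ack()
state_after_new_ack = (sender.cwnd, sender.ssthresh)
print(state_after_third_dup, state_after_new_ack)
```

With the values from the duplicate-ACK example in these slides, the sketch reproduces cwnd = 3072 after the third duplicate and cwnd = 2048 once new data is acknowledged. One simplification: it lowers ssthresh only on the third duplicate ACK.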
Example: computation of cwnd on previous
slide
• Up to and including ACK 2561, this TCP connection is in slow
start, and cwnd is increased by 1 MSS bytes each time an
ACK is received.
• Note that when cwnd = ssthresh, slow start is still applied.
Hence when ACK 2561 is received, cwnd = 2560 + 512 = 3072.
• When the last ACK shown on the previous slide is received,
the TCP connection is in congestion avoidance since cwnd
> ssthresh. Therefore, cwnd = cwnd + MSS × MSS / cwnd +
MSS / 8 = 3072 + 512 × 512 / 3072 + 512 / 8 ≈ 3222 (3221.33 rounded up)
Example: RTO timeout (see congestion
avoidance algorithm)
• Example of a retransmit based on a timeout
• When the segment is retransmitted, ssthresh is dropped to half of the
minimum of cwnd and the advertised window. Since the advertised window is
5120 bytes in this example, the minimum is cwnd = 3222. Half of 3222 is 1611, which is
rounded down to a multiple of the MSS, giving 1536 (see page 316 for this rounding down
concept).
[Figure: PSH 3073:3585(512) ack 10 is sent with cwnd=3222; ssthresh=2560 and is lost (X). On RTO expiry, PSH 3073:3585(512) ack 10 is retransmitted with cwnd=512; ssthresh=1536.]
Example: duplicate ACKs
(congestion avoidance algorithm and fast retransmit/recovery algorithm)
• In case of duplicate ACKs, both congestion avoidance algorithm and fast
retransmit/recovery algorithms apply
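[Figure: sender and receiver timeline.
Initial state: cwnd=3222; ssthresh=2560.
PSH 3073:3585 (512) ack 10 is lost (X); PSH 3585:4097 (512) ack 10, PSH 4097:4609 (512) ack 10, and PSH 4609:5121 (512) ack 10 follow.
The receiver returns ack 3073 four times. The sender state shown is cwnd=3222; ssthresh=1536, then cwnd=3222; ssthresh=1536, then cwnd=1536+3*512=3072; ssthresh=1536, at which point PSH 3073:3585 (512) ack 10 is retransmitted.
On ack 5121: cwnd=ssthresh=1536; ssthresh=1536; then cwnd=2048.]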
• For the reason for the last cwnd increase to 2048, see the last case in Fig. 21.11
Repacketization
• When TCP does a retransmission, it can send the missing data in
differently sized segments
• Increase segment size (if allowed by MSS limit) to improve efficiency
(new data arrives after first transmitted segment was lost)
[Figure: Data (1:100) is acknowledged with ACK 100; Data (100:200) is lost; new data (100 bytes) arrives from the application before the retransmission timer times out, so the retransmission is sent as Data (100:300) and acknowledged with ACK 300.]
Persist Timer in TCP
• Assume the window size goes down to zero and the ACK that
opens the window gets lost
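[Figure: sender and a 4K receiver buffer. The sender transmits 2K (SeqNo=0) and receives AckNo=2048 Win=2048; it transmits another 2K (SeqNo=2048) and receives AckNo=4096 Win=0, so the sender is blocked. The later AckNo=4096 Win=1024, which reopens the window, is lost.]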
• If ACK (see figure) is
lost, both sides are
blocked.
• Persist Timer:
Forces the sender to
periodically query the
receiver about its window
size (window probes)
Persist Timer
• The persist timer is started by the sender when the sliding window is zero
• The persist timer uses exponential backoff (initial value is 1.5 seconds), but it
is bounded to the range [5 sec, 60 sec]
• So the time intervals between timeouts, in seconds, are:
5, 5, 6, 12, 24, 48, 60, 60, …
– The first two are 5 because the first two timer values, 1.5 and 3,
are both raised to fall within the bound [5, 60]
• The window probe packet contains one byte of data
• TCP allows sender to send one byte beyond close of receiver window
• Persist timer never gives up (till connection gets aborted)
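The persist intervals listed above follow from clamping a doubling sequence. A minimal sketch, with an illustrative function name and parameters:

```python
def persist_intervals(count, initial=1.5, lower=5.0, upper=60.0):
    """Return the first `count` persist timer intervals in seconds."""
    intervals = []
    value = initial
    for _ in range(count):
        # Clamp the backed-off value into [lower, upper].
        intervals.append(min(max(value, lower), upper))
        # Keep doubling, but there is no need to grow past the upper bound.
        value = min(value * 2, upper)
    return intervals


first_eight = persist_intervals(8)
print(first_eight)
```

Unlike the retransmission timer, nothing in this loop ever stops: the sender keeps probing every 60 sec until the window opens or the connection is aborted.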
Keepalive Timer in TCP
• When a TCP connection has been idle for a long time, a
Keepalive timer reminds a station to check if the other side is
still there.
• A probe packet is sent if the connection has been idle for 2
hours
• Assume a probe has been sent from A to B:
(1) B is up and running: B responds with an ACK
(2) B has crashed and is down: A will send 10 more probes, each 75
seconds apart. If A does not get a response, it will close the connection
(3) B has rebooted: B will send a RST segment
(4) B is up, but unreachable: Looks to A the same as (2)
TCP Summary Contd.
• Bulk TCP data transfer:
– Flow control: sliding window (receiver paces sender)
– Error control: time-outs and retransmissions
• exponential backoff (in case of retransmits)
• RTO changing adaptively to measured RTTs
• Karn’s algorithm
– Congestion control: congestion window (sender has window)
• Slow start and congestion avoidance phases (normal operation)
• Lost packets (timeout or duplicate ACKs)
– congestion avoidance algorithm
– fast retransmit and fast recovery algorithm
• Because of the congestion recovery schemes, TCP’s ARQ scheme is
Go-back-N if an error (loss) is detected by a retransmission time-out,
but selective repeat if an error (loss) is detected by triple
duplicate ACKs.
– Repacketization
• Persist and Keep-alive timers
Different schemes for determining RTO
• Exponential backoff if a segment is retransmitted
• adaptive RTO as a function of RTT (A+4D)
– If an RTT measurement is in progress when a new segment is sent,
no RTT measurement is taken for the new segment
• Karn’s algorithm
– no RTT measurement on retransmitted segment