Retransmission TCP
Transcript

  • 1. Managing Retransmissions Using the Retransmission Queue The method for detecting lost segments and retransmitting them is conceptually simple. Each time we send a segment, we start a retransmission timer. This timer starts at a predetermined value and counts down over time. If the timer expires before an acknowledgment is received for the segment, we retransmit the segment. TCP uses this basic technique but implements it in a slightly different way. The reason for this is the need to efficiently deal with many segments that may be unacknowledged at once, to ensure that they are each retransmitted at the appropriate time if needed. The TCP system works according to the following specific sequence: o Placement On Retransmission Queue, Timer Start: As soon as a segment containing data is transmitted, a copy of the segment is placed in a data structure called the retransmission queue. A retransmission timer is started for the segment when it is placed on the queue. Thus, every segment is at some point placed in this queue. The queue is kept sorted by the time remaining in the retransmission timer, so the TCP software can keep track of which timers have the least time remaining before they expire. o Acknowledgment Processing: If an acknowledgment is received for a segment before its timer expires, the segment is removed from the retransmission queue. o Retransmission Timeout: If an acknowledgment is not received before the timer for a segment expires, a retransmission timeout occurs, and the segment is automatically retransmitted. Of course, we have no more guarantee that a retransmitted segment will be received than we had for the original segment. For this reason, after retransmitting a segment, it remains on the retransmission queue. The retransmission timer is reset, and the countdown begins again. Hopefully an acknowledgment will be received for the retransmission, but if not, the segment will be retransmitted again and the process repeated. 
Certain conditions may cause even repeated retransmissions of a segment to fail. We don't want TCP to just keep retransmitting forever, so TCP will only retransmit a lost segment a certain number of times before concluding that there is a problem and terminating the connection. Key Concept: To provide basic reliability for sent data, each device’s TCP implementation uses a retransmission queue. Each sent segment is placed on the queue and a retransmission timer started for it. When an acknowledgment is received for the data in the segment, it is removed from the retransmission queue. If the timer goes off before an acknowledgment is received the segment is retransmitted and the timer restarted.
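The queue behavior described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real TCP stack is written: the class name `RetransmissionQueue`, the fixed one-second timeout `RTO`, and the limit `MAX_RETRIES = 3` are assumptions chosen for the example, and "retransmitting" is reduced to returning sequence numbers.

```python
import heapq
from dataclasses import dataclass, field

RTO = 1.0          # assumed fixed retransmission timeout, in seconds
MAX_RETRIES = 3    # assumed retry limit before the connection is given up


@dataclass(order=True)
class Entry:
    expires_at: float                        # only field used for ordering
    seq: int = field(compare=False)          # sequence number of first byte
    length: int = field(compare=False)       # number of data bytes
    retries: int = field(default=0, compare=False)


class RetransmissionQueue:
    def __init__(self):
        self._heap = []  # kept ordered by time remaining (earliest expiry first)

    def on_send(self, seq, length, now):
        """Place a copy of the sent segment on the queue and start its timer."""
        heapq.heappush(self._heap, Entry(now + RTO, seq, length))

    def on_ack(self, ack_number):
        """Remove every segment whose last byte is below ack_number."""
        self._heap = [e for e in self._heap if e.seq + e.length > ack_number]
        heapq.heapify(self._heap)

    def on_tick(self, now):
        """Retransmit expired segments; return their sequence numbers."""
        resent = []
        while self._heap and self._heap[0].expires_at <= now:
            entry = heapq.heappop(self._heap)
            if entry.retries >= MAX_RETRIES:
                raise ConnectionError(f"segment {entry.seq} never acknowledged")
            entry.retries += 1
            entry.expires_at = now + RTO     # timer is reset, segment stays queued
            heapq.heappush(self._heap, entry)
            resent.append(entry.seq)
        return resent

    def pending(self):
        """Sequence numbers still awaiting acknowledgment."""
        return sorted(e.seq for e in self._heap)
```

A caller would invoke `on_send` for each transmitted segment, `on_ack` for each arriving acknowledgment, and `on_tick` periodically with the current time.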
  • 2. TCP uses a cumulative acknowledgment system. The Acknowledgment Number field in a segment received by a device indicates that all bytes of data with sequence numbers less than that value have been successfully received by the other device. A segment is considered acknowledged when all of its bytes have been acknowledged; in other words, when an Acknowledgment Number containing a value larger than the sequence number of its last byte is received. Policies For Dealing with Retransmission When Unacknowledged Segments Exist This then leads to an important question: how do we handle retransmissions when there are subsequent segments outstanding beyond the lost segment? In our example above, when the server experiences a retransmission timeout on Segment #3, it must decide what to do about Segment #4, when it simply doesn't know whether or not the client received it. In our “worst-case scenario”, we have 19 segments that may or may not have shown up at the client after the first one that was lost. We have two different possible ways to handle this situation. Retransmit Only Timed-Out Segments This is the more “conservative”, or if you prefer, “optimistic” approach. We retransmit only the segment that timed out, hoping that the other segments beyond it were successfully received. This method is best if the segments after the timed-out segment actually showed up. It doesn't work so well if they did not. In the latter case, each segment would have to time out individually and be retransmitted. Imagine that in our “worst-case scenario” all 20 500-byte segments were lost. We would have to wait for Segment #1 to time out and be retransmitted. This retransmission would be acknowledged (hopefully) but then we would get stuck waiting for Segment #2 to time out and be resent. We would have to do this many times. Retransmit All Outstanding Segments This is the more “aggressive” or “pessimistic” method. 
Whenever a segment times out we re-send not only it but all other segments that are still unacknowledged. This method ensures that any time there is a holdup with acknowledgments, we “refresh” all outstanding segments to give the other device an extra chance at receiving them in case they too were lost. In the case where all 20 segments were lost, this saves substantial amounts of time over the “optimistic” approach. The problem here is that these retransmissions may not be necessary. If the first of 20 segments was lost and the other 19 were actually received, we'd be re-sending 9,500 bytes of data (plus headers) for no reason.
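To make the trade-off between the two policies concrete, here is a small sketch that counts timeouts and needlessly re-sent bytes for the 20-segment, 500-byte scenario. It is a simplified model under stated assumptions: the function names are invented for this example, every retransmission is assumed to arrive, and the policy labels `"single"` and `"all"` are just names for the two approaches described above.

```python
SEGMENT_SIZE = 500   # bytes per segment, as in the example above
NUM_SEGMENTS = 20


def is_acknowledged(seq, length, ack_number):
    """A segment is acknowledged once ack_number exceeds its last byte."""
    last_byte = seq + length - 1
    return ack_number > last_byte


def simulate(policy, lost):
    """Return (timeouts, unnecessary_bytes) for a set of lost segment indices.

    policy is "single" (retransmit only the timed-out segment) or
    "all" (retransmit every outstanding segment). Retransmissions are
    assumed to always arrive, which real networks do not guarantee.
    """
    received = [i not in lost for i in range(NUM_SEGMENTS)]
    timeouts = 0
    unnecessary = 0
    while not all(received):
        first_missing = received.index(False)   # oldest unacknowledged segment
        timeouts += 1
        if policy == "single":
            received[first_missing] = True
        else:
            for i in range(first_missing, NUM_SEGMENTS):
                if received[i]:
                    unnecessary += SEGMENT_SIZE  # receiver already had this one
                received[i] = True
    return timeouts, unnecessary
```

With all 20 segments lost, `simulate("single", set(range(20)))` needs 20 separate timeouts while `simulate("all", set(range(20)))` needs one; with only the first segment lost, `simulate("all", {0})` re-sends 9,500 bytes that had already arrived.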
  • 3. The optional TCP selective acknowledgment feature provides a more elegant way of handling subsequent segments when a retransmission timer expires. When a device receives a non-contiguous segment it includes a special Selective Acknowledgment (SACK) option in its regular acknowledgment that identifies non-contiguous segments that have already been received, even if they are not yet acknowledged. This saves the original sender from having to retransmit them. TCP Adaptive Retransmission and Retransmission Timer Calculations Whenever a TCP segment is transmitted, a copy of it is also placed on the retransmission queue. When the segment is placed on the queue, a retransmission timer is started for the segment, which starts from a particular value and counts down to zero. It is this timer that controls how long a segment can remain unacknowledged before the sender gives up, concludes that it is lost and sends it again. The length of time we use for the retransmission timer is thus very important. If it is set too low, we might start retransmitting a segment that was actually received, because we didn't wait long enough for the acknowledgment of that segment to arrive. Conversely, if we set the timer too long, we waste time waiting for an acknowledgment that will never arrive, reducing overall performance. Difficulties in Choosing the Duration of the Retransmission Timer Ideally, we would like to set the retransmission timer to a value just slightly larger than the round-trip time (RTT) between the two TCP devices, that is, the typical time it takes to send a segment from a client to a server and the server to send an acknowledgment back to the client (or the other way around, of course). The problem is that there is no such “typical” round-trip time. 
There are two main reasons for this: o Differences In Connection Distance: Suppose you are at work in the United States, and during your lunch hour you are transferring a large file between your workstation and a local server over a 100 Mbps Fast Ethernet connection. At the same time, you are downloading a picture of your nephew from your sister's personal Web site, which is connected to the Internet using an analog modem to an ISP in a small town near Lima, Peru. Would you want both of these TCP connections to use the same retransmission timer value? I certainly hope not! o Transient Delays and Variability: The amount of time it takes to send data between any two devices will vary over time due to various happenings on the internetwork: fluctuations in traffic, router loads and so on. To see an example of this for yourself, try typing “ping www.tcpipguide.com” from the command line of an Internet-connected PC and you'll see how the reported times can vary. Adaptive Retransmission Based on Round-Trip Time Calculations
  • 4. It is for these reasons that TCP does not attempt to use a static, single number for its retransmission timers. Instead, TCP uses a dynamic, or adaptive retransmission scheme. TCP attempts to determine the approximate round-trip time between the devices, and adjusts it over time to compensate for increases or decreases in the average delay. The practical issues of how this is done are important, but are not covered in much detail in the main TCP standard. RFC 2988, Computing TCP's Retransmission Timer, discusses the issue extensively. Round-trip times can “bounce” up and down, as we have seen, so we want to aim for an average RTT value for the connection. This average should respond to consistent movement up or down in the RTT without overreacting to a few very slow or fast acknowledgments. To allow this to happen, the RTT calculation uses a smoothing formula: New RTT = (α * Old RTT) + ((1 - α) * Newest RTT Measurement) Where “α” (alpha) is a smoothing factor between 0 and 1. Higher values of “α” (closer to 1) provide better smoothing and avoid sudden changes as a result of one very fast or very slow RTT measurement. Conversely, this also slows down how quickly TCP reacts to more sustained changes in round-trip time. Lower values of alpha (closer to 0) make the RTT change more quickly in reaction to changes in measured RTT, but can cause “over-reaction” when RTTs fluctuate wildly. Acknowledgment Ambiguity Measuring the round-trip time between two devices is simple in concept: note the time that a segment is sent, note the time that an acknowledgment is received, and subtract the two. The measurement is trickier in actual implementation, however. One of the main potential “gotchas” occurs when a segment is assumed lost and is retransmitted. The retransmitted segment carries nothing that distinguishes it from the original. 
When an acknowledgment is received for this segment, it is unclear whether this corresponds to the retransmission or the original segment. (Even though we decided the segment was lost and retransmitted it, it's possible the segment eventually got there, after taking a long time; or that the segment got there quickly but the acknowledgment took a long time!) This is called acknowledgment ambiguity, and is not trivial to solve. We can't just decide to assume that an acknowledgment always goes with the oldest copy of the segment sent, because this makes the round-trip time appear too high. We also don't want to just assume an acknowledgment always goes with the latest sending of the segment, as that may artificially lower the average round-trip time. Refinements to RTT Calculation and Karn's Algorithm
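The smoothing formula given earlier (New RTT = α * Old RTT + (1 - α) * Newest RTT Measurement) is easy to try out numerically. The sketch below applies it to a short series of measurements containing one slow outlier. The function names and the default α of 0.875 are illustrative assumptions, not values taken from the text above.

```python
def smooth_rtt(old_rtt, measurement, alpha=0.875):
    """One step of the smoothing formula; alpha must lie between 0 and 1."""
    return alpha * old_rtt + (1 - alpha) * measurement


def smooth_series(initial_rtt, measurements, alpha=0.875):
    """Apply smooth_rtt to each measurement in turn, returning every estimate."""
    estimates = []
    rtt = initial_rtt
    for m in measurements:
        rtt = smooth_rtt(rtt, m, alpha)
        estimates.append(rtt)
    return estimates


# Measurements in milliseconds: steady at 100, with one slow 300 ms outlier.
samples = [100.0, 300.0, 100.0]
high_alpha = smooth_series(100.0, samples, alpha=0.875)  # [100.0, 125.0, 121.875]
low_alpha = smooth_series(100.0, samples, alpha=0.5)     # [100.0, 200.0, 150.0]
```

With the higher α the single outlier moves the estimate only from 100 to 125, while the lower α jumps to 200, which is the over-reaction the text warns about.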
  • 5. TCP's solution to round-trip time calculation is based on the use of a technique called Karn's algorithm, after its inventor, Phil Karn. The main change this algorithm makes is the separation of the calculation of average round-trip time from the calculation of the value to use for timers on retransmitted segments. The first change made under Karn's algorithm is to not use measured round-trip times for any segments that are retransmitted in the calculation of the overall average round-trip time for the connection. This completely eliminates the problem of acknowledgment ambiguity. However, this by itself would not allow increased delays due to retransmissions to affect the average round-trip time. For this, we need the second change: incorporation of a timer backoff scheme for retransmitted segments. We start by setting the retransmission timer for each newly-transmitted segment based on the current average round-trip time. When a segment is retransmitted, the timer is not reset to the same value it was set for the initial transmission. It is “backed off” (increased) using a multiplier (typically 2) to give the retransmission more time to be received. The timer continues to be increased until a retransmission is successful, up to a certain maximum value. This prevents retransmissions from being sent too quickly and further adding to network congestion. Once the retransmission succeeds, the round-trip timer is kept at the longer (backed-off) value until a valid round-trip time can be measured on a segment that is sent and acknowledged without retransmission. This permits a device to respond with longer timers to occasional circumstances that cause delays to persist for a period of time on a connection, while eventually having the round-trip time settle back to a long-term average when normal conditions resume. 
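The two changes that make up Karn's algorithm can be sketched as a small state machine. This is a hedged illustration rather than a faithful implementation: the class name `KarnTimer`, the `safety_factor` of 2 used to derive the timeout from the average RTT, the smoothing factor of 0.875, and the 64-second cap are all assumptions made for the example. Only the backoff multiplier of 2 comes from the text above.

```python
class KarnTimer:
    """Tracks the average RTT and the retransmission timeout separately."""

    def __init__(self, initial_rtt, alpha=0.875, multiplier=2.0,
                 safety_factor=2.0, max_timeout=64.0):
        self.avg_rtt = initial_rtt
        self.alpha = alpha                  # assumed smoothing factor
        self.multiplier = multiplier        # backoff multiplier (typically 2)
        self.safety_factor = safety_factor  # assumed margin above the average RTT
        self.max_timeout = max_timeout      # assumed upper bound on the timer
        self.timeout = safety_factor * initial_rtt

    def on_timeout(self):
        """A segment timed out: back the timer off, up to the maximum."""
        self.timeout = min(self.timeout * self.multiplier, self.max_timeout)

    def on_ack(self, measured_rtt, was_retransmitted):
        """Process an acknowledgment for a segment."""
        if was_retransmitted:
            # Ambiguous sample: ignore it and keep the backed-off timeout.
            return
        # Valid sample: update the average and derive a fresh timeout from it.
        self.avg_rtt = (self.alpha * self.avg_rtt
                        + (1 - self.alpha) * measured_rtt)
        self.timeout = self.safety_factor * self.avg_rtt
```

Keeping `avg_rtt` and `timeout` as separate fields mirrors the separation the text describes: retransmitted segments influence only the timeout (through backoff), never the average.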
Reducing Send Window Size To Reduce The Rate Data Is Sent Let's go back to our earlier example so I can hopefully explain better what I mean, but let’s make a few changes. First, to keep things simple, let’s just look at the transmissions made from the client to the server, not the server’s replies (other than acknowledgments)—this is illustrated in Figure 222. As before, the client sends 140 bytes to the server. After sending the 140 bytes, the client has 220 bytes remaining in its usable window—360 in the send window less the 140 bytes it just sent. Sometime later, the server receives the 140 bytes and puts them in the buffer. Now, in an “ideal world”, the 140 bytes go into the buffer, are acknowledged and immediately removed from the buffer. Another way of thinking of this is that the buffer is of “infinite size” and can hold as much as the client can send. The buffer's free space remains 360 bytes in size, so the same window size can be advertised back to the client. This was the “simplification” in the previous example.
  • 6. As long as the server can process the data as fast as it comes in, it will keep the window size at 360 bytes. The client, upon receipt of the acknowledgment of 140 bytes and the same window size it had before, “slides” the full 360-byte window 140 bytes to the right. Since there are now 0 unacknowledged bytes, the client can now once again send 360 bytes of data. These correspond to the 220 bytes that were formerly in the usable window, plus 140 new bytes for the ones that were just acknowledged. In the “real world”, however, that server might be dealing with dozens, hundreds or even thousands of TCP connections. The TCP might not be able to process the data immediately. Alternately, it is possible the application itself might not be ready for the 140 bytes for whatever reason. In either case, the server's TCP may not be able to immediately remove all 140 bytes from the buffer. If so, upon sending an acknowledgment back to the client, it will want to change the window size that it advertises to the client, to reflect the fact that the buffer is partially filled. Suppose that we receive 140 bytes as above, but are able to send only 40 bytes to the application, leaving 100 bytes in the buffer. When we send back the acknowledgment for the 140 bytes, the server can reduce the window it advertises (the client's send window) by 100, to 260. When the client receives this segment from the server it will see the acknowledgment of the 140 bytes sent and slide its window 140 bytes to the right. However, as it slides this window, it reduces its size to only 260 bytes. We can consider this like sliding the left edge of the window 140 bytes, but the right edge only 40 bytes. The new, smaller window ensures that the server receives a maximum of 260 bytes from the client, which will fit in the 260 bytes remaining in its receive buffer. This is illustrated in the first exchange of messages (Steps #1 through #3) at the top of Figure 226. 
Reducing Send Window Size To Stop The Sending of New Data What if the server is so bogged down that it can't process any of the bytes received? Let’s suppose that the next transmission from the client is 180 bytes in size, but the server is so busy it can’t remove any of them. It could buffer the 180 bytes and in the acknowledgment it sends for those bytes, reduce the window size by the same amount: from 260 to 80. When the client received the acknowledgment for 180 bytes it would see the window size had reduced by 180 bytes as well. It would “slide” its window by the same amount as the window size was reduced! This is effectively like the server saying “I acknowledge receipt of 180 bytes, but I am not allowing you to send any new bytes to replace them”. Another way of looking at this is that the left edge of the window slides 180 bytes while the right edge remained fixed. And as long as the right edge of the window doesn't move, the client can't send any more data than it could before receipt of the acknowledgment. This is the middle exchange (Steps #4 to #6) in Figure 226.
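The window arithmetic in the two exchanges above can be checked with a short sketch. The `Receiver` class and the `slide` helper are hypothetical names introduced for this illustration, and byte offsets start at 0 for simplicity; only the numbers (360-byte buffer, 140- and 180-byte segments) come from the example.

```python
class Receiver:
    """Models the server's receive buffer and the window it advertises."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.buffered = 0   # bytes received but not yet taken by the application

    def receive(self, nbytes, consumed):
        """Buffer nbytes, let the application take `consumed` bytes,
        and return the window size to advertise in the acknowledgment."""
        free = self.buffer_size - self.buffered
        if nbytes > free:
            raise ValueError("sender exceeded the advertised window")
        self.buffered += nbytes
        self.buffered -= min(consumed, self.buffered)
        return self.buffer_size - self.buffered


def slide(left_edge, acked, window):
    """Client side: return the new (left, right) edges of the send window."""
    new_left = left_edge + acked
    return new_left, new_left + window


server = Receiver(360)
w1 = server.receive(140, consumed=40)   # 100 bytes stay buffered -> window 260
edges1 = slide(0, 140, w1)              # left moves 140, right moves only 40
w2 = server.receive(180, consumed=0)    # nothing removed -> window 80
edges2 = slide(edges1[0], 180, w2)      # left moves 180, right edge stays put
```

After the second exchange the right edge is still at 400, so the client can send no more data than it could before the acknowledgment arrived.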
  • 7. The TCP sliding window system is used not just for ensuring reliability through acknowledgments and retransmissions—it is also the basis for TCP’s flow control mechanism. By increasing or reducing the size of its receive window, a device can raise or lower the rate at which its connection partner sends it data. In the case where a device becomes extremely busy, it can even reduce the receive window to zero, closing it; this will halt any further transmissions of data until the window is reopened. Problem of shrinking window What if the server were so overloaded that we actually needed to reduce the size of the buffer itself? Say memory was short and the operating system said “I know you have 360 bytes allocated for the receive buffer for this connection, but I need to free up memory so now you only have 240”. The server still can't immediately process the 140 bytes it received, so it would need to drop the window size it sent back to the client all the way from 360 bytes down to 100 (240 in the total buffer less the 140 already received). In effect, doing this actually moves the right edge of the client's send window back to the left. It says “not only can't you send more data when you receive this acknowledgment, but you now can send less”. In TCP parlance, this is called shrinking the window. There's a very serious problem with doing this, however: while the original 140 bytes were in transit from the client to the server, the client still thought it had 360 bytes of total window, of which 220 bytes were usable (360 less 140). The client may well have already sent some of that 220 bytes of data to the server before it gets notification that the server has shrunk the window! If so, and the server reduces its buffer to 240 bytes with 140 used, when those 220 bytes show up at the server, only 100 will fit and any additional ones will need to be discarded. This will force the client to have to retransmit that data, which is inefficient. 
Figure 227 illustrates graphically how this situation would play out. A phenomenon called shrinking the window occurs when a device reduces its receive window so much that its partner device’s usable transmit window shrinks in size (meaning that the right edge of its send window moves to the left). Since this can result in data already in transit having to be discarded, devices must instead reduce their receive window size more gradually. A device that reduces its receive window to zero is said to have closed the window. The other device’s send window is thus closed; it may not send regular data segments. It may, however, send probe segments to check the status of the window, thus making sure it does not miss notification when the window reopens. The MSS parameter ensures that we don't send segments that are too large— TCP is not allowed to create a segment larger than the MSS. Unfortunately, the basic sliding windows mechanism doesn't provide any minimum size of segment that can be transmitted. In fact, not only is it possible for a device to send very small, inefficient segments, the simplest implementation of flow control using
  • 8. unrestricted window size adjustments ensures that under conditions of heavy load, window size will become small, leading to significant performance reduction! How Silly Window Syndrome Occurs To see how this can happen, let's consider an example that is a variation on the one we’ve been using so far in this section. We'll assume the MSS is 360 and a client/server pair where again, the server's initial receive window is set to this same value, 360. This means the client can send a “full-sized” segment to the server. As long as the server can keep removing the data from the buffer as fast as the client sends it, we should have no problem. (In reality the buffer size would normally be larger than the MSS.) Now, imagine that instead, the server is bogged down for whatever reason while the client needs to send it a great deal of data. For simplicity, let's say that the server is only able to remove 1 byte of data from the buffer for every 3 it receives. Let's say it also removes 40 additional bytes from the buffer during the time it takes for the next client's segment to arrive. Here's what will happen: 1. The client's send window is 360, and it has lots of data to send. It immediately sends a 360 byte segment to the server. This uses up its entire send window. 2. When the server gets this segment it acknowledges it. However, it can only remove 120 bytes so the server reduces the window size from 360 to 120. It sends this in the Window field of the acknowledgment. 3. The client receives an acknowledgment of 360 bytes, and sees that the window size has been reduced to 120. It wants to send its data as soon as possible, so it sends off a 120 byte segment. 4. The server has removed 40 more bytes from the buffer by the time the 120-byte segment arrives. The buffer thus contains 200 bytes (240 from the first segment, less the 40 removed). The server is able to immediately process one-third of those 120 bytes, or 40 bytes. 
This means 80 bytes are added to the 200 that already remain in the buffer, so 280 bytes are used up. The server must reduce the window size to 80 bytes. 5. The client will see this reduced window size and send an 80-byte segment. 6. The server started with 280 bytes and removed 40 to yield 240 bytes left. It receives 80 bytes from the client, removes one third, so 53 are added to the buffer, which becomes 293 bytes. It reduces the window size to 67 bytes (360-293). This process, which is illustrated in Figure 228, will continue for many rounds, with the window size getting smaller and smaller, especially if the server gets even more overloaded. Its rate of clearing the buffer may decrease even more, and the window may close entirely. The basic TCP sliding window system sets no minimum size on transmitted segments. Under certain circumstances, this can result in a situation where many small, inefficient segments are sent, rather than a smaller number of large ones. Affectionately termed silly window syndrome
  • 9. (SWS), this phenomenon can occur either as a result of a recipient advertising window sizes that are too small, or a transmitter being too aggressive in immediately sending out very small amounts of data.
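The numbered walk-through of silly window syndrome above can be reproduced with a few lines of arithmetic. This is a sketch of that specific example only: the function name `simulate_sws` is invented here, the client is assumed to always fill the whole advertised window, and the server is assumed to remove one third of each arriving segment (rounded to the nearest byte) plus 40 bytes between segments, as in the example.

```python
def simulate_sws(buffer_size=360, drain_between=40, rounds=3):
    """Return the window size advertised by the server after each round."""
    buffered = 0            # bytes sitting in the server's receive buffer
    window = buffer_size    # window currently advertised to the client
    windows = []
    for round_number in range(rounds):
        if round_number > 0:
            # Bytes removed while the next segment is in transit.
            buffered -= min(drain_between, buffered)
        segment = window                    # client sends as much as allowed
        removed = round(segment / 3)        # server processes one byte in three
        buffered += segment - removed
        window = buffer_size - buffered     # new, smaller advertised window
        windows.append(window)
    return windows


windows = simulate_sws()   # [120, 80, 67], matching steps 2, 4 and 6 above
```

Each returned value is also the size of the next segment the client sends, which shows how the segments keep getting smaller.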