42. Transmission Media
Transmission medium:: the physical path
between transmitter and receiver.
Repeaters or amplifiers may be used to extend
the length of the medium.
Communication of electromagnetic waves is
guided or unguided.
Guided media :: waves are guided along a physical
path (e.g., twisted pair, coaxial cable and optical
fiber).
Unguided media:: means for transmitting but not
guiding electromagnetic waves (e.g., the atmosphere
and outer space).
45. Transmission Media
Twisted Pair
Two insulated wires arranged in a spiral pattern
Copper or steel coated with copper
The signal is transmitted through one wire and a
ground reference is transmitted in the other wire.
Typically twisted pair is installed in building
telephone wiring.
Local loop connection to central telephone
exchange is twisted pair.
46. Transmission Media
Twisted Pair
Limited in distance, bandwidth and data rate due
to problems with attenuation, interference and
noise
Issue: cross-talk due to interference from other signals
“shielding” wire (shielded twisted pair (STP)) with
metallic braid or sheathing reduces interference.
“twisting” reduces low-frequency interference and
crosstalk.
47. UTP (Unshielded Twisted Pair)
Category 3 corresponds to ordinary voice-grade twisted pair
found in abundance in most office buildings.
Category 5 (used for Fast Ethernet) is much more tightly
twisted.
48. Transmission Media
Digital Subscriber Line (DSL) [LG&W p.137]
Telephone companies originally transmitted within
the 0 to 4 kHz range to reduce crosstalk. Loading
coils were added within the subscriber loop to
provide a flatter transfer function to further
improve voice transmission within the 3 kHz band
while increasing attenuation at the higher
frequencies.
ADSL (Asymmetric Digital Subscriber Line)
Uses existing twisted pair lines to provide higher
bit rates that are possible with unloaded twisted
pairs (i.e., no loading coils on subscriber loop.)
49. Transmission Media
DSL
asymmetric, bidirectional digital transmissions:
the network transmits downstream at speeds
ranging from 1.536 Mbps to 6.144 Mbps
users transmit upstream at speeds ranging
[higher frequencies] from 64 kbps to 640 kbps
0 to 4 kHz used for conventional analog telephone signals
50. Transmission Media
DSL
ITU-T G.992.1 ADSL standard uses Discrete
Multitone (DMT) that divides the bandwidth into
a large number of small subchannels.
A splitter is required to separate voice signals
from the data signal.
The binary information is distributed among the
subchannels. Each subchannel uses QAM.
DMT adapts to line conditions by avoiding
subchannels with poor SNR.
53. Transmission Media
Coaxial Cable
Discussion divided into two basic categories
for coax used in LANs:
50-ohm cable [baseband]
75-ohm cable [broadband or single channel
baseband]
In general, coax has better noise immunity for
higher frequencies than twisted pair.
Coaxial cable provides much higher
bandwidth than twisted pair.
However, cable is ‘bulky’.
54. Transmission Media
Baseband Coax
50-ohm cable is used exclusively for digital
transmissions
Uses Manchester encoding, geographical limit is a
few kilometers.
10Base5 Thick Ethernet :: thick (10 mm) coax
10 Mbps, 500 m. max segment length, 100
devices/segment, awkward to handle and install.
10Base2 Thin Ethernet :: thin (5 mm) coax
10 Mbps, 185 m. max segment length, 30
devices/segment, easier to handle, uses T-shaped
connectors.
55. Transmission Media
Broadband Coax
75-ohm cable (CATV system standard)
Used for both analog and digital signaling.
Analog signaling – frequencies up to 500 MHZ are
possible.
When FDM used, referred to as broadband.
For long-distance transmission of analog signals,
amplifiers are needed every few kilometers.
57. Transmission Media
Optical Fiber
Optical fiber :: a thin flexible medium capable of
conducting optical rays. Optical fiber consists of a
very fine cylinder of glass (core) surrounded by
concentric layers of glass (cladding).
a signal-encoded beam of light (a fluctuating
beam) is transmitted by total internal reflection.
Total internal reflection occurs in the core
because it has a higher optical density (index of
refraction) than the cladding.
Attenuation in the fiber can be kept low by
controlling the impurities in the glass.
59. Transmission Media
Optical Fiber
Lowest signal losses are for ultrapure fused silica –
but this is hard to manufacture.
Optical fiber acts as a waveguide for
frequencies in the range 10^14 to 10^15 Hz, which
covers the visible and part of the infrared spectrum.
Three standard wavelengths : 850 nanometers (nm.),
1300 nm, 1500 nm.
First-generation optical fiber :: 850 nm, 10’s Mbps
using LED (light-emitting diode) sources.
Second and third generation optical fiber :: 1300 and
1500 nm using ILD (injection laser diode) sources,
gigabits/sec.
60. Transmission Media
Optical Fiber
Attenuation loss is lower at higher wavelengths.
There are two types of detectors used at the
receiving end to convert light into electrical
energy (photo diodes):
PIN detectors – less expensive, less sensitive
APD detectors
ASK is commonly used to transmit digital data
over optical fiber {referred to as intensity
modulation}.
61. Transmission Media
Optical Fiber
Three techniques:
Multimode step-index
Multimode graded-index
Single-mode step-index
Presence of multiple paths causes differences in
delay, so optical rays interfere with each other.
A narrow core can create a single direct path
which yields higher speeds.
WDM (Wavelength Division Multiplexing)
yields more available capacity.
65. 5: DataLink Layer
Link Layer: Introduction
Some terminology:
hosts and routers are nodes
(bridges and switches too)
communication channels
that connect adjacent nodes
along communication path
are links
wired links
wireless links
LANs
2-PDU is a frame,
encapsulates datagram
“link”
data-link layer has responsibility of
transferring datagram from one node
to adjacent node over a link
66. 5: DataLink Layer
Link layer: context
Datagram transferred
by different link
protocols over
different links:
e.g., Ethernet on first
link, frame relay on
intermediate links, 802.11
on last link
Each link protocol
provides different
services
e.g., may or may not
provide rdt over link
transportation analogy
trip from Princeton to
Lausanne
limo: Princeton to JFK
plane: JFK to Geneva
train: Geneva to Lausanne
tourist = datagram
transport segment =
communication link
transportation mode =
link layer protocol
travel agent = routing
algorithm
67. 5: DataLink Layer
Link Layer Services
Framing, link access:
encapsulate datagram into frame, adding header,
trailer
channel access if shared medium
‘physical addresses’ used in frame headers to identify
source, dest
different from IP address!
Reliable delivery between adjacent nodes
we learned how to do this already (chapter 3)!
seldom used on low bit error link (fiber, some twisted
pair)
wireless links: high error rates
Q: why both link-level and end-end reliability?
68. 5: DataLink Layer
Link Layer Services (more)
Flow Control:
pacing between adjacent sending and receiving nodes
Error Detection:
errors caused by signal attenuation, noise.
receiver detects presence of errors:
signals sender for retransmission or drops frame
Error Correction:
receiver identifies and corrects bit error(s) without
resorting to retransmission
Half-duplex and full-duplex
with half duplex, nodes at both ends of link can transmit,
but not at same time
69. 5: DataLink Layer
Adaptors Communicating
link layer protocol
link layer implemented
in “adaptor” (aka NIC)
Ethernet card, PCMCIA
card, 802.11 card
sending side:
encapsulates datagram in
a frame
adds error checking bits,
rdt, flow control, etc.
receiving side
looks for errors, rdt, flow
control, etc
extracts datagram,
passes to rcving node
adapter is semi-autonomous
link & physical layers
[Figure: sending node and rcving node, each with an adapter; the datagram travels between the adapters inside a frame]
70. 5: DataLink Layer
Error Detection
EDC= Error Detection and Correction bits (redundancy)
D = Data protected by error checking, may include header
fields
• Error detection not 100% reliable!
• protocol may miss some errors, but rarely
• larger EDC field yields better detection and correction
71. Parity Checking
5: DataLink Layer
Single Bit
Parity:
Detect single bit errors
Two Dimensional Bit Parity:
Detect and correct single bit errors
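The two parity schemes can be sketched in a few lines. This is a minimal illustration assuming even parity; the function names (`parity_bit`, `two_d_parity`, `locate_single_error`) are hypothetical and not from the slides.

```python
def parity_bit(bits):
    """Even-parity bit: chosen so the total number of 1s is even."""
    return sum(bits) % 2


def two_d_parity(rows):
    """Return (row parities, column parities) for a grid of data bits."""
    row_par = [parity_bit(r) for r in rows]
    col_par = [parity_bit(c) for c in zip(*rows)]
    return row_par, col_par


def locate_single_error(rows, row_par, col_par):
    """Return (row, col) of a single flipped data bit, or None if parities match."""
    bad_rows = [i for i, r in enumerate(rows) if parity_bit(r) != row_par[i]]
    bad_cols = [j for j, c in enumerate(zip(*rows)) if parity_bit(c) != col_par[j]]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    return None
```

A single parity bit only reveals that some odd number of bits flipped; the row and column parities together pinpoint one flipped bit, which the receiver can then invert.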
72. 5: DataLink Layer
Internet checksum
Goal: detect “errors” (e.g., flipped bits) in
transmitted segment (note: used at transport
layer only)
Sender:
treat segment contents
as sequence of 16-bit
integers
checksum: addition (1’s
complement sum) of
segment contents
sender puts checksum
value into UDP
checksum field
Receiver:
compute checksum of
received segment
check if computed checksum
equals checksum field value:
NO - error detected
YES - no error detected. But
maybe errors nonetheless?
More later ….
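A minimal sketch of the sender and receiver steps above. It assumes the usual Internet convention of storing the complement of the 1's complement sum in the checksum field and padding odd-length data with a zero byte; the function names are hypothetical.

```python
def ones_complement_sum16(data: bytes) -> int:
    """1's complement sum of the data viewed as 16-bit big-endian integers."""
    if len(data) % 2:
        data += b"\x00"  # assumed padding for odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # wrap the carry around
    return total


def internet_checksum(data: bytes) -> int:
    """Value the sender places in the checksum field."""
    return ~ones_complement_sum16(data) & 0xFFFF


def receiver_accepts(data: bytes, checksum_field: int) -> bool:
    """Receiver recomputes the checksum and compares it with the field."""
    return internet_checksum(data) == checksum_field
```

A match does not prove the segment is error-free: two bit flips that cancel in the sum leave the checksum unchanged.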
73. 5: DataLink Layer
Checksumming: Cyclic Redundancy Check
view data bits, D, as a binary number
choose r+1 bit pattern (generator), G
goal: choose r CRC bits, R, such that
<D,R> exactly divisible by G (modulo 2)
receiver knows G, divides <D,R> by G. If non-zero
remainder: error detected!
can detect all burst errors less than r+1 bits
widely used in practice (ATM, HDLC)
74. 5: DataLink Layer
CRC Example
Want:
$D \cdot 2^r \oplus R = nG$
equivalently:
$D \cdot 2^r = nG \oplus R$
equivalently:
if we divide $D \cdot 2^r$ by
G, want remainder
R:
$R = \mathrm{remainder}\left[\frac{D \cdot 2^r}{G}\right]$
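The modulo-2 division can be sketched as repeated XOR. This is an illustrative sketch, assuming bit strings as input and a generator whose leading bit is 1; the names `mod2_remainder`, `crc_bits` and `crc_ok` are hypothetical.

```python
def mod2_remainder(bits: str, generator: str) -> str:
    """Remainder of bits divided by generator using modulo-2 (XOR) arithmetic."""
    r = len(generator) - 1
    work = [int(b) for b in bits]
    gen = [int(b) for b in generator]
    for i in range(len(work) - r):
        if work[i] == 1:  # subtract (XOR) the generator aligned at position i
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-r:])


def crc_bits(data: str, generator: str) -> str:
    """R = remainder of D * 2^r divided by G."""
    r = len(generator) - 1
    return mod2_remainder(data + "0" * r, generator)


def crc_ok(frame: str, generator: str) -> bool:
    """Receiver check: <D,R> must leave a zero remainder."""
    return set(mod2_remainder(frame, generator)) == {"0"}
```

For example, with D = 101110 and G = 1001 (so r = 3), `crc_bits` returns 011 and the transmitted frame is 101110011.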
75. 5: DataLink Layer
Multiple Access Links and Protocols
Two types of “links”:
point-to-point
PPP for dial-up access
point-to-point link between Ethernet switch and host
broadcast (shared wire or medium)
traditional Ethernet
upstream HFC
802.11 wireless LAN
76. 5: DataLink Layer
Multiple Access protocols
single shared broadcast channel
two or more simultaneous transmissions by
nodes: interference
only one node can send successfully at a time
multiple access protocol
distributed algorithm that determines how nodes
share channel, i.e., determine when node can
transmit
communication about channel sharing must use
channel itself!
what to look for in multiple access protocols:
77. Ideal Multiple Access Protocol
5: DataLink Layer
Broadcast channel of rate R bps
1. When one node wants to transmit, it can send at
rate R.
2. When M nodes want to transmit, each can send at
average rate R/M
3. Fully decentralized:
no special node to coordinate transmissions
no synchronization of clocks, slots
4. Simple
78. 5: DataLink Layer
MAC Protocols: a taxonomy
Three broad classes:
Channel Partitioning
divide channel into smaller “pieces” (time slots,
frequency, code)
allocate piece to node for exclusive use
Random Access
channel not divided, allow collisions
“recover” from collisions
“Taking turns”
tightly coordinate shared access to avoid collisions
79. Channel Partitioning MAC protocols: TDMA
5: DataLink Layer
TDMA: time division multiple access
access to channel in "rounds"
each station gets fixed length slot (length = pkt trans time) in each round
unused slots go idle
example: 6-station LAN, 1,3,4 have pkt, slots 2,5,6 idle
TDM (Time Division Multiplexing): channel divided into N time
slots, one per user; inefficient with low duty cycle users and at
light load.
FDM (Frequency Division Multiplexing): frequency subdivided.
80. 5: DataLink Layer
Channel Partitioning MAC protocols: FDMA
FDMA: frequency division multiple access
channel spectrum divided into frequency bands
each station assigned fixed frequency band
unused transmission time in frequency bands go idle
example: 6-station LAN, 1,3,4 have pkt, frequency bands 2,5,6 idle
81. Channel Partitioning (CDMA)
5: DataLink Layer
CDMA (Code Division Multiple Access)
unique “code” assigned to each user; i.e., code set
partitioning
used mostly in wireless broadcast channels (cellular,
satellite, etc)
all users share same frequency, but each user has own
“chipping” sequence (i.e., code) to encode data
encoded signal = (original data) X (chipping sequence)
decoding: inner-product of encoded signal and chipping
sequence
allows multiple users to “coexist” and transmit
simultaneously with minimal interference (if codes are
“orthogonal”)
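The encode and decode rules above can be sketched as follows. This is a toy model, assuming data bits and chips take values in {-1, +1} and that the channel simply adds the senders' signals; the function names and the 4-chip codes in the usage note are illustrative only.

```python
def cdma_encode(data_bits, code):
    """Encoded signal = each data bit multiplied by the whole chipping sequence."""
    return [d * c for d in data_bits for c in code]


def cdma_decode(signal, code):
    """Recover data bits via the inner product with the chipping sequence."""
    m = len(code)
    bits = []
    for i in range(0, len(signal), m):
        inner = sum(s * c for s, c in zip(signal[i:i + m], code))
        bits.append(1 if inner > 0 else -1)
    return bits


def channel_sum(*signals):
    """Assumed channel model: simultaneous transmissions add chip by chip."""
    return [sum(chips) for chips in zip(*signals)]
```

With the orthogonal codes `[1, 1, 1, 1]` and `[1, -1, 1, -1]`, two senders can transmit at once and each receiver still recovers its own sender's bits from the summed signal, because the other sender's contribution cancels in the inner product.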
84. 5: DataLink Layer
Random Access Protocols
When node has packet to send
transmit at full channel data rate R.
no a priori coordination among nodes
two or more transmitting nodes -> “collision”,
random access MAC protocol specifies:
how to detect collisions
how to recover from collisions (e.g., via delayed
retransmissions)
Examples of random access MAC protocols:
slotted ALOHA
ALOHA
CSMA, CSMA/CD, CSMA/CA
85. 5: DataLink Layer
Slotted ALOHA
Assumptions
all frames same size
time is divided into
equal size slots, time to
transmit 1 frame
nodes start to transmit
frames only at
beginning of slots
nodes are synchronized
if 2 or more nodes
transmit in slot, all
nodes detect collision
Operation
when node obtains fresh
frame, it transmits in
next slot
no collision, node can
send new frame in next
slot
if collision, node
retransmits frame in
each subsequent slot
with prob. p until
success
86. 5: DataLink Layer
Slotted ALOHA
Pros
single active node can
continuously transmit
at full rate of channel
highly decentralized:
only slots in nodes
need to be in sync
simple
Cons
collisions, wasting
slots
idle slots
nodes may be able to
detect collision in less
than time to transmit
packet
87. Slotted Aloha efficiency
5: DataLink Layer
Efficiency is the long-run
fraction of successful slots
when there are many nodes, each
with many frames to send
Suppose N nodes with
many frames to send, each
transmits in slot with
probability p
prob that 1st node has
success in a slot = $p(1-p)^{N-1}$
prob that any node has a
success = $Np(1-p)^{N-1}$
For max efficiency
with N nodes, find p*
that maximizes
$Np(1-p)^{N-1}$
For many nodes,
take limit of $Np^*(1-p^*)^{N-1}$
as N goes to
infinity, gives 1/e = .37
At best: channel
used for useful
transmissions 37%
of time!
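The limit can be checked numerically. A small sketch, using the fact that Np(1-p)^(N-1) is maximized at p* = 1/N; the function names are hypothetical.

```python
import math


def slotted_aloha_efficiency(n: int, p: float) -> float:
    """Probability that exactly one of n nodes transmits in a slot."""
    return n * p * (1 - p) ** (n - 1)


def best_slotted_efficiency(n: int) -> float:
    """Efficiency at the maximizing transmission probability p* = 1/n."""
    return slotted_aloha_efficiency(n, 1.0 / n)


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, round(best_slotted_efficiency(n), 4))
    print("1/e =", round(1 / math.e, 4))
```

The best-case efficiency starts at 0.5 for two nodes and falls toward 1/e as N grows.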
88. 5: DataLink Layer
Pure (unslotted) ALOHA
unslotted Aloha: simpler, no synchronization
when frame first arrives
transmit immediately
collision probability increases:
frame sent at t0 collides with other frames sent in [t0-1, t0+1]
89. 5: DataLink Layer
Pure Aloha efficiency
P(success by given node) = P(node transmits) .
P(no other node transmits in [t0-1, t0]) .
P(no other node transmits in [t0, t0+1])
= $p \cdot (1-p)^{N-1} \cdot (1-p)^{N-1}$
= $p \cdot (1-p)^{2(N-1)}$
… choosing optimum p and then letting N -> infinity ...
= 1/(2e) = .18
Even worse !
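A numeric check of the pure ALOHA limit. This sketch sums the per-node success probability over N nodes, giving Np(1-p)^(2(N-1)), which is maximized at p* = 1/(2N-1); the function names are hypothetical.

```python
import math


def pure_aloha_efficiency(n: int, p: float) -> float:
    """Probability that some node succeeds: n * p * (1-p)^(2(n-1))."""
    return n * p * (1 - p) ** (2 * (n - 1))


def best_pure_efficiency(n: int) -> float:
    """Efficiency at the maximizing probability p* = 1/(2n - 1)."""
    return pure_aloha_efficiency(n, 1.0 / (2 * n - 1))


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, round(best_pure_efficiency(n), 4))
    print("1/(2e) =", round(1 / (2 * math.e), 4))
```

The vulnerable interval is twice as long as in slotted ALOHA, which is why the limiting efficiency is halved.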
90. CSMA (Carrier Sense Multiple Access)
5: DataLink Layer
CSMA: listen before transmit:
If channel sensed idle: transmit entire frame
If channel sensed busy, defer transmission
Human analogy: don’t interrupt others!
91. 5: DataLink Layer
CSMA collisions
collisions can still
occur:
propagation delay means
two nodes may not hear
each other’s transmission
collision:
entire packet
transmission
time wasted
spatial layout of nodes
note:
role of distance & propagation
delay in determining collision
probability
92. CSMA/CD (Collision Detection)
5: DataLink Layer
CSMA/CD: carrier sensing, deferral as in
CSMA
collisions detected within short time
colliding transmissions aborted, reducing channel
wastage
collision detection:
easy in wired LANs: measure signal strengths,
compare transmitted, received signals
difficult in wireless LANs: receiver shut off while
transmitting
human analogy: the polite conversationalist
94. “Taking Turns” MAC protocols
5: DataLink Layer
channel partitioning MAC protocols:
share channel efficiently and fairly at high load
inefficient at low load: delay in channel access, 1/N
bandwidth allocated even if only 1 active node!
Random access MAC protocols
efficient at low load: single node can fully utilize
channel
high load: collision overhead
“taking turns” protocols
look for best of both worlds!
95. “Taking Turns” MAC protocols
5: DataLink Layer
Polling:
master node
“invites” slave
nodes to transmit
in turn
concerns:
polling overhead
latency
single point of
failure (master)
Token passing:
control token passed
from one node to next
sequentially.
token message
concerns:
token overhead
latency
single point of failure
(token)
96. 5: DataLink Layer
Summary of MAC protocols
What do you do with a shared medium?
Channel Partitioning, by time, frequency or code
Time Division,Code Division, Frequency Division
Random partitioning (dynamic),
ALOHA, S-ALOHA, CSMA, CSMA/CD
carrier sensing: easy in some technologies (wire), hard in
others (wireless)
CSMA/CD used in Ethernet
Taking Turns
polling from a central site, token passing
97. 5: DataLink Layer
LAN technologies
Data link layer so far:
services, error detection/correction, multiple
access
Next: LAN technologies
addressing
Ethernet
hubs, bridges, switches
802.11
PPP
ATM
98. 5: DataLink Layer
LAN Addresses and ARP
32-bit IP address:
network-layer address
used to get datagram to destination IP network (recall IP
network definition)
LAN (or MAC or physical or Ethernet) address:
used to get datagram from one interface to another
physically-connected interface (same network)
48 bit MAC address (for most LANs)
burned in the adapter ROM
99. 5: DataLink Layer
LAN Addresses and ARP
Each adapter on LAN has unique LAN address
100. 5: DataLink Layer
LAN Address (more)
MAC address allocation administered by IEEE
manufacturer buys portion of MAC address space (to assure
uniqueness)
Analogy:
(a) MAC address: like Social Security Number
(b) IP address: like postal address
MAC flat address => portability
can move LAN card from one LAN to another
IP hierarchical address NOT portable
depends on IP network to which node is attached
101. 5: DataLink Layer
Recall earlier routing discussion
[Figure: network with interfaces 223.1.1.1 through 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27; hosts A, B, E]
Starting at A, given IP
datagram addressed to B:
look up net. address of B, find B
on same net. as A
link layer sends datagram to B
inside link-layer frame
[Figure: frame carrying frame source, dest address (B’s MAC addr, A’s MAC addr) followed by the datagram: datagram source, dest address (A’s IP addr, B’s IP addr) and IP payload]
102. ARP: Address Resolution Protocol
5: DataLink Layer
Each IP node (Host,
Router) on LAN has ARP
table
ARP Table: IP/MAC
address mappings for
some LAN nodes
< IP address; MAC address; TTL>
TTL (Time To Live): time
after which address mapping
will be forgotten (typically
20 min)
Question: how to determine
MAC address of B
knowing B’s IP address?
103. 5: DataLink Layer
ARP protocol
A wants to send datagram to B,
and A knows B’s IP address.
Suppose B’s MAC address is
not in A’s ARP table.
A broadcasts ARP query packet,
containing B's IP address
all machines on LAN receive
ARP query
B receives ARP packet, replies
to A with its (B's) MAC address
frame sent to A’s MAC address
(unicast)
A caches (saves) IP-to-MAC
address pair in its ARP table
until information becomes old
(times out)
soft state: information that
times out (goes away) unless
refreshed
ARP is “plug-and-play”:
nodes create their ARP tables
without intervention from
net administrator
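The soft-state behaviour of the ARP table can be sketched as a small cache with expiry. This is an illustration only: the class name `ArpTable`, its methods, and the injectable `clock` parameter are hypothetical; the 20-minute default follows the typical TTL mentioned earlier.

```python
import time


class ArpTable:
    """IP-to-MAC cache whose entries are forgotten after a TTL (soft state)."""

    def __init__(self, ttl_seconds=20 * 60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock    # injectable so the expiry logic can be tested
        self.entries = {}     # IP address -> (MAC address, expiry time)

    def learn(self, ip, mac):
        """Cache a mapping, e.g. after receiving an ARP reply."""
        self.entries[ip] = (mac, self.clock() + self.ttl)

    def lookup(self, ip):
        """Return the cached MAC, or None if unknown or timed out."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, expires = entry
        if self.clock() >= expires:
            del self.entries[ip]   # stale: forget it
            return None
        return mac
```

A `None` result from `lookup` is the case where the node would broadcast an ARP query and call `learn` when the unicast reply arrives.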
104. 5: DataLink Layer
Routing to another LAN
walkthrough: send datagram from A to B via R
assume A knows B's IP address
Two ARP tables in router R, one for each IP network (LAN)
In routing table at source Host, find router 111.111.111.110
In ARP table at source, find MAC address E6-E9-00-17-BB-4B, etc
105. 5: DataLink Layer
A creates datagram with source A, destination B
A uses ARP to get R’s MAC address for 111.111.111.110
A creates link-layer frame with R's MAC address as dest, frame
contains A-to-B IP datagram
A’s data link layer sends frame
R’s data link layer receives frame
R removes IP datagram from Ethernet frame, sees it is destined to
B
R uses ARP to get B’s physical layer address
R creates frame containing A-to-B IP datagram, sends to B
106. 5: DataLink Layer
Ethernet
“dominant” LAN technology:
cheap $20 for 100Mbs!
first widely used LAN technology
Simpler, cheaper than token LANs and ATM
Kept up with speed race: 10, 100, 1000 Mbps
Metcalfe’s Ethernet
sketch
107. 5: DataLink Layer
Ethernet Frame Structure
Sending adapter encapsulates IP datagram (or other
network layer protocol packet) in Ethernet frame
Preamble:
7 bytes with pattern 10101010 followed by one byte
with pattern 10101011
used to synchronize receiver, sender clock rates
108. Ethernet Frame Structure (more)
5: DataLink Layer
Addresses: 6 bytes
if adapter receives frame with matching destination address, or with
broadcast address (eg ARP packet), it passes data in frame to net-layer
protocol
otherwise, adapter discards frame
Type: indicates the higher layer protocol (mostly IP, but
others may be supported, such as Novell IPX and
AppleTalk)
CRC: checked at receiver, if error is detected, the frame is
simply dropped
109. Unreliable, connectionless service
5: DataLink Layer
Connectionless: No handshaking between sending and
receiving adapter.
Unreliable: receiving adapter doesn’t send acks or nacks to
sending adapter
stream of datagrams passed to network layer can have gaps
gaps will be filled if app is using TCP
otherwise, app will see the gaps
110. 5: DataLink Layer
Ethernet uses CSMA/CD
No slots
adapter doesn’t transmit if
it senses that some other
adapter is transmitting, that
is, carrier sense
transmitting adapter aborts
when it senses that another
adapter is transmitting, that
is, collision detection
Before attempting a
retransmission, adapter
waits a random time,
that is, random access
111. 5: DataLink Layer
Ethernet CSMA/CD algorithm
1. Adapter gets datagram from
network layer and creates frame
2. If adapter senses channel
idle, it starts to transmit
frame. If it senses channel
busy, waits until channel
idle and then transmits
3. If adapter transmits entire
frame without detecting
another transmission, the
adapter is done with frame !
4. If adapter detects another
transmission while
transmitting, aborts and
sends jam signal
5. After aborting, adapter
enters exponential
backoff: after the mth
collision, adapter chooses a
K at random from
{0,1,2,…,2^m-1}. Adapter
waits K*512 bit times and
returns to Step 2
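The backoff rule in Step 5 can be sketched as below. The cap of the exponent at 10 is an assumption taken from standard Ethernet practice (truncated binary exponential backoff), and the function and constant names are hypothetical.

```python
import random

BACKOFF_SLOT_BITS = 512   # wait is K * 512 bit times
MAX_EXPONENT = 10         # assumed cap: K never exceeds 1023


def choose_k(m, rng=random):
    """After the m-th collision, pick K uniformly from {0, ..., 2^m - 1}."""
    exponent = min(m, MAX_EXPONENT)
    return rng.randrange(2 ** exponent)


def backoff_seconds(k, bit_rate_bps):
    """Convert K into a wait time for a given link bit rate."""
    return k * BACKOFF_SLOT_BITS / bit_rate_bps
```

On 10 Mbps Ethernet one bit time is 0.1 microsec, so the largest value K = 1023 gives a wait of 1023 x 512 x 0.1 microsec, a little over 52 msec.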
112. 5: DataLink Layer
Ethernet’s CSMA/CD (more)
Jam Signal: make sure all other
transmitters are aware of
collision; 48 bits;
Bit time: .1 microsec for 10
Mbps Ethernet ;
for K=1023, wait time is
about 50 msec
Exponential Backoff:
Goal: adapt retransmission
attempts to estimated current
load
heavy load: random wait will
be longer
first collision: choose K from
{0,1}; delay is K x 512 bit
transmission times
after second collision: choose
K from {0,1,2,3}…
after ten collisions, choose K
from {0,1,2,3,4,…,1023}
See/interact with Java
applet on AWL Web site:
highly recommended !
113. 5: DataLink Layer
CSMA/CD efficiency
t_prop = max prop delay between 2 nodes in LAN
t_trans = time to transmit max-size frame
$\text{efficiency} = \frac{1}{1 + 5\,t_{prop}/t_{trans}}$
Efficiency goes to 1 as t_prop goes to 0
Goes to 1 as t_trans goes to infinity
Much better than ALOHA, but still decentralized, simple, and cheap
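The efficiency formula, efficiency = 1 / (1 + 5 t_prop / t_trans), is easy to evaluate directly. A minimal sketch; the function name is hypothetical and both times must be given in the same unit.

```python
def csma_cd_efficiency(t_prop: float, t_trans: float) -> float:
    """Approximate CSMA/CD efficiency: 1 / (1 + 5 * t_prop / t_trans)."""
    return 1.0 / (1.0 + 5.0 * t_prop / t_trans)
```

For instance, when the propagation delay is one fifth of the frame transmission time, the formula gives an efficiency of 0.5; shrinking the delay or lengthening the frame pushes it toward 1.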
114. Ethernet Technologies: 10Base2
5: DataLink Layer
10: 10Mbps; 2: under 200 meters max cable length
thin coaxial cable in a bus topology
repeaters used to connect up to multiple segments
repeater repeats bits it hears on one interface to its other interfaces: physical layer device only!
has become a legacy technology
115. 5: DataLink Layer
10BaseT and 100BaseT
10/100 Mbps rate; latter called “fast ethernet”
T stands for Twisted Pair
Nodes connect to a hub: “star topology”; 100 m max distance between nodes and hub
Hubs are essentially physical-layer repeaters:
bits coming in one link go out all other links
no frame buffering
no CSMA/CD at hub: adapters detect collisions
provides net management functionality
116. 5: DataLink Layer
Manchester encoding
Used in 10BaseT, 10Base2
Each bit has a transition
Allows clocks in sending and receiving nodes to
synchronize to each other
no need for a centralized, global clock among nodes!
Hey, this is physical-layer stuff!
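A sketch of Manchester encoding as two half-bit signal levels per data bit. The polarity is an assumption (the IEEE 802.3 convention: 1 is low-to-high, 0 is high-to-low); the slides do not state which convention is used, and the function names are hypothetical.

```python
def manchester_encode(bits):
    """Map each bit to two half-bit levels with a transition in the middle."""
    levels = []
    for b in bits:
        # Assumed IEEE 802.3 polarity: 1 -> low,high ; 0 -> high,low
        levels.extend((0, 1) if b else (1, 0))
    return levels


def manchester_decode(levels):
    """Recover bits from level pairs; a pair without a transition is an error."""
    bits = []
    for i in range(0, len(levels), 2):
        pair = tuple(levels[i:i + 2])
        if pair == (0, 1):
            bits.append(1)
        elif pair == (1, 0):
            bits.append(0)
        else:
            raise ValueError("missing mid-bit transition")
    return bits
```

Because every bit carries a transition, the receiver can recover the sender's clock from the signal itself, at the cost of doubling the signalling rate.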
117. 5: DataLink Layer
Gbit Ethernet
use standard Ethernet frame format
allows for point-to-point links and shared broadcast
channels
in shared mode, CSMA/CD is used; short distances
between nodes to be efficient
uses hubs, called here “Buffered Distributors”
Full-Duplex at 1 Gbps for point-to-point links
10 Gbps now !
118. 5: DataLink Layer
Interconnecting LAN segments
Hubs
Bridges
Switches
Remark: switches are essentially multi-port bridges.
What we say about bridges also holds for switches!
119. 5: DataLink Layer
Interconnecting with hubs
Backbone hub interconnects LAN segments
Extends max distance between nodes
But individual segment collision domains become one large collision
domain
if a node in CS and a node in EE transmit at same time: collision
Can’t interconnect 10BaseT & 100BaseT
120. 5: DataLink Layer
Bridges
Link layer device
stores and forwards Ethernet frames
examines frame header and selectively forwards frame
based on MAC dest address
when frame is to be forwarded on segment, uses
CSMA/CD to access segment
transparent
hosts are unaware of presence of bridges
plug-and-play, self-learning
bridges do not need to be configured
121. 5: DataLink Layer
Bridges: traffic isolation
Bridge installation breaks LAN into LAN segments
bridges filter packets:
same-LAN-segment frames not usually forwarded onto
other LAN segments
segments become separate collision domains
[Figure: a bridge joins two LAN segments, each a separate collision domain of hubs and hosts, within one LAN (IP network)]
122. 5: DataLink Layer
Forwarding
How to determine to which LAN segment to forward
frame?
• Looks like a routing problem...
123. 5: DataLink Layer
Self learning
A bridge has a bridge table
entry in bridge table:
(Node LAN Address, Bridge Interface, Time Stamp)
stale entries in table dropped (TTL can be 60 min)
bridges learn which hosts can be reached through which
interfaces
when frame received, bridge “learns” location of sender:
incoming LAN segment
records sender/location pair in bridge table
124. 5: DataLink Layer
Filtering/Forwarding
When bridge receives a frame:
index bridge table using MAC dest address
if entry found for destination
then{
if dest on segment from which frame arrived
then drop the frame
else forward the frame on interface indicated
}
else flood
forward on all but the interface
on which the frame arrived
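The filtering/forwarding rule, combined with self-learning, can be sketched as a small class. This is a simplified illustration: the name `LearningBridge` and its method are hypothetical, and time stamps with aging of stale entries are left out.

```python
class LearningBridge:
    """Toy bridge: learns sender locations, then filters, forwards or floods."""

    def __init__(self, interfaces):
        self.interfaces = list(interfaces)
        self.table = {}   # MAC address -> interface it was last seen on

    def handle_frame(self, src, dst, in_if):
        """Return the list of interfaces the frame is sent out on."""
        self.table[src] = in_if    # self-learning: sender is on in_if
        out_if = self.table.get(dst)
        if out_if is None:
            # unknown destination: flood on all but the arrival interface
            return [i for i in self.interfaces if i != in_if]
        if out_if == in_if:
            return []              # destination on the same segment: drop
        return [out_if]            # forward on the indicated interface
```

Starting from an empty table on a three-interface bridge, a frame from C (interface 1) to D is flooded to interfaces 2 and 3; once D replies from interface 2, later frames between C and D are forwarded on a single interface.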
125. 5: DataLink Layer
Bridge example
Suppose C sends frame to D and D replies back with
frame to C.
Bridge receives frame from C
notes in bridge table that C is on interface 1
because D is not in table, bridge sends frame into interfaces
2 and 3
frame received by D
126. 5: DataLink Layer
Bridge Learning: example
D generates frame for C, sends
bridge receives frame
notes in bridge table that D is on interface 2
bridge knows C is on interface 1, so selectively forwards
frame to interface 1
127. Interconnection without backbone
5: DataLink Layer
Not recommended for two reasons:
- single point of failure at Computer Science hub
- all traffic between EE and SE must pass over CS segment
129. 5: DataLink Layer
Bridges Spanning Tree
for increased reliability, desirable to have redundant,
alternative paths from source to dest
with multiple paths, cycles result - bridges may multiply
and forward frame forever
solution: organize bridges in a spanning tree by disabling
subset of interfaces
130. 5: DataLink Layer
Some bridge features
Isolates collision domains resulting in higher total
max throughput
limitless number of nodes and geographical coverage
Can connect different Ethernet types
Transparent (“plug-and-play”): no configuration
necessary
131. 5: DataLink Layer
Bridges vs. Routers
both store-and-forward devices
routers: network layer devices (examine network layer headers)
bridges are link layer devices
routers maintain routing tables, implement routing
algorithms
bridges maintain bridge tables, implement filtering, learning
and spanning tree algorithms
132. 5: DataLink Layer
Routers vs. Bridges
Bridges + and -
+ Bridge operation is simpler requiring less packet processing
+ Bridge tables are self learning
- All traffic confined to spanning tree, even when alternative
bandwidth is available
- Bridges do not offer protection from broadcast storms
133. 5: DataLink Layer
Routers vs. Bridges
Routers + and -
+ arbitrary topologies can be supported, cycling is limited
by TTL counters (and good routing protocols)
+ provide protection against broadcast storms
- require IP address configuration (not plug and play)
- require higher packet processing
bridges do well in small (few hundred hosts) while
routers used in large networks (thousands of hosts)
134. 5: DataLink Layer
Ethernet Switches
Essentially a multi-interface
bridge
layer 2 (frame) forwarding,
filtering using LAN addresses
Switching: A-to-A’ and B-to-
B’ simultaneously, no
collisions
large number of interfaces
often: individual hosts, star-connected
into switch
Ethernet, but no collisions!
135. 5: DataLink Layer
Ethernet Switches
cut-through switching: frame forwarded from input
to output port without waiting for assembly of
entire frame
slight reduction in latency
combinations of shared/dedicated, 10/100/1000
Mbps interfaces
136. Not an atypical LAN (IP network)
5: DataLink Layer
[Figure: LAN combining dedicated and shared links]
137. 5: DataLink Layer
Summary comparison
                   hubs   bridges   routers   switches
traffic isolation  no     yes       yes       yes
plug & play        yes    yes       no        yes
optimal routing    no     no        yes       no
cut through        yes    no        no        yes
138. 5: DataLink Layer
IEEE 802.11 Wireless LAN
802.11b
2.4-2.485 GHz unlicensed radio
spectrum
up to 11 Mbps
direct sequence spread
spectrum (DSSS) in physical
layer
all hosts use same
chipping code
widely deployed, using base
stations
802.11a
5-6 GHz range
up to 54 Mbps
802.11g
2.4-2.485 GHz range
up to 54 Mbps
All use CSMA/CA for
multiple access
All have base-station
and ad-hoc network
versions
139. 5: DataLink Layer
Base station approach
Wireless host communicates with a base station
base station = access point (AP)
Basic Service Set (BSS) (a.k.a. “cell”) contains:
wireless hosts
access point (AP): base station
BSS’s combined to form distribution system (DS)
140. 5: DataLink Layer
Ad Hoc Network approach
No AP (i.e., base station)
wireless hosts communicate with each other
to get packet from wireless host A to B may need to route
through wireless hosts X,Y,Z
Applications:
“laptop” meeting in conference room, car
interconnection of “personal” devices
battlefield
IETF MANET
(Mobile Ad hoc Networks)
working group
141. 5: DataLink Layer
IEEE 802.11: multiple access
Collision if 2 or more nodes transmit at same time
CSMA makes sense:
get all the bandwidth if you’re the only one transmitting
shouldn’t cause a collision if you sense another transmission
Collision detection doesn’t work: hidden terminal
problem
142. IEEE 802.11 MAC Protocol: CSMA/CA
5: DataLink Layer
802.11 CSMA: sender
- if sense channel idle for DIFS
sec.
then transmit entire frame (no
collision detection)
-if sense channel busy
then binary backoff
802.11 CSMA receiver
- if received OK
return ACK after SIFS
(ACK is needed due to hidden
terminal problem)
143. 5: DataLink Layer
Collision avoidance mechanisms
Problem:
two nodes, hidden from each other, transmit complete frames
to base station
wasted bandwidth for long duration !
Solution:
small reservation packets
nodes track reservation interval with internal “network
allocation vector” (NAV)
144. Collision Avoidance: RTS-CTS exchange
sender transmits short RTS (request to send) packet: indicates duration of transmission
receiver replies with short CTS (clear to send) packet, notifying (possibly hidden) nodes
hidden nodes will not transmit for specified duration: NAV
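
The NAV bookkeeping described above can be sketched in a few lines. This is only an illustration: the `Station` class, its method names, and the millisecond time unit are hypothetical, and real 802.11 stations combine the NAV with physical carrier sensing.

```python
class Station:
    """Tracks a network allocation vector (NAV) from overheard RTS/CTS frames."""

    def __init__(self):
        # Time until which the medium is considered reserved by someone else.
        self.nav_expires = 0.0

    def overhear(self, now, duration):
        # RTS and CTS both announce how long the medium will stay busy.
        # Keep the latest expiry if several reservations overlap.
        self.nav_expires = max(self.nav_expires, now + duration)

    def may_transmit(self, now):
        # The station defers while its NAV has not yet expired.
        return now >= self.nav_expires


hidden = Station()
hidden.overhear(now=0.0, duration=2.5)  # CTS heard at t = 0, reserving 2.5 ms
```

A node that cannot hear the sender's RTS still hears the receiver's CTS, so it defers for the announced duration even though it never senses the data frame itself.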
145. Collision Avoidance: RTS-CTS exchange
RTS and CTS short:
collisions less likely, of shorter
duration
end result similar to collision
detection
IEEE 802.11 allows:
CSMA
CSMA/CA: reservations
polling from AP
146. A word about Bluetooth
Low-power, small radius,
wireless networking
technology
10-100 meters
omnidirectional
not line-of-sight infrared
Interconnects gadgets
2.4-2.5 GHz unlicensed
radio band
up to 721 kbps
Interference from wireless
LANs, digital cordless
phones, microwave ovens:
frequency hopping helps
MAC protocol supports:
error correction
ARQ
Each node has a 12-bit
address
147. Point to Point Data Link Control
one sender, one receiver, one link: easier than
broadcast link:
no Media Access Control
no need for explicit MAC addressing
e.g., dialup link, ISDN line
popular point-to-point DLC protocols:
PPP (point-to-point protocol)
HDLC: High level data link control (data link used to be considered “high layer” in protocol stack!)
148. PPP Design Requirements [RFC 1547]
packet framing: encapsulation of network-layer
datagram in data link frame
carry network layer data of any network layer protocol
(not just IP) at same time
ability to demultiplex upwards
bit transparency: must carry any bit pattern in the
data field
error detection (no correction)
connection liveness: detect, signal link failure to
network layer
network layer address negotiation: endpoint can
learn/configure each other’s network address
149. PPP non-requirements
no error correction/recovery
no flow control
out of order delivery OK
no need to support multipoint links (e.g., polling)
Error recovery, flow control, data re-ordering
all relegated to higher layers!
150. PPP Data Frame
Flag: delimiter (framing)
Address: does nothing (only one option)
Control: does nothing; in the future possible multiple
control fields
Protocol: upper layer protocol to which frame delivered (eg,
PPP-LCP, IP, IPCP, etc)
151. PPP Data Frame
info: upper layer data being carried
check: cyclic redundancy check for error detection
152. Byte Stuffing
“data transparency” requirement: data field must be
allowed to include flag pattern <01111110>
Q: is received <01111110> data or flag?
Sender: adds (“stuffs”) extra < 01111110> byte after
each < 01111110> data byte
Receiver:
two 01111110 bytes in a row: discard first byte, continue
data reception
single 01111110: flag byte
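
A sketch of byte stuffing in code. Note the assumption: the slide describes a simplified scheme that doubles the flag byte, whereas this sketch uses the escape-byte variant that PPP in HDLC-like framing (RFC 1662) specifies, where 0x7D is an escape and the escaped byte is XORed with 0x20. The function names are illustrative only.

```python
FLAG = 0x7E  # 01111110, frame delimiter
ESC = 0x7D   # 01111101, control escape (RFC 1662 variant, not the slide's doubling scheme)


def stuff(payload: bytes) -> bytes:
    """Escape every flag or escape byte so neither appears raw in the data field."""
    out = bytearray()
    for b in payload:
        if b in (FLAG, ESC):
            out.append(ESC)
            out.append(b ^ 0x20)
        else:
            out.append(b)
    return bytes(out)


def unstuff(stuffed: bytes) -> bytes:
    """Reverse stuff(): drop each escape byte and restore the byte that follows it."""
    out = bytearray()
    it = iter(stuffed)
    for b in it:
        if b == ESC:
            out.append(next(it) ^ 0x20)
        else:
            out.append(b)
    return bytes(out)


data = bytes([0x41, 0x7E, 0x42, 0x7D])
wire = stuff(data)
```

After stuffing, a raw 0x7E on the wire can only be a frame delimiter, which is the data transparency property the slide asks for.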
153. Byte Stuffing
[Figure: a flag byte pattern in the data to send, and the same flag byte pattern plus a stuffed byte in the transmitted data]
154. PPP Data Control Protocol
Before exchanging network-layer data, data link peers must
configure PPP link (max. frame length, authentication)
learn/configure network layer information
for IP: carry IP Control Protocol (IPCP) msgs (protocol field: 8021) to configure/learn IP address
155. Asynchronous Transfer Mode: ATM
1990’s/00 standard for high-speed (155Mbps to
622 Mbps and higher) Broadband Integrated Service
Digital Network architecture
Goal: integrated, end-to-end transport to carry voice, video, data
meeting timing/QoS requirements of voice, video (versus
Internet best-effort model)
“next generation” telephony: technical roots in telephone
world
packet-switching (fixed length packets, called “cells”)
using virtual circuits
156. ATM architecture
adaptation layer: only at edge of ATM network
data segmentation/reassembly
roughly analogous to Internet transport layer
ATM layer: “network” layer
cell switching, routing
physical layer
157. ATM: network or link layer?
Vision: end-to-end
transport: “ATM from
desktop to desktop”
ATM is a network
technology
Reality: used to connect IP
backbone routers
“IP over ATM”
ATM as switched link
layer, connecting IP
routers
158. ATM Adaptation Layer (AAL)
ATM Adaptation Layer (AAL): “adapts” upper layers (IP
or native ATM applications) to ATM layer below
AAL present only in end systems, not in switches
AAL layer segment (header/trailer fields, data) fragmented
across multiple ATM cells
analogy: TCP segment in many IP packets
159. ATM Adaptation Layer (AAL) [more]
Different versions of AAL layers, depending on ATM service class:
AAL1: for CBR (Constant Bit Rate) services, e.g., circuit emulation
AAL2: for VBR (Variable Bit Rate) services, e.g., MPEG video
AAL5: for data (e.g., IP datagrams)
[Figure: user data encapsulated in an AAL PDU, which is carried in ATM cells]
160. AAL5 - Simple And Efficient AL (SEAL)
AAL5: low overhead AAL used to carry IP
datagrams
4 byte cyclic redundancy check
PAD ensures payload is a multiple of 48 bytes
large AAL5 data unit is fragmented into 48-byte ATM cell payloads
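
The padding rule can be checked with a small calculation. This sketch assumes an 8-byte AAL5 trailer (which contains the 4-byte CRC mentioned above); the trailer size and the function name `aal5_cells` are not stated on the slide.

```python
import math

CELL_PAYLOAD = 48  # bytes of payload per ATM cell
AAL5_TRAILER = 8   # assumption: 8-byte trailer that includes the 4-byte CRC


def aal5_cells(datagram_len):
    """Return (number of cells, pad bytes) needed to carry one datagram."""
    total = datagram_len + AAL5_TRAILER
    cells = math.ceil(total / CELL_PAYLOAD)
    pad = cells * CELL_PAYLOAD - total
    return cells, pad


cells, pad = aal5_cells(1500)  # a 1500-byte IP datagram
```

Under these assumptions a 1500-byte datagram needs 32 cells with 28 bytes of padding, and a 40-byte datagram fits exactly into one cell.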
161. ATM Layer
Service: transport cells across ATM network
analogous to IP network layer
very different services than IP network layer
Network       Service      Guarantees?                                Congestion
Architecture  Model        Bandwidth           Loss  Order  Timing    feedback
Internet      best effort  none                no    no     no        no (inferred via loss)
ATM           CBR          constant rate       yes   yes    yes       no congestion
ATM           VBR          guaranteed rate     yes   yes    yes       no congestion
ATM           ABR          guaranteed minimum  no    yes    no        yes
ATM           UBR          none                no    yes    no        no
162. ATM Layer: Virtual Circuits
VC transport: cells carried on VC from source to dest
call setup, teardown for each call before data can flow
each packet carries VC identifier (not destination ID)
every switch on source-dest path maintains “state” for each passing connection
link, switch resources (bandwidth, buffers) may be allocated to VC to get circuit-like performance
Permanent VCs (PVCs)
long lasting connections
typically: “permanent” route between two IP routers
Switched VCs (SVC):
dynamically set up on per-call basis
163. ATM VCs
Advantages of ATM VC approach:
QoS performance guarantee for connection mapped to
VC (bandwidth, delay, delay jitter)
Drawbacks of ATM VC approach:
Inefficient support of datagram traffic
one PVC between each source/dest pair does not scale (on the order of N**2 connections needed)
SVC introduces call setup latency, processing overhead
for short lived connections
164. ATM Layer: ATM cell
5-byte ATM cell header
48-byte payload
Why?: small payload -> short cell-creation delay for
digitized voice
halfway between 32 and 64 (compromise!)
[Figure: cell format, a 5-byte cell header followed by a 48-byte payload]
165. ATM cell header
VCI: virtual channel ID
will change from link to link thru net
PT: Payload type (e.g. RM cell versus data cell)
CLP: Cell Loss Priority bit
CLP = 1 implies low priority cell, can be discarded if
congestion
HEC: Header Error Checksum
cyclic redundancy check
166. ATM Physical Layer (more)
Two pieces (sublayers) of physical layer:
Transmission Convergence Sublayer (TCS): adapts ATM
layer above to PMD sublayer below
Physical Medium Dependent: depends on physical
medium being used
TCS Functions:
Header checksum generation: 8 bits CRC
Cell delineation
With “unstructured” PMD sublayer, transmission of idle
cells when no data cells to send
168. IP-Over-ATM
Classic IP only
3 “networks” (e.g., LAN segments)
MAC (802.3) and IP addresses
IP over ATM
replace “network” (e.g., LAN segment) with ATM network
ATM addresses, IP addresses
[Figure: Ethernet LANs interconnected directly (classic IP) and through an ATM network (IP over ATM)]
169. IP-Over-ATM
Issues:
IP datagrams into ATM AAL5 PDUs
from IP addresses to ATM addresses
just like IP addresses to 802.3 MAC addresses!
[Figure: Ethernet LANs attached to an ATM network]
170. Datagram Journey in IP-over-ATM Network
at Source Host:
IP layer maps between IP, ATM dest address (using ARP)
passes datagram to AAL5
AAL5 encapsulates data, segments it into cells, passes them to ATM layer
ATM network: moves cell along VC to destination
at Destination Host:
AAL5 reassembles cells into original datagram
if CRC OK, datagram is passed to IP
171. Frame Relay
Like ATM:
wide area network technologies
Virtual-circuit oriented
origins in telephony world
can be used to carry IP datagrams
can thus be viewed as link layers by IP protocol
172. Frame Relay
Designed in late ‘80s, widely deployed in the ‘90s
Frame relay service:
no error control
end-to-end congestion control
173. Frame Relay (more)
Designed to interconnect corporate customer LANs
typically permanent VC’s: “pipe” carrying aggregate traffic
between two routers
switched VC’s: as in ATM
corporate customer leases FR service from public
Frame Relay network (eg, Sprint, ATT)
174. Frame Relay (more)
[Frame format: flags | address | data | CRC | flags]
Flag bits, 01111110, delimit frame
address:
10 bit VC ID field
3 congestion control bits
FECN: forward explicit congestion notification
(frame experienced congestion on path)
BECN: congestion on reverse path
DE: discard eligibility
175. Frame Relay - VC Rate Control
Committed Information Rate (CIR)
defined, “guaranteed” for each VC
negotiated at VC set up time
customer pays based on CIR
DE bit: Discard Eligibility bit
Edge FR switch measures traffic rate for each VC; marks DE
bit
DE = 0: high priority, rate compliant frame; deliver at “all
costs”
DE = 1: low priority, eligible for congestion discard
176. Frame Relay - CIR & Frame Marking
Access Rate: rate R of the access link between source
router (customer) and edge FR switch (provider);
64Kbps < R < 1,544Kbps
Typically, many VCs (one per destination router)
multiplexed on the same access trunk; each VC has own
CIR
Edge FR switch measures traffic rate for each VC; it
marks (ie DE = 1) frames which exceed CIR (these may
be later dropped)
Internet’s more recent differentiated service uses similar
ideas
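
The marking rule can be sketched as follows. This is a simplification under stated assumptions: the edge switch is taken to count bits over one fixed measurement interval, and the function `mark_frames` and its parameters are hypothetical, not part of any Frame Relay API.

```python
def mark_frames(frame_bits, cir_bps, interval_s):
    """Return the DE bit for each frame arriving within one measurement interval.

    Frames are marked DE = 0 while the VC stays within CIR * interval bits,
    and DE = 1 once that committed amount is exceeded.
    """
    committed = cir_bps * interval_s  # bits the VC may send at high priority
    sent = 0
    de_bits = []
    for bits in frame_bits:
        sent += bits
        de_bits.append(0 if sent <= committed else 1)
    return de_bits


# Four 8000-bit frames in one second on a VC with CIR = 20 kbps.
de = mark_frames([8000, 8000, 8000, 8000], cir_bps=20000, interval_s=1.0)
```

Here the first two frames stay within the committed rate and the last two are marked as eligible for discard if congestion occurs downstream.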
177. Summary
principles behind data link layer services:
error detection, correction
sharing a broadcast channel: multiple access
link layer addressing, ARP
link layer technologies: Ethernet, hubs,
bridges, switches,IEEE 802.11 LANs, PPP,
ATM, Frame Relay
journey down the protocol stack now OVER!
next stops: multimedia, security, network
management
179. Network Layer
Goals:
understand principles
behind network layer
services:
routing (path selection)
dealing with scale
how a router works
advanced topics: IPv6,
multicast
instantiation and
implementation in the
Internet
Overview:
network layer services
routing principle: path
selection
hierarchical routing
IP
Internet routing protocols
reliable transfer
intra-domain
inter-domain
what’s inside a router?
IPv6
multicast routing
180. Network layer functions
transport packet from sending to
receiving hosts
network layer protocols in every
host, router
three important functions:
path determination: route taken
by packets from source to dest.
Routing algorithms
switching: move packets from
router’s input to appropriate
router output
call setup: some network
architectures require router call
setup along path before data flows
[Figure: protocol stacks along a path between two hosts; the hosts run application, transport, network, data link, and physical layers, while each router runs only network, data link, and physical layers]
181. Network service model
Q: What service model for
“channel” transporting
packets from sender to
receiver?
guaranteed bandwidth?
preservation of inter-packet
timing (no jitter)?
loss-free delivery?
in-order delivery?
congestion feedback to
sender?
The most important
abstraction provided
by network layer:
?? virtual circuit
or
datagram?
?
182. Virtual circuits
“source-to-dest path behaves much like telephone circuit”
performance-wise
network actions along source-to-dest path
call setup, teardown for each call before data can flow
each packet carries VC identifier (not destination host ID)
every router on source-dest path maintains “state” for each passing
connection
(in contrast, transport-layer connection only involved two end systems)
link, router resources (bandwidth, buffers) may be allocated to VC
to get circuit-like performance
183. Virtual circuits: signaling protocols
used to set up, maintain, and tear down VC
used in ATM, frame-relay, X.25
not used in today’s Internet
[Figure: call setup between two end systems, each with application, transport, network, data link, and physical layers]
1. Initiate call
2. Incoming call
3. Accept call
4. Call connected
5. Data flow begins
6. Receive data
184. Datagram networks: the Internet model
no call setup at network layer
routers: no state about end-to-end connections
no network-level concept of “connection”
packets typically routed using destination host ID
packets between same source-dest pair may take
different paths
[Figure: two end systems, each with application, transport, network, data link, and physical layers, exchanging datagrams with no call setup]
1. Send data
2. Receive data
185. Network layer service models:
Network       Service      Guarantees?                                Congestion
Architecture  Model        Bandwidth           Loss  Order  Timing    feedback
Internet      best effort  none                no    no     no        no (inferred via loss)
ATM           CBR          constant rate       yes   yes    yes       no congestion
ATM           VBR          guaranteed rate     yes   yes    yes       no congestion
ATM           ABR          guaranteed minimum  no    yes    no        yes
ATM           UBR          none                no    yes    no        no
• Internet model being extended: Intserv, Diffserv
– Chapter 6
186. Datagram or VC network: why?
Internet
data exchange among computers
“elastic” service, no strict timing
req.
“smart” end systems (computers)
can adapt, perform control, error
recovery
simple inside network, complexity
at “edge”
easier to connect many link types
different characteristics
uniform service difficult
ATM
evolved from telephony
human conversation:
strict timing, reliability
requirements
need for guaranteed
service
“dumb” end systems
telephones
complexity inside
network
187. Routing
Routing protocol
Goal: determine “good” path
(sequence of routers) thru
network from source to dest.
Graph abstraction for
routing algorithms:
graph nodes are routers
graph edges are
physical links
link cost: delay, $ cost,
or congestion level
[Figure: graph of six routers A, B, C, D, E, F joined by ten links with costs 1, 2, 3, and 5]
• “good” path:
– typically means minimum
cost path
– other definitions possible
188. Routing Algorithm classification
Global or decentralized
information?
Global:
all routers have complete
topology, link cost info
“link state” algorithms
Decentralized:
router knows physically-connected
neighbors, link
costs to neighbors
iterative process of
computation, exchange of info
with neighbors
“distance vector” algorithms
Static or dynamic?
Static:
routes change slowly
over time (usually by
humans)
Dynamic:
routes change more
quickly/automatically
periodic update
in response to link cost
changes
189. A Link-State Routing Algorithm
Dijkstra’s algorithm
net topology, link costs known
to all nodes
accomplished via “link state
broadcast”
all nodes have same info
computes least cost paths from one node (“source”) to all other nodes
gives routing table for that
node
iterative: after k iterations,
know least cost path to k
destinations
Notation:
c(i,j): link cost from node i
to j. cost infinite if not
direct neighbors
D(v): current value of cost of path from source to dest. v
p(v): predecessor node along path from source to v, i.e., the node just before v on that path
N: set of nodes whose least
cost path definitively
known
190. Dijkstra’s Algorithm
Initialization:
  N = {A}
  for all nodes v
    if v adjacent to A
      then D(v) = c(A,v)
      else D(v) = infinity

Loop
  find w not in N such that D(w) is a minimum
  add w to N
  update D(v) for all v adjacent to w and not in N:
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N
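
The pseudocode above can be turned into a short runnable sketch. Two assumptions are made here: the minimum is found with a binary heap instead of the linear scan in the pseudocode, and the edge costs are an assumed assignment of the ten link costs to the six-node graph, chosen to be consistent with the worked example table.

```python
import heapq


def dijkstra(graph, source):
    """Return (dist, pred): least cost and predecessor for every reachable node."""
    dist = {v: float("inf") for v in graph}
    pred = {}
    dist[source] = 0
    done = set()            # the set N of nodes whose least cost is final
    heap = [(0, source)]
    while heap:
        d, w = heapq.heappop(heap)
        if w in done:
            continue        # stale heap entry
        done.add(w)
        for v, cost in graph[w].items():
            if v not in done and d + cost < dist[v]:
                dist[v] = d + cost   # D(v) = min(D(v), D(w) + c(w,v))
                pred[v] = w
                heapq.heappush(heap, (dist[v], v))
    return dist, pred


# Assumed undirected link costs for the example graph.
edges = [("A", "B", 2), ("A", "C", 5), ("A", "D", 1), ("B", "C", 3), ("B", "D", 2),
         ("C", "D", 3), ("C", "E", 1), ("C", "F", 5), ("D", "E", 1), ("E", "F", 2)]
graph = {n: {} for n in "ABCDEF"}
for u, v, c in edges:
    graph[u][v] = c
    graph[v][u] = c

dist, pred = dijkstra(graph, "A")
```

With these costs the final distances from A agree with the last entries in each column of the example table (for instance C is reached at cost 3 via E, and F at cost 4 via E).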
191. Dijkstra’s algorithm: example
Step  N        D(B),p(B)  D(C),p(C)  D(D),p(D)  D(E),p(E)  D(F),p(F)
0     A        2,A        5,A        1,A        infinity   infinity
1     AD       2,A        4,D                   2,D        infinity
2     ADE      2,A        3,E                              4,E
3     ADEB                3,E                              4,E
4     ADEBC                                                4,E
5     ADEBCF
[Figure: graph of six routers A, B, C, D, E, F joined by ten links with costs 1, 2, 3, and 5]
192. Dijkstra’s algorithm, discussion
Algorithm complexity: n nodes
each iteration: need to check all nodes, w, not in N
n*(n+1)/2 comparisons: O(n**2)
more efficient implementations possible: O(nlogn)
Oscillations possible:
e.g., Suppose link cost = amount of carried traffic (note: c(i,j) !=
c(j,i))
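[Figure: four nodes A, B, C, D in a ring, with traffic of 1, e, and 1 entering the network and destined for A; shown initially and after three successive routing recomputations, the link loads (0, e, 1, 1+e, 2+e) flip between the clockwise and counterclockwise paths each time routing is recomputed]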
194. Networks: Routing
Distance Vector Routing
Historically known as the old ARPANET routing
algorithm {or known as Bellman-Ford algorithm}.
Basic idea: each network node maintains a Distance
Vector table containing the distance between itself
and ALL possible destination nodes.
Distances are based on a chosen metric and are
computed using information from the neighbors’
distance vectors.
Metric: usually hops or delay
195. Distance Vector Routing
Information kept by DV router
1. each router has an ID
2. associated with each link connected to a router, there is a link cost (static or dynamic): the metric issue!
Distance Vector Table Initialization
Distance to itself = 0
Distance to ALL other routers = infinity
196. Distance Vector Algorithm [Perlman]
1. Router transmits its distance vector to each of
its neighbors.
2. Each router receives and saves the most recently
received distance vector from each of its
neighbors.
3. A router recalculates its distance vector when:
a. It receives a distance vector from a neighbor
containing different information than before.
b. It discovers that a link to a neighbor has gone down
(i.e., a topology change).
The DV calculation is based on minimizing the
cost to each destination.
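
The recalculation step can be sketched as a single Bellman-Ford update. The function name, the dictionary layout, and the small three-neighbor example are all illustrative assumptions, not taken from the slides.

```python
INF = float("inf")


def recompute(link_cost, neighbor_vectors, destinations, self_id):
    """Recompute this router's distance vector from its neighbors' saved vectors.

    For each destination: D(dest) = min over neighbors n of c(self, n) + D_n(dest).
    """
    vector = {}
    for dest in destinations:
        if dest == self_id:
            vector[dest] = 0
            continue
        vector[dest] = min(
            (link_cost[n] + neighbor_vectors[n].get(dest, INF) for n in link_cost),
            default=INF,
        )
    return vector


# Router A has links to B (cost 1) and C (cost 4) and has saved their vectors.
link_cost = {"B": 1, "C": 4}
neighbor_vectors = {
    "B": {"A": 1, "B": 0, "C": 2},
    "C": {"A": 4, "B": 2, "C": 0, "D": 1},
}
vec = recompute(link_cost, neighbor_vectors, ["A", "B", "C", "D"], "A")
```

In this example A learns that C is cheaper through B (cost 3) than over the direct link (cost 4), and that D is reachable only through C.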
197. Distance Vector Routing
Figure 5-9.(a) A subnet. (b) Input from A, I, H, K, and
the new routing table for J.
198. Routing Information Protocol (RIP)
RIP had widespread use because it was
distributed with BSD Unix in “routed”, a
router management daemon.
RIP is the most used Distance Vector
protocol.
RFC1058 in June 1988.
Sends packets every 30 seconds or faster.
Runs over UDP.
Metric = hop count
BIG problem is max. hop count = 15 (a metric of 16 means unreachable)
RIP limited to running on small networks!!
Upgraded to RIPv2
199. Link State Algorithm
1. Each router is responsible for meeting its
neighbors and learning their names.
2. Each router constructs a link state packet (LSP)
which consists of a list of names and cost to reach
each of its neighbors.
3. The LSP is transmitted to ALL other routers.
Each router stores the most recently generated
LSP from each other router.
4. Each router uses complete information on the
network topology to compute the shortest path
route to each destination node.
200. Open Shortest Path First (OSPF)
OSPF runs on top of IP, i.e., an OSPF packet is
transmitted with IP data packet header.
Uses Level 1 and Level 2 routers
Has: backbone routers, area border routers,
and AS boundary routers
LSPs referred to as LSAs (Link State
Advertisements)
Complex algorithm due to five distinct LSA
types.
202. OSPF
Figure 5-65.The relation between ASes,
backbones, and areas in OSPF.
203. Border Gateway Protocol (BGP)
The replacement for EGP is BGP. Current version
is BGP-4.
BGP assumes the Internet is an arbitrary
interconnected set of AS’s.
In interdomain routing the goal is to find ANY
path to the intended destination that is loop-free.
The protocols are more concerned with
reachability than optimality.
205. Transport services and protocols
provide logical
communication between
app processes running on
different hosts
transport protocols run in
end systems
send side: breaks app
messages into segments,
passes to network layer
rcv side: reassembles
segments into messages,
passes to app layer
more than one transport
protocol available to apps
Internet: TCP and UDP
[Figure: two end systems with full protocol stacks (application, transport, network, data link, physical) connected through routers that run only network, data link, and physical layers; the transport layer provides logical end-to-end communication]
206. Transport vs. network layer
network layer: logical
communication
between hosts
transport layer:
logical
communication
between processes
relies on, enhances,
network layer services
Household analogy:
12 kids sending letters
to 12 kids
processes = kids
app messages =
letters in envelopes
hosts = houses
transport protocol =
Ann and Bill
network-layer
protocol = postal
service
Another analogy:
1. Post office -> network layer
2. My wife -> transport layer
207. Internet transport-layer protocols
reliable, in-order
delivery (TCP)
congestion control
(distributed control)
flow control
connection setup
unreliable, unordered
delivery: UDP
no-frills extension of
“best-effort” IP
services not available:
delay guarantees
bandwidth guarantees
[Figure: two end systems with full protocol stacks (application, transport, network, data link, physical) connected through routers that run only network, data link, and physical layers]
Research issues
208. Multiplexing/demultiplexing
Demultiplexing at rcv host: delivering received segments to correct socket
Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
[Figure: host 1, host 2, and host 3, each with application, transport, network, link, and physical layers; processes P1 to P4 attach to the transport layer through sockets; example applications are FTP and telnet]
209. How demultiplexing works
host receives IP datagrams
each datagram has source IP
address, destination IP address
each datagram carries 1 transport-layer
segment
each segment has source,
destination port number
(recall: well-known port numbers
for specific applications)
host uses IP addresses & port numbers
to direct segment to appropriate socket
TCP/UDP segment format (each row 32 bits wide):
source port # | dest port #
other header fields
application data (message)
210. Connectionless demultiplexing
Create sockets with port
numbers:
// port numbers must fit in 16 bits (0 to 65535)
DatagramSocket mySocket1 = new DatagramSocket(9111);
DatagramSocket mySocket2 = new DatagramSocket(9222);
UDP socket identified by
two-tuple:
(dest IP address, dest port number)
When host receives UDP
segment:
checks destination port
number in segment
directs UDP segment to
socket with that port
number
IP datagrams with
different source IP
addresses and/or source
port numbers directed to
same socket (this is how a
system can serve multiple
requests!!)
211. Connectionless demux (cont)
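DatagramSocket serverSocket = new DatagramSocket(6428);
[Figure: clients at IP A and IP B each exchange UDP segments with a server at IP C whose socket is bound to port 6428]
one client to server: SP: 9157, DP: 6428; server reply: SP: 6428, DP: 9157
other client to server: SP: 5775, DP: 6428; server reply: SP: 6428, DP: 5775
Demultiplexing is based on destination IP and port #
SP provides “return address”
Source IP and port # can be spoofed!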
212. Connection-oriented demux
TCP socket identified
by 4-tuple:
source IP address
source port number
dest IP address
dest port number
recv host uses all four
values to direct
segment to
appropriate socket
Server host may
support many
simultaneous TCP
sockets:
each socket identified by
its own 4-tuple
Web servers have
different sockets for
each connecting client
non-persistent HTTP will
have different socket for
each request
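
The difference between the UDP two-tuple and the TCP four-tuple can be sketched with two lookup tables. The host names, the socket labels, and TCP port 80 are illustrative assumptions; a real stack also handles listening sockets and wildcard addresses.

```python
udp_sockets = {}  # (dest IP, dest port) -> socket
tcp_sockets = {}  # (source IP, source port, dest IP, dest port) -> socket


def demux_udp(src_ip, src_port, dst_ip, dst_port):
    # Source fields are ignored: every sender reaches the same socket.
    return udp_sockets.get((dst_ip, dst_port))


def demux_tcp(src_ip, src_port, dst_ip, dst_port):
    # All four values select the socket, so each client gets its own.
    return tcp_sockets.get((src_ip, src_port, dst_ip, dst_port))


udp_sockets[("C", 6428)] = "udp-server"
tcp_sockets[("A", 9157, "C", 80)] = "conn-A"
tcp_sockets[("B", 5775, "C", 80)] = "conn-B"
```

Segments from hosts A and B to UDP port 6428 land on the same socket, while their TCP segments to port 80 are delivered to two different connection sockets.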
214. UDP: User Datagram Protocol [RFC 768]
“no frills,” “bare bones”
Internet transport
protocol
“best effort” service, UDP
segments may be:
lost
delivered out of order
to app
connectionless:
no handshaking
between UDP sender,
receiver
each UDP segment
handled independently
of others
Why is there a UDP?
no connection
establishment (which can
add delay)
simple: no connection
state at sender, receiver
small segment header
no congestion control:
UDP can blast away as
fast as desired
215. UDP: more
often used for streaming
multimedia apps
loss tolerant
rate sensitive
other UDP uses
DNS
SNMP
reliable transfer over
UDP: add reliability at
application layer
application-specific
error recovery! (e.g,
FTP based on UDP but
with recovery)
UDP segment format (each row 32 bits wide):
source port # | dest port #
length | checksum
application data (message)
Length: in bytes, of UDP segment, including header
When the network is
stressed, you PRAY!
216. UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
treat segment contents as sequence of 16-bit integers
checksum: addition (1’s complement sum) of segment contents
sender puts checksum value into UDP checksum field
Receiver:
compute checksum of received segment
check if computed checksum equals checksum field value:
NO - error detected
YES - no error detected. But maybe errors nonetheless? More later ….
e.g.: 1+2+3 = 6, but so is 0+3+3 = 6
217. Internet Checksum Example
Note
When adding numbers, a carryout from the most significant bit needs to be added to the result
Example: add two 16-bit integers
              1110011001100110
            + 1101010101010101
            ------------------
wraparound  1 1011101110111011
sum           1011101110111100
checksum      0100010001000011
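
The same arithmetic can be sketched in code, using the two 16-bit words from the example. The function names are illustrative; the checksum here is taken as the bitwise complement of the wrapped sum, as in the example.

```python
def ones_complement_sum(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into bit 0."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(words):
    """Checksum = bitwise complement of the 1's complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF


words = [0b1110011001100110, 0b1101010101010101]
checksum = internet_checksum(words)

# Receiver check: summing the data words plus the checksum gives all ones.
receiver_sum = ones_complement_sum(words + [checksum])
```

If any single bit is flipped in transit, the receiver's sum is no longer all ones, which is how the error is detected.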
218. Principles of Reliable data transfer
important in app., transport, link layers
top-10 list of important networking topics!
abstraction
characteristics of unreliable channel will determine complexity of reliable data
transfer protocol (rdt)!!!!!!!!
This picture sets the scenario
219. Reliable data transfer: getting started
send side:
rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
rdt_rcv(): called when packet arrives on rcv-side of channel
deliver_data(): called by rdt to deliver data to upper layer
** Let us now look at the guts of these modules. Any questions?
220. (DON’T FALL ASLEEP!!!)
Reliable data transfer: getting started
We’ll:
incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
consider only unidirectional data transfer
but control info will flow in both directions!
use finite state machines (FSM) to specify sender, receiver
state: when in this “state”, next state uniquely determined by next event
[Figure: state 1 and state 2 joined by an arrow; above the arrow's line is the event causing the state transition, below it are the actions taken on the state transition]
Event: timer, receives message, etc.
Action: executes a program, sends message, etc.
221. Rdt1.0: reliable transfer over a reliable channel
underlying channel perfectly reliable
no bit errors
no loss of packets
In reality, this is an unrealistic assumption, but..
separate FSMs for sender, receiver:
sender sends data into underlying channel
receiver reads data from underlying channel
sender FSM
state: Wait for call from above
  rdt_send(data) -> packet = make_pkt(data); udt_send(packet)
receiver FSM
state: Wait for call from below
  rdt_rcv(packet) -> extract(packet, data); deliver_data(data)
222. Rdt2.0: channel with bit errors
underlying channel may flip bits in packet
recall: UDP checksum to detect bit errors
the question: how to recover from errors:
acknowledgements (ACKs): receiver explicitly tells
sender that pkt received OK
negative acknowledgements (NAKs): receiver
explicitly tells sender that pkt had errors
sender retransmits pkt on receipt of NAK
human scenarios using ACKs, NAKs?
new mechanisms in rdt2.0 (beyond rdt1.0):
error detection
receiver feedback: control msgs (ACK, NAK) rcvr->sender
Ack: I love u, I love u 2.
Nak: I love u, I don’t love u
223. rdt2.0: FSM specification
sender FSM
state: Wait for call from above
  rdt_send(data) -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to Wait for ACK or NAK
state: Wait for ACK or NAK
  rdt_rcv(rcvpkt) && isNAK(rcvpkt) -> udt_send(sndpkt); stay in Wait for ACK or NAK
  rdt_rcv(rcvpkt) && isACK(rcvpkt) -> Λ (no action); go to Wait for call from above
receiver FSM
state: Wait for call from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> udt_send(NAK)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
Note: while the sender waits for ACK or NAK, a buffer is needed to store data from the application layer, or the rdt_send() call must block.
224. rdt2.0: operation with no errors
[Figure: the rdt2.0 sender and receiver FSMs, tracing the path taken when there are no errors: rdt_send(data) leads the sender to make and send sndpkt; the receiver finds notcorrupt(rcvpkt), extracts and delivers the data, and sends ACK; on isACK(rcvpkt) the sender takes no action (Λ) and returns to “Wait for call from above”]
225. rdt2.0: error scenario
[Figure: the rdt2.0 sender and receiver FSMs, tracing the error path: the sender sends sndpkt; the receiver finds corrupt(rcvpkt) and sends NAK; on isNAK(rcvpkt) the sender retransmits with udt_send(sndpkt) and stays in “Wait for ACK or NAK”]
GOT IT?
226. rdt2.0 has a fatal flaw!
What happens if
ACK/NAK corrupted?
sender doesn’t know what
happened at receiver!
can’t just retransmit:
possible duplicate
What to do?
sender ACKs/NAKs
receiver’s ACK/NAK? What
if sender ACK/NAK lost?
retransmit, but this might
cause retransmission of
correctly received pkt!
Handling duplicates:
sender adds sequence
number to each pkt
sender retransmits
current pkt if ACK/NAK
garbled
receiver discards (doesn’t
deliver up) duplicate pkt
stop and wait protocol
Sender sends one packet,
then waits for receiver
response
227. rdt2.1: sender, handles garbled ACK/NAKs
state: Wait for call 0 from above
  rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to Wait for ACK or NAK 0
state: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) -> udt_send(sndpkt); stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) -> Λ (no action); go to Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data) -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to Wait for ACK or NAK 1
state: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) -> udt_send(sndpkt); stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) -> Λ (no action); go to Wait for call 0 from above
THE FSM GETS MESSY!!!
229. rdt2.1: discussion
Sender:
seq # added to pkt
two seq. #’s (0,1) will
suffice. Why?
must check if received
ACK/NAK corrupted
twice as many states
state must “remember”
whether “current” pkt
has 0 or 1 seq. #
Receiver:
must check if received
packet is duplicate
state indicates whether 0
or 1 is expected pkt seq #
note: receiver can not
know if its last
ACK/NAK received OK
at sender
230. rdt2.2: a NAK-free protocol
same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt
received OK
receiver must explicitly include seq # of pkt being ACKed
duplicate ACK at sender results in same action as
NAK: retransmit current pkt
This is important because TCP uses this approach (no NAK).
231. rdt2.2: sender, receiver fragments
sender FSM fragment
state: Wait for call 0 from above
  rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to Wait for ACK 0
state: Wait for ACK 0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ) -> udt_send(sndpkt); stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0) -> Λ (no action); go to next state
receiver FSM fragment
transition into state Wait for 0 from below:
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
state: Wait for 0 from below
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || has_seq1(rcvpkt) ) -> udt_send(sndpkt); stay (resends the last ACK)
232. rdt3.0: channels with errors and loss
New assumption:
underlying channel
can also lose packets
(data or ACKs)
checksum, seq. #, ACKs,
retransmissions will be
of help, but not enough
Q: how to deal with
loss?
sender waits until
certain data or ACK lost,
then retransmits
yuck: drawbacks?
Approach: sender waits
“reasonable” amount of
time for ACK
retransmits if no ACK received
in this time
if pkt (or ACK) just delayed (not
lost):
retransmission will be
duplicate, but use of seq. #’s
already handles this
receiver must specify seq # of
pkt being ACKed
requires countdown timer
What is the “right value” for the timer? It depends on the flow and on network conditions!
233. rdt3.0 sender
rdt_send(data)
sndpkt = make_pkt(0, data, checksum)
udt_send(sndpkt)
start_timer
Wait
for
ACK0
rdt_rcv(rcvpkt) &&
( corrupt(rcvpkt) ||
isACK(rcvpkt,1) )
Λ
timeout
rdt_rcv(rcvpkt)
&& notcorrupt(rcvpkt)
&& isACK(rcvpkt,0)
Wait for
call 1 from
above
rdt_send(data)
sndpkt = make_pkt(1, data, checksum)
udt_send(sndpkt)
start_timer
rdt_rcv(rcvpkt)
rdt_rcv(rcvpkt)
&& notcorrupt(rcvpkt)
&& isACK(rcvpkt,1)
rdt_rcv(rcvpkt) &&
( corrupt(rcvpkt) ||
isACK(rcvpkt,0) )
stop_timer
stop_timer
udt_send(sndpkt)
start_timer
timeout
udt_send(sndpkt)
start_timer
Wait for
call 0 from
above
Wait
for
ACK1
rdt_rcv(rcvpkt)
Λ
Λ
Λ
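The rdt3.0 behavior (timeout, retransmit, duplicate detection by seq #) can be sketched as a small deterministic simulation. This is only an illustration under stated assumptions: the function name and the `losses` map (transmission index to "data" or "ack") are made up here, and a timeout is modeled simply as "no ACK came back for this transmission":

```python
def stop_and_wait(messages, losses):
    """Simulate an alternating-bit sender and receiver over a lossy channel.

    losses maps the 0-based index of a sender transmission to "data"
    (packet lost) or "ack" (ACK lost). Hypothetical helper for illustration.
    """
    seq, expected = 0, 0
    delivered = []
    transmissions = 0
    for data in messages:
        acked = False
        while not acked:
            fate = losses.get(transmissions)
            transmissions += 1
            if fate == "data":
                continue              # packet lost: timer expires, retransmit
            if seq == expected:       # receiver sees a new packet
                delivered.append(data)
                expected = 1 - expected
            # otherwise it is a duplicate: receiver re-ACKs, does not deliver
            if fate == "ack":
                continue              # ACK lost: timer expires, retransmit
            acked = True
        seq = 1 - seq                 # move on to the other seq #
    return delivered, transmissions


print(stop_and_wait(["a", "b", "c"], {1: "data", 3: "ack"}))
```

In the run above, the lost ACK for "c" causes a retransmission that the receiver recognizes as a duplicate, so each message is delivered exactly once.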
236. Performance of rdt3.0
rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB
packet:
$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \text{microseconds}$
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008\ \text{ms}}{30.008\ \text{ms}} = 0.00027$
U sender: utilization – fraction of time sender busy sending
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps
link
network protocol limits use of physical resources!
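The numbers on this slide can be checked with a few lines of Python. This is a sketch only; the function name is illustrative, and 1 KB is taken as 8000 bits, as on the slide:

```python
def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Return (transmit time, sender utilization, throughput in bytes/sec)."""
    t_transmit = packet_bits / rate_bps                 # L / R
    cycle = rtt_s + t_transmit                          # RTT + L / R
    utilization = t_transmit / cycle
    throughput = (packet_bits / 8) / cycle              # one packet per cycle
    return t_transmit, utilization, throughput


# 1 Gbps link, 15 ms one-way propagation delay (RTT = 30 ms), 8000-bit packet
t, u, thr = stop_and_wait_utilization(8000, 1e9, 0.030)
print(f"T_transmit = {t * 1e6:.0f} microseconds")
print(f"U_sender   = {u:.5f}")
print(f"throughput = {thr / 1000:.0f} kB/sec")
```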
237. rdt3.0: stop-and-wait operation
sender receiver
first packet bit transmitted, t = 0
last packet bit transmitted, t = L / R
RTT
first packet bit arrives
last packet bit arrives, send ACK
ACK arrives, send next
packet, t = RTT + L / R
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008\ \text{ms}}{30.008\ \text{ms}} = 0.00027$
238. Pipelined protocols
Pipelining: sender allows multiple, “in-flight”,
yet-to-be-acknowledged pkts
range of sequence numbers must be increased
buffering at sender and/or receiver
Two generic forms of pipelined protocols:
Go-Back-N, selective repeat
239. Pipelining: increased utilization
first packet bit transmitted, t = 0
sender receiver
last bit transmitted, t = L / R
RTT
first packet bit arrives
last packet bit arrives, send ACK
ACK arrives, send next
packet, t = RTT + L / R
last bit of 2nd packet arrives, send ACK
last bit of 3rd packet arrives, send ACK
$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024\ \text{ms}}{30.008\ \text{ms}} = 0.0008$
Increase utilization
by a factor of 3!
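A short Python sketch of how utilization scales with the number of in-flight packets, using the same 8 microsecond transmit time and 30 ms RTT as the example. The function names are illustrative, and times are kept as integer microseconds so the ceiling division is exact:

```python
def pipelined_utilization(window, t_transmit_us, rtt_us):
    """Fraction of time the sender is busy with `window` packets in flight."""
    return min(1.0, window * t_transmit_us / (rtt_us + t_transmit_us))


def window_for_full_utilization(t_transmit_us, rtt_us):
    """Smallest window that keeps the sender busy all the time."""
    cycle = rtt_us + t_transmit_us
    return -(-cycle // t_transmit_us)   # integer ceiling division


print(round(pipelined_utilization(3, 8, 30000), 4))   # 3 packets in flight
print(window_for_full_utilization(8, 30000))          # packets needed to fill the pipe
```

With these numbers the sender would need thousands of packets in flight to keep the link busy, which is why the window size matters so much on long, fast links.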
240. DON’T FALL ASLEEP !!!!!
Go-Back-N (sliding window protocol)
Sender:
(For now, treat seq # as unlimited)
k-bit seq # in pkt header
“window” of up to N, consecutive unack’ed pkts allowed
ACK(n): ACKs all pkts up to, including seq # n - “cumulative
ACK”
Sender may receive duplicate ACKs (see receiver)
timer for each in-flight pkt
timeout(n): retransmit pkt n and all higher seq # pkts in window
Q: what happens when the receiver is totally disconnected? A: the sender gives up after a maximum number of retries (MAX RETRY).
241. GBN: sender extended FSM
L Buffer data or block higher app.
Wait
timeout
start_timer
udt_send(sndpkt[base])
udt_send(sndpkt[base+1])
…
udt_send(sndpkt[nextseqnum-1])
rdt_send(data)
if (nextseqnum < base+N) {
sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
udt_send(sndpkt[nextseqnum])
if (base == nextseqnum)
start_timer
nextseqnum++
}
else
refuse_data(data)
rdt_rcv(rcvpkt) &&
notcorrupt(rcvpkt)
base = getacknum(rcvpkt)+1
if (base == nextseqnum)
stop_timer
else
start_timer
base=1
nextseqnum=1
rdt_rcv(rcvpkt)
&& corrupt(rcvpkt)
No pkt in pipe
Reset timer
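The sender FSM above can be sketched in Python roughly as follows. This is an illustration, not a full implementation: the class name, the `sent` log standing in for udt_send, and the boolean `timer_running` flag are assumptions, seq #s are treated as unlimited, and the `max()` guard against stale ACKs is a small defensive addition that is not in the FSM:

```python
class GBNSender:
    """Go-Back-N sender sketch with a single timer for the oldest unACKed pkt."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq # -> data kept for retransmission
        self.timer_running = False
        self.sent = []              # log of seq #s handed to the channel

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False            # window full: refuse_data
        self.sndpkt[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        # Cumulative ACK: everything up to and including acknum is confirmed.
        self.base = max(self.base, acknum + 1)
        for seq in [s for s in self.sndpkt if s < self.base]:
            del self.sndpkt[seq]
        # Stop the timer if nothing is in flight, otherwise restart it.
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        # Go back N: resend every packet from base to nextseqnum-1.
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)


s = GBNSender(window_size=4)
print([s.rdt_send(c) for c in "abcde"])   # fifth call is refused: window full
s.on_ack(2)
s.on_timeout()
print(s.sent)
```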
242. GBN: receiver extended FSM
default
udt_send(sndpkt)
Wait
rdt_rcv(rcvpkt)
&& notcorrupt(rcvpkt)
&& hasseqnum(rcvpkt,expectedseqnum)
extract(rcvpkt,data)
deliver_data(data)
sndpkt = make_pkt(expectedseqnum,ACK,chksum)
udt_send(sndpkt)
expectedseqnum++
expectedseqnum=1
sndpkt =
make_pkt(expectedseqnum,ACK,chksum)
ACK-only: always send ACK for correctly-received
pkt with highest in-order seq #
may generate duplicate ACKs
need only remember expectedseqnum
out-of-order pkt:
discard (don’t buffer) -> no receiver buffering!
Re-ACK pkt with highest in-order seq #
Λ
If in order pkt is
received, deliver
to app and ack!
Else, just drop it!
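A minimal Python sketch of the receiver rule above (deliver and ACK if in order, otherwise drop and re-ACK). The class name is illustrative, and returning ACK 0 before any packet has arrived is an assumption of this sketch:

```python
class GBNReceiver:
    """Go-Back-N receiver sketch: no buffering, cumulative ACKs only."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)      # in order: deliver to the app
            self.expectedseqnum += 1
        # Otherwise the packet is dropped. Either way, ACK the highest
        # in-order seq # received so far (a duplicate ACK if nothing changed).
        return self.expectedseqnum - 1


r = GBNReceiver()
print(r.rdt_rcv(1, "a"))   # in order -> ACK 1
print(r.rdt_rcv(3, "c"))   # out of order -> dropped, duplicate ACK 1
print(r.rdt_rcv(2, "b"))   # in order -> ACK 2
```

The only state the receiver keeps is expectedseqnum, which is what makes GBN simple on the receiving side and wasteful when a single packet is lost.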
243. GBN in action
Window size=N=4
What determines the
size of the window?
1. RTT
2. Buffer at the
receiver(flow
control)
3. Network congestion
Q: GBN can perform poorly. Why?
Sender sends pkt 1,2,3,4,5,6,7,8,9,...
If pkt 1 is lost, the receiver still gets pkt 2,3,4,5,... but discards them all, so the sender must resend them!
244. Selective Repeat (improvement of the GBN Protocol)
receiver individually acknowledges all
correctly received pkts
buffers pkts, as needed, for eventual in-order
delivery to upper layer
E.g., sender: pkt 1,2,3,4,….,10; receiver got
2,4,6,8,10. Sender resends 1,3,5,7,9.
sender only resends pkts for which ACK not
received
sender timer for EACH unACKed pkt
sender window
N consecutive seq #’s
again limits seq #s of sent, unACKed pkts
245. Selective repeat: sender, receiver windows
Q: why do we have this?
The ACK was lost, or the ACK
is still on its way
246. Selective repeat
sender
data from above :
if next available seq # in
window, send pkt
timeout(n) for pkt n:
resend pkt n, restart timer
ACK(n) in
[sendbase,sendbase+N-1]:
mark pkt n as received
if n smallest unACKed pkt,
advance window base to
next unACKed seq #
receiver
pkt n in [rcvbase, rcvbase+N-1]
send ACK(n)
out-of-order: buffer
in-order: deliver (also
deliver buffered, in-order
pkts), advance window to
next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
ACK(n)
otherwise:
ignore
(slide the window)
Q: why do we need this?
The ACK got lost.
The sender may
time out and resend the pkt,
so we need to ACK it again
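The receiver rules above can be sketched in Python as follows. This is an illustration under stated assumptions: the class name is made up, seq #s are treated as unlimited integers (no wraparound), and `None` stands for "ignore":

```python
class SRReceiver:
    """Selective repeat receiver sketch: buffers out-of-order packets."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}       # seq # -> data waiting for in-order delivery
        self.delivered = []

    def on_packet(self, n, data):
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver any run of in-order packets and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n           # send ACK(n)
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n           # already delivered: ACK again, the old ACK may be lost
        return None            # otherwise: ignore


r = SRReceiver(window_size=4)
print(r.on_packet(1, "b"), r.delivered)   # buffered, nothing delivered yet
print(r.on_packet(0, "a"), r.delivered)   # gap filled, both delivered
```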
247. Selective repeat in action (N=4)
Under GBN, this
pkt will be
dropped.
248. Selective repeat: dilemma
In real life, we use k bits to
implement the seq #. Practical issue:
Example:
seq #’s: 0, 1, 2, 3
window size (N)=3
receiver sees no
difference in two
scenarios!
incorrectly passes
duplicate data as new in
(a)
Q: what relationship
between seq # size and
window size?
A: N <= 2^k/2 (the window can be at most half the seq # space)
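One way to see the constraint is to compare the seq #s the sender may still retransmit with the seq #s the receiver now treats as new, after the receiver has accepted a full window. The Python sketch below does that; the function names are illustrative:

```python
def ambiguous_seq_numbers(k_bits, window_size):
    """Seq #s that could be either an old retransmission or new data."""
    space = 2 ** k_bits
    old_window = {i % space for i in range(window_size)}
    new_window = {i % space for i in range(window_size, 2 * window_size)}
    return sorted(old_window & new_window)


def sr_window_is_safe(k_bits, window_size):
    """True when the window is at most half the seq # space."""
    return window_size <= 2 ** k_bits // 2


print(ambiguous_seq_numbers(2, 3))   # seq #s 0..3, N = 3: overlap exists
print(ambiguous_seq_numbers(2, 2))   # seq #s 0..3, N = 2: no overlap
```

With seq #s 0, 1, 2, 3 and N = 3, the receiver cannot tell a retransmitted pkt 0 from a new pkt 0; with N = 2 the two windows never share a seq #.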
249. Why bother studying reliable data transfer?
We know it is provided by TCP, so why bother to
study it?
Sometimes we need to implement “some
form” of reliable transfer without heavyweight
TCP.
A good example is multimedia streaming. The
application is loss tolerant, but if too
many packets are lost, the visual quality suffers.
So we may want to implement some form of reliable
transfer.
At the very least, appreciate the “good services”
provided by some Internet gurus.
250. TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581
(The 800 lbs gorilla in the transport stack! PAY ATTENTION!!)
full duplex data:
bi-directional data flow in
same connection
MSS: maximum segment
size
connection-oriented:
handshaking (exchange of
control msgs) init’s
sender, receiver state
(e.g., buffer size) before
data exchange
flow controlled:
sender will not overwhelm
receiver
point-to-point:
one sender, one receiver
(not multicast)
reliable, in-order byte
stream:
no “message boundaries”
In App layer, we need
delimiters.
pipelined:
TCP congestion and flow
control set window size
send & receive buffers
socket
door
TCP
send buffer
TCP
receive buffer
socket
door
segment
application
writes data
application
reads data
251. TCP segment structure
32 bits
source port # dest port #
sequence number
acknowledgement number
Receive window
UAPR S F
checksum Urg data pnter
application
data
(variable length)
head
len
not
used
Options (variable length)
URG: urgent data
(generally not used)
ACK: ACK #
valid
PSH: push data now
(generally not used)
RST, SYN, FIN:
connection estab
(setup, teardown
commands)
counting
by “bytes”
of data
(not segments!)
# bytes
rcvr willing
to accept
Internet
checksum
(as in UDP)
Due to this
field we have a
variable length
header
252. TCP seq. #’s and ACKs
Negotiated during 3-way handshake
Seq. #’s:
byte stream
“number” of first
byte in segment’s
data
ACKs:
seq # of next byte
expected from other
side
cumulative ACK
Q: how does the receiver handle
out-of-order segments?
A: TCP spec doesn’t
say; it is up to the
implementor
Host A Host B
User
types
‘C’
host ACKs
receipt
of echoed
‘C’
host ACKs
receipt of
‘C’, echoes
back ‘C’
time
simple telnet scenario
253. TCP Round Trip Time and Timeout
Q: how to set TCP
timeout value?
longer than RTT
but RTT varies
too short: premature
timeout
unnecessary
retransmissions
too long: slow reaction
to segment loss, poor
performance.
Q: how to estimate RTT?
SampleRTT: measured time from
segment transmission until ACK
receipt
ignore retransmissions
SampleRTT will vary, want
estimated RTT “smoother”
average several recent
measurements, not just current
SampleRTT
[Figure: two tx / retx / ack timelines compared against Estimated RTT, one with a timeout that is too long and one with a timeout that is too short]
256. TCP Round Trip Time and Timeout
Setting the timeout (by Jacobson/Karels)
EstimatedRTT plus “safety margin”
large variation in EstimatedRTT -> larger safety margin
first estimate of how much SampleRTT deviates from
EstimatedRTT:
DevRTT = (1-β)*DevRTT +
β*|SampleRTT-EstimatedRTT|
(typically, β = 0.25)
Then set timeout interval:
TimeoutInterval = EstimatedRTT + 4*DevRTT
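A small Python sketch of the timeout computation. Several details are assumptions layered on top of the slide: the EstimatedRTT update is the usual exponentially weighted average with α = 0.125, the initial values (EstimatedRTT = first sample, DevRTT = half of it) and the update order (DevRTT first, using the old EstimatedRTT) follow RFC 6298, and the class name is illustrative:

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample        # assumption: start at first sample
        self.dev_rtt = first_sample / 2          # assumption: as in RFC 6298

    def update(self, sample_rtt):
        # DevRTT = (1-beta)*DevRTT + beta*|SampleRTT - EstimatedRTT|
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        # EstimatedRTT = (1-alpha)*EstimatedRTT + alpha*SampleRTT
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        # TimeoutInterval = EstimatedRTT + 4*DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=0.100)           # seconds
for sample in (0.200, 0.110, 0.105):
    print(f"sample={sample:.3f}s  timeout={est.update(sample):.4f}s")
```

A single slow sample widens the safety margin sharply, and the margin then shrinks again as the samples settle down.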
257. TCP reliable data transfer
TCP creates rdt
service on top of IP’s
unreliable service
Pipelined segments
(for performance)
Cumulative acks
TCP uses single
retransmission timer
Retransmissions are
triggered by:
timeout events
duplicate ack ( for
performance reason)
Initially consider
simplified TCP
sender:
ignore duplicate acks
ignore flow control,
congestion control
258. TCP sender events:
data rcvd from app:
Create segment with
seq #
seq # is byte-stream
number of first data
byte in segment
start timer if not
already running (think
of timer as for oldest
unacked segment)
expiration interval:
TimeOutInterval
timeout:
retransmit segment
that caused timeout
restart timer
Ack rcvd:
If acknowledges
previously unacked
segments
update what is known
to be acked
start timer if there are
outstanding segments
259. TCP sender (simplified)
NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum
loop (forever) {
    switch(event)
        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)
        event: timer timeout
            retransmit not-yet-acknowledged segment with
                smallest sequence number
            start timer
        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }
} /* end of loop forever */
Comment:
• SendBase-1: last
cumulatively
ack’ed byte
Example:
• SendBase-1 = 71;
y= 73, so the rcvr
wants 73+ ;
y > SendBase, so
that new data is
acked
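The three sender events can be sketched as a small Python class. This is an illustration only: the class and attribute names are made up, the `sent` list stands in for "pass segment to IP", the timer is just a flag, and seq # wraparound is ignored:

```python
class SimpleTcpSender:
    """Simplified TCP sender sketch: one timer, cumulative ACKs, no dup-ACK logic."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # of first byte -> segment data
        self.timer_running = False
        self.sent = []               # log of (seq #, data) passed to IP

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)  # not-yet-ACKed segment with smallest seq #
            self.sent.append((seq, self.unacked[seq]))
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:       # y ACKs new data
            self.send_base = y
            for seq in [s for s in self.unacked
                        if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


s = SimpleTcpSender(initial_seq_num=92)
s.on_app_data(b"x" * 8)      # bytes 92..99
s.on_app_data(b"y" * 20)     # bytes 100..119
s.on_ack(120)                # one cumulative ACK covers both segments
print(s.send_base, s.timer_running)
```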
260. TCP: retransmission scenarios
Host A
time
Host B
premature timeout
Seq=92 timeout
Host A
X
loss
timeout
Host B
lost ACK scenario
time
Seq=92 timeout
SendBase
= 100
Sendbase
= 100
SendBase
= 120
SendBase
= 120
261. TCP retransmission scenarios (more)
Host A
X
loss
timeout
Host B
Cumulative ACK scenario
time
SendBase
= 120
Room for improvement
262. TCP ACK generation [RFC 1122, RFC 2581]
Event at Receiver -> TCP Receiver action
Event: Arrival of in-order segment with expected seq #. All data up to
expected seq # already ACKed.
Action: Delayed ACK. Wait up to 500ms for next segment. If no next segment,
send ACK.
Event: Arrival of in-order segment with expected seq #. One other
segment has ACK pending.
Action: Immediately send single cumulative ACK, ACKing both in-order segments.
Event: Arrival of out-of-order segment with higher-than-expected seq. #.
Gap detected.
Action: Immediately send duplicate ACK, indicating seq. # of next expected byte.
Event: Arrival of segment that partially or completely fills gap.
Action: Immediately send ACK, provided that segment starts at lower end of gap.
Note: ACK the “largest in-order byte” seq #.
263. Fast Retransmit
Time-out period
often relatively long:
long delay before
resending lost packet
Detect lost segments
via duplicate ACKs.
Sender often sends
many segments back-to-
back
If segment is lost,
there will likely be
many duplicate ACKs.
If sender receives 3
duplicate ACKs for the same
data, it supposes that the
segment after the ACKed
data was lost:
fast retransmit: resend
segment before timer
expires
timeout
264. Fast retransmit algorithm:
event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y
        }
    }
a duplicate ACK for
already ACKed segment
fast retransmit
Q: why resend pkt
with seq # y?
A: That is what the
receiver expects!
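The duplicate-ACK counting can be sketched in Python as below. The class name and the `retransmitted` log are illustrative, the timer handling is omitted, and clearing the counts when new data is ACKed is an assumption of this sketch:

```python
from collections import Counter


class FastRetransmitSender:
    """Counts duplicate ACKs per ACK value and retransmits on the third one."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()     # ACK value y -> duplicate ACKs seen
        self.retransmitted = []       # seq #s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y        # new data ACKed
            self.dup_acks.clear()     # assumption: start counting afresh
        else:
            self.dup_acks[y] += 1     # a duplicate ACK for already ACKed data
            if self.dup_acks[y] == 3:
                self.retransmitted.append(y)   # resend segment with seq # y


s = FastRetransmitSender(send_base=100)
for _ in range(3):
    s.on_ack(100)                     # three duplicate ACKs for byte 100
print(s.retransmitted)
```

The segment starting at byte y is resent because y is exactly the next byte the receiver says it is waiting for.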
265. TCP Flow Control
receive side of TCP
connection has a
receive buffer:
flow control
speed-matching
service: matching
the send rate to the
receiving app’s drain
rate
app process may be
slow at reading from
buffer
sender won’t overflow
receiver’s buffer by
transmitting too
much,
too fast
266. TCP Flow control: how it works
(Suppose TCP receiver
discards out-of-order
segments)
spare room in buffer
= RcvWindow
= RcvBuffer-[LastByteRcvd -
LastByteRead]
Rcvr advertises spare
room by including
value of RcvWindow in
segments
Sender limits
unACKed data to
RcvWindow
guarantees receive
buffer doesn’t overflow
This goes to show that the design
process of header is important!!
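The bookkeeping on both sides can be sketched in a few lines of Python. The function names are illustrative, and the sender-side variables LastByteSent and LastByteAcked are assumed names that do not appear on this slide:

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises: RcvBuffer - (LastByteRcvd - LastByteRead)."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, advertised_window, nbytes):
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= advertised_window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)
print(sender_may_send(5000, 4000, window, 1000))   # fits in the spare room
print(sender_may_send(5000, 4000, window, 1200))   # would overflow the buffer
```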
267. TCP Connection Management
Recall: TCP sender,
receiver establish
“connection” before
exchanging data segments
initialize TCP variables:
seq. #s
buffers, flow control
info (e.g. RcvWindow)
client: connection initiator
Socket clientSocket = new
Socket("hostname", portNumber);
server: contacted by client
Socket connectionSocket =
welcomeSocket.accept();
Three way handshake:
Step 1: client host sends TCP SYN
segment to server
specifies initial seq #
no data
Step 2: server host receives SYN,
replies with SYN-ACK segment
server allocates buffers
specifies server initial seq. #
Step 3: client receives SYN-ACK,
replies with ACK segment,
which may contain data
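The seq and ACK numbers exchanged in the three steps can be sketched in Python. This only models the header fields: the function name and dictionary layout are illustrative, and the fact relied on is that a SYN consumes one sequence number, so each side ACKs the other's initial seq # plus one:

```python
SEQ_SPACE = 2 ** 32   # TCP sequence numbers are 32 bits wide


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYN-ACK and ACK segments as simple dictionaries."""
    syn = {"flags": {"SYN"}, "seq": client_isn}
    synack = {"flags": {"SYN", "ACK"},
              "seq": server_isn,
              "ack": (syn["seq"] + 1) % SEQ_SPACE}
    ack = {"flags": {"ACK"},
           "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (synack["seq"] + 1) % SEQ_SPACE}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(sorted(segment["flags"]), segment["seq"], segment.get("ack"))
```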
269. TCP Connection Management (cont.)
Closing a connection:
client closes socket:
clientSocket.close();
Step 1: client end system
sends TCP FIN control
segment to server
Step 2: server receives FIN,
replies with ACK. Closes
connection, sends FIN.
client server
close
close
timed wait
closed
Q: why don’t we
combine ACK and
FIN?
The server may still have
some data in the
pipeline!
270. TCP Connection Management (cont.)
Step 3: client receives FIN,
replies with ACK.
Enters “timed wait” - will
respond with ACK to
received FINs
Step 4: server, receives ACK.
Connection closed.
Note: with small modification,
can handle simultaneous FINs.
client server
closing
closing
timed wait
closed
closed