
Data Communication and
Computer Networks
SISAY TADESSE (MSC. IN
TELECOMMUNICATION ENGINEERING)

Outline
• Networking fundamentals
• Physical layer
• Link layer
• Network layer
• Transport layer
• Application layer

CCNA 1 v3.1 Module 2
Networking Fundamentals

Metropolitan-Area Network (MANs)

Virtual Private Networks (VPNs)

Bandwidth Limitations (Know this stuff)

Using Layers to Analyze Problems

Using Layers to Describe Data Communication

OSI Model (Important to know)
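Mnemonic, from Layer 7 down to Layer 1:
7 Application: Away
6 Presentation: Pizza
5 Session: Sausage
4 Transport: Throw
3 Network: Not
2 Data Link: Do
1 Physical: Programmers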

OSI Layers
Layer 3 (Network):
• Provides connectivity and path selection between two hosts
• Provides logical addressing
• No error correction; best-effort delivery

Encapsulation
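Application, Presentation, Session layers: Data
Transport layer: Segments
Network layer: Packets
Data Link layer: Frames
Physical layer: Bits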

Transmission Media
Physical Layer
(Transmission Media)

Transmission Media
Transmission medium:: the physical path
between transmitter and receiver.
 Repeaters or amplifiers may be used to extend
the length of the medium.
 Communication of electromagnetic waves is
guided or unguided.
Guided media :: waves are guided along a physical
path (e.g., twisted pair, coaxial cable, and optical
fiber).
Unguided media:: means for transmitting but not
guiding electromagnetic waves (e.g., the atmosphere
and outer space).

Transmission Media Choices
 Twisted pair
 Coaxial cable
 Optical fiber
 Wireless communications

Digital Transmission Media Bit Rates

Twisted Pair
 Two insulated wires arranged in a spiral pattern
 Copper or steel coated with copper
 The signal is transmitted through one wire and a
ground reference is transmitted in the other wire.
 Typically twisted pair is installed in building
telephone wiring.
 Local loop connection to central telephone
exchange is twisted pair.

Twisted Pair
• Limited in distance, bandwidth, and data rate due
to problems with attenuation, interference, and
noise
  - Issue: crosstalk due to interference from other signals
  - “shielding” the wire (shielded twisted pair (STP)) with
metallic braid or sheathing reduces interference.
  - “twisting” reduces low-frequency interference and
crosstalk.

UTP (Unshielded Twisted Pair)
Category 3 corresponds to ordinary voice-grade twisted pair
found in abundance in most office buildings.
Category 5 (used for Fast Ethernet) is much more tightly
twisted.

Digital Subscriber Line (DSL) [LG&W p.137]
Telephone companies originally transmitted within
the 0 to 4 kHz range to reduce crosstalk. Loading
coils were added within the subscriber loop to
provide a flatter transfer function, which further
improved voice transmission within the 3 kHz band
while increasing attenuation at the higher
frequencies.
ADSL (Asymmetric Digital Subscriber Line)
 Uses existing twisted pair lines to provide higher
bit rates that are possible with unloaded twisted
pairs (i.e., no loading coils on subscriber loop.)

DSL
ADSL: asymmetric, bidirectional digital transmissions [higher frequencies]
• the network transmits downstream at speeds
ranging from 1.536 Mbps to 6.144 Mbps
• users transmit upstream at speeds ranging
from 64 kbps to 640 kbps
• 0 to 4 kHz used for conventional analog telephone signals

DSL
• The ITU-T G.992.1 ADSL standard uses Discrete
Multitone (DMT), which divides the bandwidth into
a large number of small subchannels.
• A splitter is required to separate voice signals
from the data signal.
• The binary information is distributed among the
subchannels. Each subchannel uses QAM.
• DMT adapts to line conditions by avoiding
subchannels with poor SNR.

10BASE-T
10 Mbps baseband transmission over twisted pair.
Two pairs of Cat 3 cable, Manchester encoding.
Maximum distance to the Ethernet hub: 100 meters
Copyright ©2000 The McGraw Hill Companies Leon-Garcia & Widjaja: Communication Networks Figure 3.38

Coaxial Cable
[Figure: coaxial cable cross-section, from the inside out: center conductor, dielectric material, braided outer conductor, outer cover]

Coaxial Cable
 Discussion divided into two basic categories
for coax used in LANs:
 50-ohm cable [baseband]
 75-ohm cable [broadband or single channel
baseband]
 In general, coax has better noise immunity for
higher frequencies than twisted pair.
 Coaxial cable provides much higher
bandwidth than twisted pair.
 However, cable is ‘bulky’.

Baseband Coax
 50-ohm cable is used exclusively for digital
transmissions
 Uses Manchester encoding, geographical limit is a
few kilometers.
10Base5 Thick Ethernet :: thick (10 mm) coax
10 Mbps, 500 m. max segment length, 100
devices/segment, awkward to handle and install.
10Base2 Thin Ethernet :: thin (5 mm) coax
10 Mbps, 185 m. max segment length, 30
devices/segment, easier to handle, uses T-shaped
connectors.

Broadband Coax
• 75-ohm cable (CATV system standard)
• Used for both analog and digital signaling.
• Analog signaling: frequencies up to 500 MHz are
possible.
• When FDM is used, referred to as broadband.
• For long-distance transmission of analog signals,
amplifiers are needed every few kilometers.

Hybrid Fiber-Coaxial System
[Figure: a head end connects over upstream and downstream fibers to fiber nodes; each fiber node feeds a coaxial distribution plant with bidirectional split-band amplifiers]
Leon-Garcia & Widjaja: Communication Networks Figure 3.42 Copyright ©2000 The McGraw Hill Companies

Optical Fiber
 Optical fiber :: a thin flexible medium capable of
conducting optical rays. Optical fiber consists of a
very fine cylinder of glass (core) surrounded by
concentric layers of glass (cladding).
 a signal-encoded beam of light (a fluctuating
beam) is transmitted by total internal reflection.
 Total internal reflection occurs in the core
because it has a higher optical density (index of
refraction) than the cladding.
 Attenuation in the fiber can be kept low by
controlling the impurities in the glass.

[Figure 3.44, Optical Fiber: (a) Geometry of optical fiber (core, cladding, jacket); (b) Reflection of light in optical fiber]
Copyright ©2000 The McGraw Hill Companies Leon-Garcia & Widjaja: Communication Networks

Optical Fiber
• Lowest signal losses are for ultrapure fused silica,
but this is hard to manufacture.
• Optical fiber acts as a waveguide for
frequencies in the range 10^14 to 10^15 Hz, which
covers the visible and part of the infrared spectrum.
• Three standard wavelengths: 850 nanometers (nm),
1300 nm, 1550 nm.
• First-generation optical fiber :: 850 nm, tens of Mbps
using LED (light-emitting diode) sources.
• Second and third generation optical fiber :: 1300 and
1550 nm using ILD (injection laser diode) sources,
gigabits/sec.

Optical Fiber
• Attenuation loss is lower at higher wavelengths.
• There are two types of detectors (photodiodes) used at the
receiving end to convert light into electrical
energy:
  - PIN detectors: less expensive, less sensitive
  - APD detectors: more expensive, more sensitive
• ASK is commonly used to transmit digital data
over optical fiber (referred to as intensity
modulation).

Optical Fiber
• Three techniques:
  - Multimode step-index
  - Multimode graded-index
  - Single-mode step-index
• Presence of multiple paths → differences in
delay → optical rays interfere with each other.
• A narrow core can create a single direct path,
which yields higher speeds.
• WDM (Wavelength Division Multiplexing)
yields more available capacity.

[Figure: (a) Multimode fiber: multiple rays follow different paths (direct path and reflected path); (b) Single mode: only direct path propagates in fiber]

[Figure: radio spectrum used for wireless communication. Frequency axis from 10^4 to 10^12 Hz; wavelength axis from 10^4 to 10^-3 meters. Bands: LF, MF, HF, VHF, UHF, SHF, EHF. Services shown: AM radio, FM radio & TV, cellular & PCS, wireless cable, satellite & terrestrial microwave]

5: DataLink Layer
Link Layer: Introduction
Some terminology:
• hosts and routers are nodes
(bridges and switches too)
• communication channels
that connect adjacent nodes
along the communication path
are links
  - wired links
  - wireless links
  - LANs
• the layer-2 PDU is a frame;
it encapsulates a datagram
The data-link layer has the responsibility of
transferring a datagram from one node
to an adjacent node over a link.

Link layer: context
 Datagram transferred
by different link
protocols over
different links:
 e.g., Ethernet on first
link, frame relay on
intermediate links, 802.11
on last link
 Each link protocol
provides different
services
 e.g., may or may not
provide rdt over link
transportation analogy
 trip from Princeton to
Lausanne
 limo: Princeton to JFK
 plane: JFK to Geneva
 train: Geneva to Lausanne
 tourist = datagram
 transport segment =
communication link
 transportation mode =
link layer protocol
 travel agent = routing
algorithm

Link Layer Services
 Framing, link access:
 encapsulate datagram into frame, adding header,
trailer
 channel access if shared medium
 ‘physical addresses’ used in frame headers to identify
source, dest
 different from IP address!
 Reliable delivery between adjacent nodes
 we learned how to do this already (chapter 3)!
 seldom used on low bit error link (fiber, some twisted
pair)
 wireless links: high error rates
 Q: why both link-level and end-end reliability?

Link Layer Services (more)
 Flow Control:
 pacing between adjacent sending and receiving nodes
 Error Detection:
 errors caused by signal attenuation, noise.
 receiver detects presence of errors:
 signals sender for retransmission or drops frame
 Error Correction:
 receiver identifies and corrects bit error(s) without
resorting to retransmission
 Half-duplex and full-duplex
 with half duplex, nodes at both ends of link can transmit,
but not at same time

Adaptors Communicating
• link layer implemented in “adaptor” (aka NIC)
  - Ethernet card, PCMCIA card, 802.11 card
• sending side:
  - encapsulates datagram in a frame
  - adds error checking bits, rdt, flow control, etc.
• receiving side:
  - looks for errors, rdt, flow control, etc.
  - extracts datagram, passes to receiving node
• adaptor is semi-autonomous
• link & physical layers
[Figure: the sending node hands a datagram to its adaptor; the two adaptors exchange a frame under the link layer protocol; the receiving adaptor hands the datagram to the receiving node]

Error Detection
EDC= Error Detection and Correction bits (redundancy)
D = Data protected by error checking, may include header
fields
• Error detection not 100% reliable!
• protocol may miss some errors, but rarely
• larger EDC field yields better detection and correction

Parity Checking
Single Bit Parity: detect single bit errors
Two Dimensional Bit Parity: detect and correct single bit errors
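
The two-dimensional scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: even parity is assumed, and the function names (`parity_bit`, `encode_2d`, `correct_single_error`) and the sample data are hypothetical, not taken from the slides.

```python
def parity_bit(bits):
    """Even-parity bit: 1 if the number of 1s in bits is odd, else 0."""
    return sum(bits) % 2

def encode_2d(rows):
    """Append a parity bit to every row, then a parity row over all columns."""
    with_row_parity = [row + [parity_bit(row)] for row in rows]
    column_parity = [parity_bit(col) for col in zip(*with_row_parity)]
    return with_row_parity + [column_parity]

def correct_single_error(block):
    """Flip the single bad bit in place and return its (row, col), or None."""
    bad_rows = [i for i, row in enumerate(block) if parity_bit(row) != 0]
    bad_cols = [j for j, col in enumerate(zip(*block)) if parity_bit(col) != 0]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, j = bad_rows[0], bad_cols[0]
        block[i][j] ^= 1
        return (i, j)
    return None

data = [[1, 0, 1, 0, 1],
        [1, 1, 1, 1, 0],
        [0, 1, 1, 1, 0]]
sent = encode_2d(data)
received = [row[:] for row in sent]
received[1][1] ^= 1                      # simulate one bit flipped in transit
fixed_at = correct_single_error(received)
```

A single flipped bit breaks exactly one row parity and one column parity, and their intersection locates the bit. With single-bit parity only the row check exists, so the error can be detected but not located.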

Internet checksum
Goal: detect “errors” (e.g., flipped bits) in
transmitted segment (note: used at the transport
layer, not at the link layer)
Sender:
 treat segment contents
as sequence of 16-bit
integers
 checksum: addition (1’s
complement sum) of
segment contents
 sender puts checksum
value into UDP
checksum field
Receiver:
 compute checksum of
received segment
 check if computed checksum
equals checksum field value:
 NO - error detected
 YES - no error detected. But
maybe errors nonetheless?
More later ….
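
A minimal Python sketch of the sender and receiver steps above. Two details are assumptions that the slide does not spell out: an odd-length segment is padded with a zero byte, and the value placed in the checksum field is the bitwise complement of the 1's complement sum. The function names are hypothetical.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit big-endian integers, wrapping carries around."""
    if len(data) % 2:
        data += b"\x00"                  # assumed padding for odd lengths
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)
    return total

def internet_checksum(data: bytes) -> int:
    """Sender side: complement of the 1's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF

def verify(data: bytes, checksum: int) -> bool:
    """Receiver side: the sum plus the checksum must be all ones."""
    total = ones_complement_sum16(data) + checksum
    total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF

segment = b"hello world"
field = internet_checksum(segment)
```

The "maybe errors nonetheless" case is real: two changes that cancel in the sum (for example, swapping two 16-bit words) leave the checksum unchanged.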

Checksumming: Cyclic Redundancy Check
• view data bits, D, as a binary number
• choose r+1 bit pattern (generator), G
• goal: choose r CRC bits, R, such that
  - <D,R> exactly divisible by G (modulo 2)
  - receiver knows G, divides <D,R> by G. If non-zero
remainder: error detected!
  - can detect all burst errors less than r+1 bits
• widely used in practice (ATM, HDLC)

CRC Example
Want:
$D \cdot 2^r \oplus R = nG$
equivalently:
$D \cdot 2^r = nG \oplus R$
equivalently:
if we divide $D \cdot 2^r$ by $G$, want remainder $R$:
$R = \text{remainder}\left[\dfrac{D \cdot 2^r}{G}\right]$
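
The remainder can be computed with modulo-2 long division, where subtraction is XOR. The sketch below is illustrative: bit strings stand in for D and G, the function names are hypothetical, and the sample values D = 101110, G = 1001 are assumed for the example rather than taken from the slide.

```python
def mod2_remainder(bits: str, generator: str) -> str:
    """Remainder of bits divided by generator, using XOR as subtraction."""
    r = len(generator) - 1
    work = [int(b) for b in bits]
    gen = [int(b) for b in generator]    # leading generator bit must be 1
    for i in range(len(work) - r):
        if work[i]:
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-r:])

def crc_bits(data: str, generator: str) -> str:
    """R = remainder of D * 2^r divided by G."""
    r = len(generator) - 1
    return mod2_remainder(data + "0" * r, generator)

D, G = "101110", "1001"
R = crc_bits(D, G)                       # sender appends R to D
receiver_remainder = mod2_remainder(D + R, G)
```

The receiver runs the same division over <D,R>; an all-zero remainder means no error was detected.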

Multiple Access Links and Protocols
Two types of “links”:
 point-to-point
 PPP for dial-up access
 point-to-point link between Ethernet switch and host
 broadcast (shared wire or medium)
 traditional Ethernet
 upstream HFC
 802.11 wireless LAN

Multiple Access protocols
 single shared broadcast channel
 two or more simultaneous transmissions by
nodes: interference
 only one node can send successfully at a time
multiple access protocol
 distributed algorithm that determines how nodes
share channel, i.e., determine when node can
transmit
 communication about channel sharing must use
channel itself!
 what to look for in multiple access protocols:

Ideal Multiple Access Protocol
Broadcast channel of rate R bps
1. When one node wants to transmit, it can send at
rate R.
2. When M nodes want to transmit, each can send at
average rate R/M
3. Fully decentralized:
 no special node to coordinate transmissions
 no synchronization of clocks, slots
4. Simple

MAC Protocols: a taxonomy
Three broad classes:
 Channel Partitioning
 divide channel into smaller “pieces” (time slots,
frequency, code)
 allocate piece to node for exclusive use
 Random Access
 channel not divided, allow collisions
 “recover” from collisions
 “Taking turns”
 tightly coordinate shared access to avoid collisions

Channel Partitioning MAC protocols: TDMA
TDMA: time division multiple access
 access to channel in "rounds"
 each station gets fixed length slot (length = pkt trans time) in each round
 unused slots go idle
 example: 6-station LAN, 1,3,4 have pkt, slots 2,5,6 idle
 TDM (Time Division Multiplexing): channel divided into N time
slots, one per user; inefficient with low duty cycle users and at
light load.
 FDM (Frequency Division Multiplexing): frequency subdivided.

Channel Partitioning MAC protocols: FDMA
FDMA: frequency division multiple access
 channel spectrum divided into frequency bands
 each station assigned fixed frequency band
 unused transmission time in frequency bands go idle
 example: 6-station LAN, 1,3,4 have pkt, frequency bands 2,5,6 idle

Channel Partitioning (CDMA)
CDMA (Code Division Multiple Access)
 unique “code” assigned to each user; i.e., code set
partitioning
 used mostly in wireless broadcast channels (cellular,
satellite, etc)
 all users share same frequency, but each user has own
“chipping” sequence (i.e., code) to encode data
 encoded signal = (original data) X (chipping sequence)
 decoding: inner-product of encoded signal and chipping
sequence
 allows multiple users to “coexist” and transmit
simultaneously with minimal interference (if codes are
“orthogonal”)
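
A small Python sketch of the encode and decode rules above, with two senders sharing the channel. The ±1 representation of data bits, the 4-chip codes, and all names are assumptions chosen for illustration; the two codes are orthogonal (their inner product is 0).

```python
def encode(bits, code):
    """Encoded signal = each data bit (+1 or -1) times the chipping sequence."""
    return [bit * chip for bit in bits for chip in code]

def combine(*signals):
    """The shared channel adds the simultaneous transmissions chip by chip."""
    return [sum(chips) for chips in zip(*signals)]

def decode(channel, code):
    """Inner product of each bit slot with the code, normalised by its length."""
    m = len(code)
    decoded = []
    for start in range(0, len(channel), m):
        inner = sum(s * c for s, c in zip(channel[start:start + m], code))
        decoded.append(inner // m)
    return decoded

code_a = [1, 1, -1, -1]
code_b = [1, -1, 1, -1]                  # orthogonal to code_a
data_a = [1, -1, 1]
data_b = [-1, -1, 1]
channel = combine(encode(data_a, code_a), encode(data_b, code_b))
```

Because the codes are orthogonal, the other sender's contribution cancels in the inner product, so each receiver recovers its own data from the summed signal.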

CDMA Encode/Decode

CDMA: two-sender interference

Random Access Protocols
 When node has packet to send
 transmit at full channel data rate R.
 no a priori coordination among nodes
 two or more transmitting nodes -> “collision”,
 random access MAC protocol specifies:
 how to detect collisions
 how to recover from collisions (e.g., via delayed
retransmissions)
 Examples of random access MAC protocols:
 slotted ALOHA
 ALOHA
 CSMA, CSMA/CD, CSMA/CA

Slotted ALOHA
Assumptions
 all frames same size
 time is divided into
equal size slots, time to
transmit 1 frame
 nodes start to transmit
frames only at
beginning of slots
 nodes are synchronized
 if 2 or more nodes
transmit in slot, all
nodes detect collision
Operation
 when node obtains fresh
frame, it transmits in
next slot
 no collision, node can
send new frame in next
slot
 if collision, node
retransmits frame in
each subsequent slot
with prob. p until
success

Slotted ALOHA
Pros
 single active node can
continuously transmit
at full rate of channel
 highly decentralized:
only slots in nodes
need to be in sync
 simple
Cons
 collisions, wasting
slots
 idle slots
 nodes may be able to
detect collision in less
than time to transmit
packet

Slotted Aloha efficiency
Efficiency is the long-run fraction of successful slots
when there are many nodes, each with many frames to send.
• Suppose N nodes with many frames to send; each
transmits in a slot with probability p
• prob that 1st node has success in a slot = $p(1-p)^{N-1}$
• prob that any node has a success = $Np(1-p)^{N-1}$
• For max efficiency with N nodes, find p* that maximizes
$Np(1-p)^{N-1}$
• For many nodes, take the limit of $Np^*(1-p^*)^{N-1}$
as N goes to infinity, which gives 1/e = .37
At best: channel used for useful transmissions 37% of time!
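
The limit can be checked numerically. In the sketch below, p* = 1/N is the maximiser obtained by setting the derivative of $Np(1-p)^{N-1}$ to zero; the slide does not state it, so treat it as a worked step. The function names are hypothetical.

```python
import math

def slotted_aloha_efficiency(n, p):
    """Probability that exactly one of n nodes transmits in a slot."""
    return n * p * (1 - p) ** (n - 1)

def best_slotted_efficiency(n):
    """Efficiency at the maximising probability p* = 1/n."""
    return slotted_aloha_efficiency(n, 1 / n)

for n in (2, 10, 100, 1000):
    print(n, round(best_slotted_efficiency(n), 4))
print("1/e =", round(1 / math.e, 4))
```

The printed values fall toward 1/e as N grows, so 37% is the large-N limit rather than a bound that applies at small N.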

Pure (unslotted) ALOHA
• unslotted Aloha: simpler, no synchronization
• when frame first arrives
  - transmit immediately
• collision probability increases:
  - frame sent at t0 collides with other frames sent in [t0-1, t0+1]

Pure Aloha efficiency
$P(\text{success by given node}) = P(\text{node transmits}) \cdot P(\text{no other node transmits in } [t_0-1, t_0]) \cdot P(\text{no other node transmits in } [t_0, t_0+1])$
$= p \cdot (1-p)^{N-1} \cdot (1-p)^{N-1}$
$= p \cdot (1-p)^{2(N-1)}$
… choosing optimum p and then letting $N \to \infty$ gives
efficiency $= 1/(2e) = .18$
Even worse!
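
A numerical check of the 1/(2e) limit, as a sketch. Two steps are filled in here and are not on the slide: the channel efficiency is N times the per-node success probability, and the maximising probability is p* = 1/(2N-1), from setting the derivative to zero. The function names are hypothetical.

```python
import math

def pure_aloha_efficiency(n, p):
    """n times the per-node success probability p(1-p)^(2(n-1))."""
    return n * p * (1 - p) ** (2 * (n - 1))

def best_pure_efficiency(n):
    """Efficiency at the maximising probability p* = 1/(2n - 1)."""
    return pure_aloha_efficiency(n, 1 / (2 * n - 1))

for n in (2, 10, 100, 1000):
    print(n, round(best_pure_efficiency(n), 4))
print("1/(2e) =", round(1 / (2 * math.e), 4))
```

The vulnerable interval is two frame times long instead of one, which is why the limit is half of the slotted ALOHA value.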

CSMA (Carrier Sense Multiple Access)
CSMA: listen before transmit:
 If channel sensed idle: transmit entire frame
 If channel sensed busy, defer transmission
 Human analogy: don’t interrupt others!

CSMA collisions
collisions can still occur:
propagation delay means two nodes may not hear
each other’s transmission
collision:
entire packet transmission time wasted
note:
role of distance & propagation delay in determining
collision probability
[Figure: space-time diagram of a collision; the horizontal axis shows the spatial layout of nodes]

CSMA/CD (Collision Detection)
CSMA/CD: carrier sensing, deferral as in
CSMA
 collisions detected within short time
 colliding transmissions aborted, reducing channel
wastage
 collision detection:
 easy in wired LANs: measure signal strengths,
compare transmitted, received signals
 difficult in wireless LANs: receiver shut off while
transmitting
 human analogy: the polite conversationalist

CSMA/CD collision detection

“Taking Turns” MAC protocols
channel partitioning MAC protocols:
 share channel efficiently and fairly at high load
 inefficient at low load: delay in channel access, 1/N
bandwidth allocated even if only 1 active node!
Random access MAC protocols
 efficient at low load: single node can fully utilize
channel
 high load: collision overhead
“taking turns” protocols
look for best of both worlds!

“Taking Turns” MAC protocols
Polling:
 master node
“invites” slave
nodes to transmit
in turn
 concerns:
 polling overhead
 latency
 single point of
failure (master)
Token passing:
 control token passed
from one node to next
sequentially.
 token message
 concerns:
 token overhead
 latency
 single point of failure
(token)

Summary of MAC protocols
• What do you do with a shared medium?
  - Channel Partitioning, by time, frequency, or code
    Time Division, Code Division, Frequency Division
  - Random partitioning (dynamic)
    ALOHA, S-ALOHA, CSMA, CSMA/CD
    carrier sensing: easy in some technologies (wire), hard in
others (wireless)
    CSMA/CD used in Ethernet
  - Taking Turns
    polling from a central site, token passing
LAN technologies
Data link layer so far:
 services, error detection/correction, multiple
access
Next: LAN technologies
 addressing
 Ethernet
 hubs, bridges, switches
 802.11
 PPP
 ATM

LAN Addresses and ARP
32-bit IP address:
 network-layer address
 used to get datagram to destination IP network (recall IP
network definition)
LAN (or MAC or physical or Ethernet) address:
 used to get datagram from one interface to another
physically-connected interface (same network)
 48 bit MAC address (for most LANs)
burned in the adapter ROM

LAN Addresses and ARP
Each adapter on LAN has unique LAN address

LAN Address (more)
 MAC address allocation administered by IEEE
 manufacturer buys portion of MAC address space (to assure
uniqueness)
 Analogy:
(a) MAC address: like Social Security Number
(b) IP address: like postal address
 MAC flat address => portability
 can move LAN card from one LAN to another
 IP hierarchical address NOT portable
 depends on IP network to which node is attached

Recall earlier routing discussion
[Figure: hosts A, B, and E on three IP networks joined by a router; addresses shown: 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]
Starting at A, given IP datagram addressed to B:
• look up net. address of B, find B on same net. as A
• link layer sends datagram to B inside link-layer frame
Frame layout: | B’s MAC addr | A’s MAC addr | A’s IP addr | B’s IP addr | IP payload |
• frame source, dest address: A’s MAC addr, B’s MAC addr
• datagram source, dest address: A’s IP addr, B’s IP addr

ARP: Address Resolution Protocol
 Each IP node (Host,
Router) on LAN has ARP
table
 ARP Table: IP/MAC
address mappings for
some LAN nodes
< IP address; MAC address; TTL>
 TTL (Time To Live): time
after which address mapping
will be forgotten (typically
20 min)
Question: how to determine
MAC address of B
knowing B’s IP address?

ARP protocol
 A wants to send datagram to B,
and A knows B’s IP address.
 Suppose B’s MAC address is
not in A’s ARP table.
 A broadcasts ARP query packet,
containing B's IP address
 all machines on LAN receive
ARP query
 B receives ARP packet, replies
to A with its (B's) MAC address
 frame sent to A’s MAC address
(unicast)
 A caches (saves) IP-to-MAC
address pair in its ARP table
until information becomes old
(times out)
 soft state: information that
times out (goes away) unless
refreshed
 ARP is “plug-and-play”:
 nodes create their ARP tables
without intervention from
net administrator
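
The soft-state behaviour of the ARP table can be sketched as a cache whose entries expire. This is an illustration, not how any particular operating system implements ARP: the class name `ArpTable`, its methods, and the injectable clock are hypothetical, and the 20-minute default is the typical TTL quoted earlier.

```python
import time

class ArpTable:
    """IP-to-MAC mappings that are forgotten once their TTL expires."""

    def __init__(self, ttl_seconds=20 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}               # IP address -> (MAC address, expiry time)

    def learn(self, ip, mac):
        """Cache (or refresh) a mapping, e.g. after receiving an ARP reply."""
        self._entries[ip] = (mac, self._clock() + self._ttl)

    def lookup(self, ip):
        """Return the cached MAC, or None if unknown or timed out."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[ip]        # soft state: stale entries go away
            return None
        return mac

now = [0.0]                              # fake clock so the example is repeatable
table = ArpTable(clock=lambda: now[0])
table.learn("111.111.111.110", "E6-E9-00-17-BB-4B")
fresh = table.lookup("111.111.111.110")
now[0] += 21 * 60                        # 21 minutes later
stale = table.lookup("111.111.111.110")
```

A `None` result is the point at which a node would broadcast a new ARP query.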

Routing to another LAN
walkthrough: send datagram from A to B via R
assume A knows B’s IP address
• Two ARP tables in router R, one for each IP network (LAN)
• In routing table at source host, find router 111.111.111.110
• In ARP table at source, find MAC address E6-E9-00-17-BB-4B, etc.

• A creates datagram with source A, destination B
• A uses ARP to get R’s MAC address for 111.111.111.110
• A creates link-layer frame with R's MAC address as dest; frame
contains A-to-B IP datagram
• A’s data link layer sends frame
• R’s data link layer receives frame
• R removes IP datagram from Ethernet frame, sees it is destined to
B
• R uses ARP to get B’s physical (MAC) address
• R creates frame containing A-to-B IP datagram, sends to B
[Figure: host A and host B on separate LANs joined by router R]

Ethernet
“dominant” LAN technology:
• cheap: $20 for 100 Mbps!
• first widely used LAN technology
• Simpler, cheaper than token LANs and ATM
• Kept up with speed race: 10, 100, 1000 Mbps
[Figure: Metcalfe’s Ethernet sketch]

Ethernet Frame Structure
Sending adapter encapsulates IP datagram (or other
network layer protocol packet) in Ethernet frame
Preamble:
 7 bytes with pattern 10101010 followed by one byte
with pattern 10101011
 used to synchronize receiver, sender clock rates

Ethernet Frame Structure (more)
• Addresses: 6 bytes
  - if adapter receives frame with matching destination address, or with
broadcast address (e.g., ARP packet), it passes data in frame to net-layer
protocol
  - otherwise, adapter discards frame
• Type: indicates the higher layer protocol (mostly IP, but
others may be supported, such as Novell IPX and
AppleTalk)
• CRC: checked at receiver; if error is detected, the frame is
simply dropped
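
The address and type fields can be illustrated with Python's `struct` module. This sketch assumes the frame bytes start at the destination address (preamble and CRC already handled by the adapter hardware); the helper names and the sender address are hypothetical. The type value 0x0800 is the one assigned to IPv4.

```python
import struct

BROADCAST = b"\xff" * 6

def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes in the dashed hex notation used on the slides."""
    return "-".join(f"{b:02X}" for b in raw)

def parse_header(frame: bytes):
    """Split off the 14-byte header: destination, source, type."""
    dest, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dest, src, ethertype, frame[14:]

def adapter_accepts(frame: bytes, own_mac: bytes) -> bool:
    """Pass the frame up only for a matching or broadcast destination."""
    dest = frame[:6]
    return dest == own_mac or dest == BROADCAST

own = bytes.fromhex("E6E90017BB4B")
sender = bytes.fromhex("1A23F9CD069B")   # hypothetical sender address
frame = own + sender + struct.pack("!H", 0x0800) + b"datagram bytes"
dest, src, ethertype, payload = parse_header(frame)
```

Any other destination address causes the adapter to discard the frame without involving the network layer.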

Unreliable, connectionless service
 Connectionless: No handshaking between sending and
receiving adapter.
 Unreliable: receiving adapter doesn’t send acks or nacks to
sending adapter
 stream of datagrams passed to network layer can have gaps
 gaps will be filled if app is using TCP
 otherwise, app will see the gaps

Ethernet uses CSMA/CD
 No slots
 adapter doesn’t transmit if
it senses that some other
adapter is transmitting, that
is, carrier sense
 transmitting adapter aborts
when it senses that another
adapter is transmitting, that
is, collision detection
 Before attempting a
retransmission, adapter
waits a random time,
that is, random access

Ethernet CSMA/CD algorithm
1. Adapter gets datagram from network layer
and creates frame
2. If adapter senses channel
idle, it starts to transmit
frame. If it senses channel
busy, it waits until channel is
idle and then transmits
3. If adapter transmits entire
frame without detecting
another transmission, the
adapter is done with the frame!
4. If adapter detects another
transmission while
transmitting, it aborts and
sends jam signal
5. After aborting, adapter
enters exponential
backoff: after the mth
collision, adapter chooses a
K at random from
{0, 1, 2, …, 2^m - 1}. Adapter
waits K*512 bit times and
returns to Step 2
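
Step 5 can be sketched as follows. The cap of m at 10 is an assumption taken from standard Ethernet behaviour (it matches a largest range of 0 to 1023), and the function names are hypothetical.

```python
import random

SLOT_BIT_TIMES = 512

def backoff_bit_times(collisions, rng=random):
    """Wait K * 512 bit times, with K drawn from {0, ..., 2^m - 1}."""
    m = min(collisions, 10)              # assumed cap on the exponent
    k = rng.randrange(2 ** m)
    return k * SLOT_BIT_TIMES

def backoff_seconds(collisions, bit_rate_bps=10_000_000, rng=random):
    """The same wait expressed in seconds for a given bit rate."""
    return backoff_bit_times(collisions, rng) / bit_rate_bps

longest_wait = 1023 * SLOT_BIT_TIMES / 10_000_000   # K = 1023 at 10 Mbps
```

Each further collision doubles the range K is drawn from, so the expected wait grows with the estimated load.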

Ethernet’s CSMA/CD (more)
Jam Signal: make sure all other
transmitters are aware of
collision; 48 bits
Bit time: .1 microsec for 10
Mbps Ethernet;
for K=1023, wait time is
about 50 msec
Exponential Backoff:
• Goal: adapt retransmission
attempts to estimated current
load
  - heavy load: random wait will
be longer
• first collision: choose K from
{0,1}; delay is K x 512 bit
transmission times
• after second collision: choose
K from {0,1,2,3}…
• after ten collisions, choose K
from {0,1,2,3,4,…,1023}
See/interact with Java
applet on AWL Web site:
highly recommended!

CSMA/CD efficiency
• $t_{prop}$ = max propagation time between 2 nodes in LAN
• $t_{trans}$ = time to transmit max-size frame
$$\text{efficiency} = \frac{1}{1 + 5\,t_{prop}/t_{trans}}$$
• Efficiency goes to 1 as $t_{prop}$ goes to 0
• Goes to 1 as $t_{trans}$ goes to infinity
• Much better than ALOHA, but still decentralized, simple, and cheap
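
A quick way to get a feel for the formula is to evaluate it. In this sketch the 1518-byte frame and the 25.6 microsecond propagation delay are assumed example values, not figures from the slides, and the function name is hypothetical.

```python
def csma_cd_efficiency(t_prop, t_trans):
    """efficiency = 1 / (1 + 5 * t_prop / t_trans); both times in seconds."""
    return 1.0 / (1.0 + 5.0 * t_prop / t_trans)

bit_rate = 10_000_000                    # 10 Mbps Ethernet
t_trans = 1518 * 8 / bit_rate            # assumed max-size frame of 1518 bytes
t_prop = 25.6e-6                         # assumed worst-case propagation delay
example = csma_cd_efficiency(t_prop, t_trans)
```

Shorter cables (smaller t_prop) or longer frames (larger t_trans) both push the result toward 1.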



Ethernet Technologies: 10Base2
 10: 10Mbps; 2: under 200 meters max cable length
 thin coaxial cable in a bus topology
 repeaters used to connect up to multiple segments
 repeater repeats bits it hears on one interface to its other interfaces: physical layer device only!
 has become a legacy technology

10BaseT and 100BaseT
• 10/100 Mbps rate; latter called “fast ethernet”
• T stands for Twisted Pair
• Nodes connect to a hub: “star topology”; 100 m max distance between nodes and hub
• Hubs are essentially physical-layer repeaters:
  - bits coming in one link go out all other links
  - no frame buffering
  - no CSMA/CD at hub: adapters detect collisions
  - provides net management functionality
[Figure: nodes star-connected to a hub]

Manchester encoding
 Used in 10BaseT, 10Base2
 Each bit has a transition
 Allows clocks in sending and receiving nodes to
synchronize to each other
 no need for a centralized, global clock among nodes!
 Hey, this is physical-layer stuff!
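
A short sketch of the "each bit has a transition" idea. It assumes the IEEE 802.3 convention (1 is sent as low-then-high, 0 as high-then-low); the opposite convention also exists, and the function names are hypothetical. Signal levels are written as 0 and 1 per half bit time.

```python
def manchester_encode(bits):
    """Emit two half-bit levels per data bit, with a transition in the middle."""
    signal = []
    for bit in bits:
        signal.extend((0, 1) if bit else (1, 0))
    return signal

def manchester_decode(signal):
    """Recover data bits from the direction of each mid-bit transition."""
    bits = []
    for first, second in zip(signal[0::2], signal[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(1 if (first, second) == (0, 1) else 0)
    return bits

encoded = manchester_encode([1, 0, 1, 1])
decoded = manchester_decode(encoded)
```

The guaranteed transition lets the receiver recover the clock, at the cost of two signal levels per data bit.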

Gbit Ethernet
 use standard Ethernet frame format
 allows for point-to-point links and shared broadcast
channels
 in shared mode, CSMA/CD is used; short distances
between nodes to be efficient
 uses hubs, called here “Buffered Distributors”
 Full-Duplex at 1 Gbps for point-to-point links
 10 Gbps now !

Interconnecting LAN segments
 Hubs
 Bridges
 Switches
 Remark: switches are essentially multi-port bridges.
 What we say about bridges also holds for switches!

Interconnecting with hubs
• Backbone hub interconnects LAN segments
• Extends max distance between nodes
• But individual segment collision domains become one large collision
domain
  - if a node in CS and a node in EE transmit at same time: collision
• Can’t interconnect 10BaseT & 100BaseT

Bridges
 Link layer device
 stores and forwards Ethernet frames
 examines frame header and selectively forwards frame
based on MAC dest address
 when frame is to be forwarded on segment, uses
CSMA/CD to access segment
 transparent
 hosts are unaware of presence of bridges
 plug-and-play, self-learning
 bridges do not need to be configured

Bridges: traffic isolation
 Bridge installation breaks LAN into LAN segments
 bridges filter packets:
 same-LAN-segment frames not usually forwarded onto
other LAN segments
 segments become separate collision domains
[Figure: a bridge joins two LAN segments, each built from a hub and its hosts; each LAN segment is its own collision domain, and together they form one LAN (IP network)]

Forwarding
How does a bridge determine to which LAN segment to forward a
frame?
• Looks like a routing problem...

Self learning
 A bridge has a bridge table
 entry in bridge table:
 (Node LAN Address, Bridge Interface, Time Stamp)
 stale entries in table dropped (TTL can be 60 min)
 bridges learn which hosts can be reached through which
interfaces
 when frame received, bridge “learns” location of sender:
incoming LAN segment
 records sender/location pair in bridge table

Filtering/Forwarding
When bridge receives a frame:
  index bridge table using MAC dest address
  if entry found for destination
    then {
      if dest on segment from which frame arrived
        then drop the frame
        else forward the frame on interface indicated
    }
    else flood
      (forward on all but the interface
      on which the frame arrived)

Bridge example
Suppose C sends frame to D and D replies back with
frame to C.
• Bridge receives frame from C
  - notes in bridge table that C is on interface 1
  - because D is not in table, bridge sends frame into interfaces
2 and 3
• frame received by D

Bridge Learning: example
 D generates frame for C, sends
 bridge receives frame
 notes in bridge table that D is on interface 2
 bridge knows C is on interface 1, so selectively forwards
frame to interface 1
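
The learning and filtering/forwarding rules, replayed on the C and D example, can be sketched as below. The class and method names are hypothetical, and aging of stale table entries is left out to keep the sketch short.

```python
class LearningBridge:
    """Self-learning bridge: learns sender locations, then filters or forwards."""

    def __init__(self, interfaces):
        self.interfaces = list(interfaces)
        self.table = {}                  # MAC address -> interface last seen on

    def receive(self, src, dest, arrived_on):
        """Return the list of interfaces the frame is forwarded on."""
        self.table[src] = arrived_on     # learn where the sender lives
        known = self.table.get(dest)
        if known is None:                # unknown destination: flood
            return [i for i in self.interfaces if i != arrived_on]
        if known == arrived_on:          # destination on the arrival segment: drop
            return []
        return [known]                   # selective forwarding

bridge = LearningBridge([1, 2, 3])
first = bridge.receive("C", "D", arrived_on=1)   # D unknown, so flood
reply = bridge.receive("D", "C", arrived_on=2)   # C was learned on interface 1
```

After these two frames the table holds both C and D, so later traffic between them is no longer flooded.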

Interconnection without backbone
• Not recommended for two reasons:
- single point of failure at Computer Science hub
- all traffic between EE and SE must pass over CS segment

Backbone configuration
Recommended !

Bridges Spanning Tree
 for increased reliability, desirable to have redundant,
alternative paths from source to dest
 with multiple paths, cycles result - bridges may multiply
and forward frame forever
 solution: organize bridges in a spanning tree by disabling
subset of interfaces
[Figure: bridged LAN with redundant paths; one interface is marked "Disabled" to break the cycle]

Some bridge features
 Isolates collision domains resulting in higher total
max throughput
 limitless number of nodes and geographical coverage
 Can connect different Ethernet types
 Transparent (“plug-and-play”): no configuration
necessary

Bridges vs. Routers
 both store-and-forward devices
 routers: network layer devices (examine network layer headers)
 bridges are link layer devices
 routers maintain routing tables, implement routing
algorithms
 bridges maintain bridge tables, implement filtering, learning
and spanning tree algorithms

Routers vs. Bridges
Bridges + and -
+ Bridge operation is simpler requiring less packet processing
+ Bridge tables are self learning
- All traffic confined to spanning tree, even when alternative
bandwidth is available
- Bridges do not offer protection from broadcast storms

Routers vs. Bridges
Routers + and -
+ arbitrary topologies can be supported, cycling is limited
by TTL counters (and good routing protocols)
+ provide protection against broadcast storms
- require IP address configuration (not plug and play)
- require higher packet processing
 bridges do well in small (few hundred hosts) while
routers used in large networks (thousands of hosts)

Ethernet Switches
 Essentially a multi-interface
bridge
 layer 2 (frame) forwarding,
filtering using LAN addresses
 Switching: A-to-A’ and B-to-
B’ simultaneously, no
collisions
 large number of interfaces
 often: individual hosts, star-connected
into switch
 Ethernet, but no collisions!

Ethernet Switches
 cut-through switching: frame forwarded from input
to output port without waiting for assembly of
entire frame
 slight reduction in latency
 combinations of shared/dedicated, 10/100/1000
Mbps interfaces

Not an atypical LAN (IP network)
Dedicated
Shared

Summary comparison
                   hubs   bridges   routers   switches
traffic isolation  no     yes       yes       yes
plug & play        yes    yes       no        yes
optimal routing    no     no        yes       no
cut through        yes    no        no        yes

IEEE 802.11 Wireless LAN
 802.11b
 2.4-2.5 GHz unlicensed radio
spectrum
 up to 11 Mbps
 direct sequence spread
spectrum (DSSS) in physical
layer
 all hosts use same
chipping code
 widely deployed, using base
stations
 802.11a
 5-6 GHz range
 up to 54 Mbps
 802.11g
 2.4-2.5 GHz range
 up to 54 Mbps
 All use CSMA/CA for
multiple access
 All have base-station
and ad-hoc network
versions

Base station approach
 Wireless host communicates with a base station
 base station = access point (AP)
 Basic Service Set (BSS) (a.k.a. “cell”) contains:
 wireless hosts
 access point (AP): base station
 BSS’s combined to form distribution system (DS)

Ad Hoc Network approach
 No AP (i.e., base station)
 wireless hosts communicate with each other
 to get packet from wireless host A to B may need to route
through wireless hosts X,Y,Z
 Applications:
 “laptop” meeting in conference room, car
 interconnection of “personal” devices
 battlefield
 IETF MANET
(Mobile Ad hoc Networks)
working group

IEEE 802.11: multiple access
 Collision if 2 or more nodes transmit at same time
 CSMA makes sense:
 get all the bandwidth if you’re the only one transmitting
 shouldn’t cause a collision if you sense another transmission
 Collision detection doesn’t work: hidden terminal
problem

IEEE 802.11 MAC Protocol: CSMA/CA
802.11 CSMA: sender
- if sense channel idle for DIFS sec.
then transmit entire frame (no
collision detection)
-if sense channel busy
then binary backoff
802.11 CSMA receiver
- if received OK
return ACK after SIFS
(ACK is needed due to hidden
terminal problem)
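The sender rule above can be sketched as follows. This is a rough illustration, not the 802.11 state machine: the function names are hypothetical, and the contention-window bounds `CW_MIN` and `CW_MAX` are assumed values chosen for the example.

```python
import random

CW_MIN = 15    # assumed initial contention window, in slots
CW_MAX = 1023  # assumed upper bound on the contention window


def contention_window(retries):
    """Window doubles after each failed attempt (binary backoff), capped at CW_MAX."""
    return min((CW_MIN + 1) * (2 ** retries) - 1, CW_MAX)


def sender_action(channel_idle_for_difs, retries, rng=random):
    """Return ('transmit', 0) or ('backoff', number_of_slots_to_wait)."""
    if channel_idle_for_difs:
        return ("transmit", 0)       # send the entire frame, no collision detection
    slots = rng.randint(0, contention_window(retries))
    return ("backoff", slots)        # channel busy: wait a random number of slots
```

After a successful transmission the sender waits for the receiver's ACK (sent after SIFS); a missing ACK would count as one more retry in this sketch.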

Collision avoidance mechanisms
 Problem:
 two nodes, hidden from each other, transmit complete frames
to base station
 wasted bandwidth for long duration !
 Solution:
 small reservation packets
 nodes track reservation interval with internal “network
allocation vector” (NAV)

Collision Avoidance: RTS-CTS exchange
 sender transmits short RTS
(request to send) packet:
indicates duration of
transmission
 receiver replies with short
CTS (clear to send) packet
 notifying (possibly hidden)
nodes
 hidden nodes will not
transmit for specified
duration: NAV

Collision Avoidance: RTS-CTS exchange
 RTS and CTS short:
 collisions less likely, of shorter
duration
 end result similar to collision
detection
 IEEE 802.11 allows:
 CSMA
 CSMA/CA: reservations
 polling from AP

A word about Bluetooth
 Low-power, small radius,
wireless networking
technology
 10-100 meters
 omnidirectional
 not line-of-sight infrared
 Interconnects gadgets
 2.4-2.5 GHz unlicensed
radio band
 up to 721 kbps
 Interference from wireless
LANs, digital cordless
phones, microwave ovens:
 frequency hopping helps
 MAC protocol supports:
 error correction
 ARQ
 Each node has a 12-bit
address

Point to Point Data Link Control
 one sender, one receiver, one link: easier than
broadcast link:
 no Media Access Control
 no need for explicit MAC addressing
 e.g., dialup link, ISDN line
 popular point-to-point DLC protocols:
 PPP (point-to-point protocol)
 HDLC: High level data link control (Data link used to be
considered “high layer” in protocol stack!)

PPP Design Requirements [RFC 1547]
 packet framing: encapsulation of network-layer
datagram in data link frame
 carry network layer data of any network layer protocol
(not just IP) at same time
 ability to demultiplex upwards
 bit transparency: must carry any bit pattern in the
data field
 error detection (no correction)
 connection liveness: detect, signal link failure to
network layer
 network layer address negotiation: endpoint can
learn/configure each other’s network address

PPP non-requirements
 no error correction/recovery
 no flow control
 out of order delivery OK
 no need to support multipoint links (e.g., polling)
Error recovery, flow control, data re-ordering
all relegated to higher layers!

PPP Data Frame
 Flag: delimiter (framing)
 Address: does nothing (only one option)
 Control: does nothing; in the future possible multiple
control fields
 Protocol: upper layer protocol to which frame delivered (eg,
PPP-LCP, IP, IPCP, etc)

PPP Data Frame
 info: upper layer data being carried
 check: cyclic redundancy check for error detection

Byte Stuffing
 “data transparency” requirement: data field must be
allowed to include flag pattern <01111110>
 Q: is received <01111110> data or flag?
 Sender: adds (“stuffs”) extra < 01111110> byte after
each < 01111110> data byte
 Receiver:
 two 01111110 bytes in a row: discard first byte, continue
data reception
 single 01111110: flag byte
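A minimal sketch of the sender and receiver rules exactly as stated on this slide. The function names are hypothetical. Note that PPP as standardized (RFC 1662) escapes with a separate control-escape byte (0x7D, then the data byte XOR 0x20) rather than doubling the flag; the code below follows the simplified slide version.

```python
FLAG = 0x7E  # bit pattern 01111110


def stuff(data: bytes) -> bytes:
    """Sender: add an extra flag byte after each flag-valued data byte."""
    out = bytearray()
    for b in data:
        out.append(b)
        if b == FLAG:
            out.append(FLAG)
    return bytes(out)


def unstuff(stuffed: bytes) -> bytes:
    """Receiver: two flag bytes in a row mean one data byte; a lone flag is a delimiter."""
    out = bytearray()
    i = 0
    while i < len(stuffed):
        if stuffed[i] == FLAG:
            if i + 1 < len(stuffed) and stuffed[i + 1] == FLAG:
                i += 1  # discard the first byte of the pair, keep the second
            else:
                raise ValueError("single flag byte: frame delimiter, not data")
        out.append(stuffed[i])
        i += 1
    return bytes(out)


payload = bytes([0x01, FLAG, 0x02])
on_wire = stuff(payload)   # 01 7E 7E 02
```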

Byte Stuffing
[Figure: flag byte pattern in data to send; flag byte pattern plus stuffed byte in transmitted data]

PPP Data Control Protocol
Before exchanging network-layer
data, data link peers
must
 configure PPP link (max.
frame length, authentication)
 learn/configure network
layer information
 for IP: carry IP Control
Protocol (IPCP) msgs (protocol
field: 8021) to configure/learn
IP address

Asynchronous Transfer Mode: ATM
 1990’s/00 standard for high-speed (155Mbps to
622 Mbps and higher) Broadband Integrated Service
Digital Network architecture
 Goal: integrated, end-to-end transport of voice,
video, data
 meeting timing/QoS requirements of voice, video (versus
Internet best-effort model)
 “next generation” telephony: technical roots in telephone
world
 packet-switching (fixed length packets, called “cells”)
using virtual circuits

ATM architecture
 adaptation layer: only at edge of ATM network
 data segmentation/reassembly
 roughly analogous to Internet transport layer
 ATM layer: “network” layer
 cell switching, routing
 physical layer

ATM: network or link layer?
Vision: end-to-end
transport: “ATM from
desktop to desktop”
 ATM is a network
technology
Reality: used to connect IP
backbone routers
 “IP over ATM”
 ATM as switched link
layer, connecting IP
routers

ATM Adaptation Layer (AAL)
 ATM Adaptation Layer (AAL): “adapts” upper layers (IP
or native ATM applications) to ATM layer below
 AAL present only in end systems, not in switches
 AAL layer segment (header/trailer fields, data) fragmented
across multiple ATM cells
 analogy: TCP segment in many IP packets

ATM Adaptation Layer (AAL) [more]
User data
Different versions of AAL layers, depending on ATM service
class:
 AAL1: for CBR (Constant Bit Rate) services, e.g. circuit emulation
 AAL2: for VBR (Variable Bit Rate) services, e.g., MPEG video
 AAL5: for data (eg, IP datagrams)
AAL PDU
ATM cell

AAL5 - Simple And Efficient AL (SEAL)
 AAL5: low overhead AAL used to carry IP
datagrams
 4 byte cyclic redundancy check
 PAD ensures payload multiple of 48 bytes
 large AAL5 data unit to be fragmented into 48-byte ATM cells
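The padding and fragmentation arithmetic can be worked through in a short sketch. It assumes an 8-byte AAL5 trailer (of which 4 bytes are the CRC mentioned above); the function name `aal5_cells` is hypothetical.

```python
CELL_PAYLOAD = 48   # bytes of payload per ATM cell
CELL_HEADER = 5     # bytes of ATM cell header
AAL5_TRAILER = 8    # assumed trailer size, including the 4-byte CRC


def aal5_cells(datagram_len):
    """Return (number_of_cells, pad_bytes, bytes_on_wire) for one datagram."""
    unpadded = datagram_len + AAL5_TRAILER
    cells = -(-unpadded // CELL_PAYLOAD)          # ceiling division
    pad = cells * CELL_PAYLOAD - unpadded         # PAD fills the last cell
    return cells, pad, cells * (CELL_PAYLOAD + CELL_HEADER)


cells, pad, wire = aal5_cells(1500)   # a 1500-byte IP datagram
```

Under these assumptions a 1500-byte datagram needs 32 cells with 28 bytes of padding, so 1696 bytes cross the link.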

ATM Layer
Service: transport cells across ATM network
 analogous to IP network layer
 very different services than IP network layer

Network       Service      Guarantees?
Architecture  Model        Bandwidth           Loss  Order  Timing  Congestion feedback
Internet      best effort  none                no    no     no      no (inferred via loss)
ATM           CBR          constant rate       yes   yes    yes     no congestion
ATM           VBR          guaranteed rate     yes   yes    yes     no congestion
ATM           ABR          guaranteed minimum  no    yes    no      yes
ATM           UBR          none                no    yes    no      no

ATM Layer: Virtual Circuits
 VC transport: cells carried on VC from source to dest
 call setup, teardown for each call before data can flow
 each packet carries VC identifier (not destination ID)
 every switch on source-dest path maintain “state” for each passing
connection
 link,switch resources (bandwidth, buffers) may be allocated to VC: to
get circuit-like perf.
 Permanent VCs (PVCs)
 long lasting connections
 typically: “permanent” route between two IP routers
 Switched VCs (SVC):
 dynamically set up on per-call basis

ATM VCs
 Advantages of ATM VC approach:
 QoS performance guarantee for connection mapped to
VC (bandwidth, delay, delay jitter)
 Drawbacks of ATM VC approach:
 Inefficient support of datagram traffic
 one PVC between each source/dest pair does not scale
(N**2 connections needed)
 SVC introduces call setup latency, processing overhead
for short lived connections

ATM Layer: ATM cell
 5-byte ATM cell header
 48-byte payload
 Why?: small payload -> short cell-creation delay for
digitized voice
 halfway between 32 and 64 (compromise!)
Cell header
Cell format
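The cell-creation delay argument can be checked with a small calculation. The 64 kbps voice rate is an assumption (standard PCM telephony); the function name is hypothetical.

```python
def cell_creation_delay_ms(payload_bytes, voice_rate_bps=64_000):
    """Time needed to collect enough digitized voice to fill one cell payload."""
    return payload_bytes * 8 * 1000 / voice_rate_bps


delays = {size: cell_creation_delay_ms(size) for size in (32, 48, 64)}
# Under the 64 kbps assumption: 32 bytes -> 4 ms, 48 bytes -> 6 ms, 64 bytes -> 8 ms
```

A larger payload would amortize the 5-byte header better but add delay to every voice sample, which is the trade-off behind the 48-byte compromise.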

ATM cell header
 VCI: virtual channel ID
 will change from link to link thru net
 PT: Payload type (e.g. RM cell versus data cell)
 CLP: Cell Loss Priority bit
 CLP = 1 implies low priority cell, can be discarded if
congestion
 HEC: Header Error Checksum
 cyclic redundancy check

ATM Physical Layer (more)
Two pieces (sublayers) of physical layer:
 Transmission Convergence Sublayer (TCS): adapts ATM
layer above to PMD sublayer below
 Physical Medium Dependent: depends on physical
medium being used
TCS Functions:
 Header checksum generation: 8 bits CRC
 Cell delineation
 With “unstructured” PMD sublayer, transmission of idle
cells when no data cells to send

ATM Physical Layer
Physical Medium Dependent (PMD) sublayer
 SONET/SDH: transmission frame structure (like a
container carrying bits);
 bit synchronization;
 bandwidth partitions (TDM);
 several speeds: OC3 = 155.52 Mbps; OC12 = 622.08 Mbps; OC48 =
2.49 Gbps, OC192 = 9.95 Gbps
 T1/T3: transmission frame structure (old telephone
hierarchy): 1.5 Mbps/ 45 Mbps
 unstructured: just cells (busy/idle)

IP-Over-ATM
Classic IP only
 3 “networks” (e.g., LAN segments)
 MAC (802.3) and IP addresses
IP over ATM
 replace “network” (e.g.,
LAN segment) with
ATM network
 ATM addresses, IP
addresses
[Figure: Ethernet LANs interconnected through an ATM network]

IP-Over-ATM
Issues:
 IP datagrams into
ATM AAL5 PDUs
 from IP addresses to
ATM addresses
 just like IP
addresses to 802.3
MAC addresses!
[Figure: Ethernet LANs attached to an ATM network]

Datagram Journey in IP-over-ATM Network
 at Source Host:
 IP layer maps between IP, ATM dest address (using ARP)
 passes datagram to AAL5
 AAL5 encapsulates data, segments it into cells, passes them to ATM layer
 ATM network: moves cell along VC to destination
 at Destination Host:
 AAL5 reassembles cells into original datagram
 if CRC OK, datagram is passed to IP

Frame Relay
Like ATM:
 wide area network technologies
 Virtual-circuit oriented
 origins in telephony world
 can be used to carry IP datagrams
 can thus be viewed as link layers by IP protocol

Frame Relay
 Designed in late ‘80s, widely deployed in the ‘90s
 Frame relay service:
 no error control
 end-to-end congestion control

Frame Relay (more)
 Designed to interconnect corporate customer LANs
 typically permanent VC’s: “pipe” carrying aggregate traffic
between two routers
 switched VC’s: as in ATM
 corporate customer leases FR service from public
Frame Relay network (eg, Sprint, ATT)

Frame Relay (more)
Frame format: flags | address | data | CRC | flags
 Flag bits, 01111110, delimit frame
 address:
 10 bit VC ID field
 3 congestion control bits
FECN: forward explicit congestion notification
(frame experienced congestion on path)
BECN: congestion on reverse path
DE: discard eligibility

Frame Relay -VC Rate Control
 Committed Information Rate (CIR)
 defined, “guaranteed” for each VC
 negotiated at VC set up time
 customer pays based on CIR
 DE bit: Discard Eligibility bit
 Edge FR switch measures traffic rate for each VC; marks DE
bit
 DE = 0: high priority, rate compliant frame; deliver at “all
costs”
 DE = 1: low priority, eligible for congestion discard

Frame Relay - CIR & Frame Marking
 Access Rate: rate R of the access link between source
router (customer) and edge FR switch (provider);
64Kbps < R < 1,544Kbps
 Typically, many VCs (one per destination router)
multiplexed on the same access trunk; each VC has own
CIR
 Edge FR switch measures traffic rate for each VC; it
marks (ie DE = 1) frames which exceed CIR (these may
be later dropped)
 Internet’s more recent differentiated service uses similar
ideas
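The marking rule at the edge switch can be sketched as follows. This is a simplified illustration: it assumes the switch measures each VC over a fixed interval and marks every frame that pushes the VC past CIR times that interval. The function name and the interval length are hypothetical, not taken from the Frame Relay specification.

```python
def mark_de_bits(frame_sizes_bits, cir_bps, interval_s):
    """Return the DE bit (0 or 1) for each frame sent on one VC in one interval."""
    committed_bits = cir_bps * interval_s   # traffic allowed at the committed rate
    sent = 0
    de_bits = []
    for size in frame_sizes_bits:
        sent += size
        # within CIR: DE = 0 (deliver at "all costs"); beyond CIR: DE = 1 (may be dropped)
        de_bits.append(0 if sent <= committed_bits else 1)
    return de_bits


# Three 4000-bit frames on a VC with CIR = 64 kbps, measured over 1/8 second
marks = mark_de_bits([4000, 4000, 4000], cir_bps=64_000, interval_s=0.125)
```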

Summary
 principles behind data link layer services:
 error detection, correction
 sharing a broadcast channel: multiple access
 link layer addressing, ARP
 link layer technologies: Ethernet, hubs,
bridges, switches,IEEE 802.11 LANs, PPP,
ATM, Frame Relay
 journey down the protocol stack now OVER!
 next stops: multimedia, security, network
management

Network Layer
Goals:
 understand principles
behind network layer
services:
 routing (path selection)
 dealing with scale
 how a router works
 advanced topics: IPv6,
multicast
 instantiation and
implementation in the
Internet
Overview:
 network layer services
 routing principle: path
selection
 hierarchical routing
 IP
 Internet routing protocols
 intra-domain
 inter-domain
 what’s inside a router?
 IPv6
 multicast routing

Network layer functions
 transport packet from sending to
receiving hosts
 network layer protocols in every
host, router
three important functions:
 path determination: route taken
by packets from source to dest.
Routing algorithms
 switching: move packets from
router’s input to appropriate
router output
 call setup: some network
architectures require router call
setup along path before data flows
[Figure: protocol stacks along the path; the network, data link, and physical layers appear in every host and router, while the application and transport layers appear only in the two end hosts]

Network service model
Q: What service model for
“channel” transporting
packets from sender to
receiver?
 guaranteed bandwidth?
 preservation of inter-packet
timing (no jitter)?
 loss-free delivery?
 in-order delivery?
 congestion feedback to
sender?
The most important
abstraction provided
by network layer:
virtual circuit or datagram?

Virtual circuits
“source-to-dest path behaves much like telephone circuit”
 performance-wise
 network actions along source-to-dest path
 call setup, teardown for each call before data can flow
 each packet carries VC identifier (not destination host ID)
 every router on source-dest path maintains “state” for each passing
connection
 (in contrast, transport-layer connection only involved two end systems)
 link, router resources (bandwidth, buffers) may be allocated to VC
 to get circuit-like performance

Virtual circuits: signaling protocols
 used to set up, maintain, and tear down VC
 used in ATM, frame-relay, X.25
 not used in today’s Internet
[Figure: VC call setup between two end hosts, each with application, transport, network, data link, and physical layers]
1. Initiate call 2. Incoming call
3. Accept call 4. Call connected
5. Data flow begins 6. Receive data

Datagram networks: the Internet model
 no call setup at network layer
 routers: no state about end-to-end connections
 no network-level concept of “connection”
 packets typically routed using destination host ID
 packets between same source-dest pair may take
different paths
[Figure: datagram exchange between two end hosts, each with application, transport, network, data link, and physical layers]
1. Send data 2. Receive data

Network layer service models:
Network       Service      Guarantees?
Architecture  Model        Bandwidth           Loss  Order  Timing  Congestion feedback
Internet      best effort  none                no    no     no      no (inferred via loss)
ATM           CBR          constant rate       yes   yes    yes     no congestion
ATM           VBR          guaranteed rate     yes   yes    yes     no congestion
ATM           ABR          guaranteed minimum  no    yes    no      yes
ATM           UBR          none                no    yes    no      no
• Internet model being extended: Intserv, Diffserv (Chapter 6)

Datagram or VC network: why?
Internet
 data exchange among computers
 “elastic” service, no strict timing
req.
 “smart” end systems (computers)
 can adapt, perform control, error
recovery
 simple inside network, complexity
at “edge”
 easier to connect many link types
 different characteristics
 uniform service difficult
ATM
 evolved from telephony
 human conversation:
 strict timing, reliability
requirements
 need for guaranteed
service
 “dumb” end systems
 telephones
 complexity inside
network

Routing
Routing protocol
Goal: determine “good” path
(sequence of routers) thru
network from source to dest.
Graph abstraction for
routing algorithms:
 graph nodes are routers
 graph edges are
physical links
 link cost: delay, $ cost,
or congestion level
[Figure: example graph with nodes A-F and link costs on the edges]
• “good” path:
– typically means minimum
cost path
– other definitions possible

Routing Algorithm classification
Global or decentralized
information?
Global:
 all routers have complete
topology, link cost info
 “link state” algorithms
Decentralized:
 router knows physically-connected
neighbors, link
costs to neighbors
 iterative process of
computation, exchange of info
with neighbors
 “distance vector” algorithms
Static or dynamic?
Static:
 routes change slowly
over time (usually by
humans)
Dynamic:
 routes change more
quickly/automatically
 periodic update
 in response to link cost
changes

A Link-State Routing Algorithm
Dijkstra’s algorithm
 net topology, link costs known
to all nodes
 accomplished via “link state
broadcast”
 all nodes have same info
 computes least cost paths from
one node (“source”) to all other
nodes
 gives routing table for that
node
 iterative: after k iterations,
know least cost path to k
destinations
Notation:
 c(i,j): link cost from node i
to j. cost infinite if not
direct neighbors
 D(v): current value of cost
of path from source to
dest. v
 p(v): predecessor node
along path from source to
v, that is, the node next to v on that path
 N: set of nodes whose least
cost path definitively
known

Dijkstra’s Algorithm
Initialization:
  N = {A}
  for all nodes v
    if v adjacent to A
      then D(v) = c(A,v)
      else D(v) = infty

Loop
  find w not in N such that D(w) is a minimum
  add w to N
  update D(v) for all v adjacent to w and not in N:
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N
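A runnable sketch of the pseudocode above in Python. Two assumptions: the edge list in `GRAPH` is a reconstruction chosen to agree with the worked example that follows (the figure itself did not survive extraction), and a heap replaces the linear scan for the minimum D(w), which is one of the "more efficient implementations" rather than the slide's O(n**2) loop.

```python
import heapq

# Assumed link costs for the six-node example graph (undirected)
EDGES = [("A", "B", 2), ("A", "C", 5), ("A", "D", 1), ("B", "C", 3), ("B", "D", 2),
         ("C", "D", 3), ("C", "E", 1), ("C", "F", 5), ("D", "E", 1), ("E", "F", 2)]

GRAPH = {}
for u, v, cost in EDGES:
    GRAPH.setdefault(u, {})[v] = cost
    GRAPH.setdefault(v, {})[u] = cost


def dijkstra(graph, source):
    """Return (D, p): least path cost and predecessor for every reachable node."""
    D = {source: 0}
    p = {}
    N = set()                       # nodes whose least-cost path is final
    heap = [(0, source)]
    while heap:
        d, w = heapq.heappop(heap)
        if w in N:
            continue                # stale heap entry
        N.add(w)
        for v, cost in graph[w].items():
            if v not in N and d + cost < D.get(v, float("inf")):
                D[v] = d + cost     # D(v) = min(D(v), D(w) + c(w,v))
                p[v] = w
                heapq.heappush(heap, (D[v], v))
    return D, p


D, p = dijkstra(GRAPH, "A")
```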

Dijkstra’s algorithm: example
Step  N        D(B),p(B)  D(C),p(C)  D(D),p(D)  D(E),p(E)  D(F),p(F)
0     A        2,A        5,A        1,A        infinity   infinity
1     AD       2,A        4,D                   2,D        infinity
2     ADE      2,A        3,E                              4,E
3     ADEB                3,E                              4,E
4     ADEBC                                                4,E
5     ADEBCF
[Figure: example graph with nodes A-F and link costs on the edges]

Dijkstra’s algorithm, discussion
Algorithm complexity: n nodes
 each iteration: need to check all nodes, w, not in N
 n*(n+1)/2 comparisons: O(n**2)
 more efficient implementations possible: O(nlogn)
Oscillations possible:
 e.g., Suppose link cost = amount of carried traffic (note: c(i,j) !=
c(j,i))
[Figure: four nodes A, B, C, D with traffic-dependent link costs (values 0, e, 1, 1+e, 2+e), shown initially and after each of three successive recomputations of routing]

Networks: Routing
Internetwork Routing [Halsall]
Adaptive Routing
  Centralized [RCC]
  Distributed
    Intradomain routing [IGP] (Interior Gateway Protocols)
      Distance Vector routing [RIP]
      Link State routing [OSPF,IS-IS,PNNI]
    Interdomain routing [EGP] (Exterior Gateway Protocols)
      [BGP,IDRP]

Distance Vector Routing
 Historically known as the old ARPANET routing
algorithm {or known as Bellman-Ford algorithm}.
Basic idea: each network node maintains a Distance
Vector table containing the distance between itself
and ALL possible destination nodes.
 Distances are based on a chosen metric and are
computed using information from the neighbors’
distance vectors.
Metric: usually hops or delay

Information kept by DV router
1. each router has an ID
2. associated with each link connected to a router,
there is a link cost (static or dynamic): the metric
issue!
Distance Vector Table Initialization
Distance to itself = 0
Distance to ALL other routers = infinity

Distance Vector Algorithm [Perlman]
1. Router transmits its distance vector to each of
its neighbors.
2. Each router receives and saves the most recently
received distance vector from each of its
neighbors.
3. A router recalculates its distance vector when:
a. It receives a distance vector from a neighbor
containing different information than before.
b. It discovers that a link to a neighbor has gone down
(i.e., a topology change).
The DV calculation is based on minimizing the
cost to each destination.
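The recalculation in step 3 can be sketched as a single function. The names are hypothetical, and the sketch assumes each neighbor's saved vector includes its distance to itself (0), as in the initialization described earlier.

```python
INF = float("inf")


def recompute_vector(self_id, link_costs, neighbor_vectors):
    """Return {dest: (cost, next_hop)} from link costs and saved neighbor vectors.

    link_costs:       {neighbor: cost of the direct link}
    neighbor_vectors: {neighbor: {dest: neighbor's advertised distance}}
    """
    dests = {self_id} | set(link_costs)
    for vector in neighbor_vectors.values():
        dests |= set(vector)

    table = {}
    for dest in dests:
        if dest == self_id:
            table[dest] = (0, None)
            continue
        best = (INF, None)
        for nbr, cost in link_costs.items():
            via = cost + neighbor_vectors.get(nbr, {}).get(dest, INF)
            if via < best[0]:        # minimize cost to each destination
                best = (via, nbr)
        table[dest] = best
    return table


table = recompute_vector(
    "A",
    link_costs={"B": 1, "C": 5},
    neighbor_vectors={"B": {"A": 1, "B": 0, "C": 2}, "C": {"A": 5, "B": 2, "C": 0}},
)
```

If the link to a neighbor goes down, removing that neighbor from `link_costs` and calling the function again corresponds to case 3b.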

Figure 5-9.(a) A subnet. (b) Input from A, I, H, K, and
the new routing table for J.

Routing Information Protocol (RIP)
 RIP had widespread use because it was
distributed with BSD Unix in “routed”, a
router management daemon.
 RIP is the most used Distance Vector
protocol.
 RFC1058 in June 1988.
 Sends packets every 30 seconds or faster.
 Runs over UDP.
 Metric = hop count
 BIG problem is max. hop count = 15 (a metric of 16 means unreachable)
 RIP limited to running on small networks!!
 Upgraded to RIPv2

Link State Algorithm
1. Each router is responsible for meeting its
neighbors and learning their names.
2. Each router constructs a link state packet (LSP)
which consists of a list of names and cost to reach
each of its neighbors.
3. The LSP is transmitted to ALL other routers.
Each router stores the most recently generated
LSP from each other router.
4. Each router uses complete information on the
network topology to compute the shortest path
route to each destination node.

Open Shortest Path First (OSPF)
 OSPF runs on top of IP, i.e., an OSPF packet is
transmitted with IP data packet header.
 Uses Level 1 and Level 2 routers
 Has: backbone routers, area border routers,
and AS boundary routers
 LSPs referred to as LSAs (Link State
Advertisements)
 Complex algorithm due to five distinct LSA
types.

Figure 8.33 OSPF Areas (R = router, N = network)
[Figure: routers R1-R8 and networks N1-N7 grouped into Area 0.0.0.0, Area 0.0.0.1, Area 0.0.0.2, and Area 0.0.0.3, with a link to another AS]
Copyright ©2000 The McGraw Hill Companies Leon-Garcia & Widjaja: Communication Networks

OSPF
Figure 5-65.The relation between ASes,
backbones, and areas in OSPF.

Border Gateway Protocol (BGP)
 The replacement for EGP is BGP. Current version
is BGP-4.
 BGP assumes the Internet is an arbitrary
interconnected set of AS’s.
 In interdomain routing the goal is to find ANY
path to the intended destination that is loop-free.
The protocols are more concerned with
reachability than optimality.

Transport services and protocols
 provide logical
communication between
app processes running on
different hosts
 transport protocols run in
end systems
 send side: breaks app
messages into segments,
passes to network layer
 rcv side: reassembles
segments into messages,
passes to app layer
 more than one transport
protocol available to apps
 Internet: TCP and UDP
[Figure: logical end-to-end transport between two end hosts (application, transport, network, data link, physical); the routers in between implement only the network, data link, and physical layers]

Transport vs. network layer
 network layer: logical
communication
between hosts
 transport layer:
logical
communication
between processes
 relies on, enhances,
network layer services
Household analogy:
12 kids sending letters
to 12 kids
 processes = kids
 app messages =
letters in envelopes
 hosts = houses
 transport protocol =
Ann and Bill
 network-layer
protocol = postal
service
Another analogy:
1. Post office -> network layer
2. My wife -> transport layer

Internet transport-layer protocols
 reliable, in-order
delivery (TCP)
 congestion control
(distributed control)
 flow control
 connection setup
 unreliable, unordered
delivery: UDP
 no-frills extension of
“best-effort” IP
 services not available:
 delay guarantees
 bandwidth guarantees
[Figure: logical end-to-end transport between two end hosts (application, transport, network, data link, physical); the routers in between implement only the network, data link, and physical layers]
Research issues

Multiplexing/demultiplexing
Demultiplexing at rcv host:
delivering received segments to correct socket
Multiplexing at send host:
gathering data from multiple sockets, enveloping data with
header (later used for demultiplexing)
[Figure: host 1, host 2, and host 3, each with application, transport, network, link, and physical layers; processes P1-P4 attach to sockets at the application/transport boundary; example applications: FTP, telnet]

How demultiplexing works
 host receives IP datagrams
 each datagram has source IP
address, destination IP address
 each datagram carries 1 transport-layer
segment
 each segment has source,
destination port number
(recall: well-known port numbers
for specific applications)
 host uses IP addresses & port numbers
to direct segment to appropriate socket
TCP/UDP segment format (32 bits wide):
source port # | dest port #
other header fields
application data (message)

Connectionless demultiplexing
 Create sockets with port
numbers:
DatagramSocket mySocket1 = new
DatagramSocket(12534);
DatagramSocket mySocket2 = new
DatagramSocket(12535);
 UDP socket identified by
two-tuple:
(dest IP address, dest port number)
 When host receives UDP
segment:
 checks destination port
number in segment
 directs UDP segment to
socket with that port
number
 IP datagrams with
different source IP
addresses and/or source
port numbers directed to
same socket (this is how a
system can serve multiple
requests!!)
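A minimal sketch of connectionless demultiplexing on one host, written in Python to match the other sketches in these notes. The class `UdpDemux` and its methods are hypothetical; the destination IP of the two-tuple is implicit because the sketch models a single host. Port numbers are 16 bits, so valid values are 0 to 65535.

```python
class UdpDemux:
    """Directs each arriving UDP segment to the socket bound to its destination port."""

    def __init__(self):
        self.sockets = {}   # destination port -> receive queue (stands in for a socket)

    def bind(self, port):
        if not 0 <= port <= 65535:
            raise ValueError("port numbers are 16 bits: 0..65535")
        self.sockets[port] = []
        return self.sockets[port]

    def deliver(self, src_ip, src_port, dst_port, payload):
        """Only the destination port selects the socket; the source is just recorded."""
        queue = self.sockets.get(dst_port)
        if queue is None:
            return False                      # no socket bound: segment dropped
        queue.append(((src_ip, src_port), payload))
        return True


host = UdpDemux()
inbox = host.bind(6428)
host.deliver("A", 9157, 6428, b"request 1")   # different sources...
host.deliver("B", 5775, 6428, b"request 2")   # ...same socket
```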

Connectionless demux (cont)
DatagramSocket serverSocket = new DatagramSocket(6428);
[Figure: clients at IP A and IP B exchange UDP segments with a server at IP C listening on port 6428; segment pairs shown: (SP 9157, DP 6428) answered by (SP 6428, DP 9157), and (SP 5775, DP 6428) answered by (SP 6428, DP 5775)]
Based on destination IP and port #
SP provides “return address”
Source IP and port # can be spoofed !!!!

Connection-oriented demux
 TCP socket identified
by 4-tuple:
 source IP address
 source port number
 dest IP address
 dest port number
 recv host uses all four
values to direct
segment to
appropriate socket
 Server host may
support many
simultaneous TCP
sockets:
 each socket identified by
its own 4-tuple
 Web servers have
different sockets for
each connecting client
 non-persistent HTTP will
have different socket for
each request
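For contrast with the UDP case, here is a sketch of demultiplexing on the full 4-tuple. The class `TcpDemux` is hypothetical and heavily simplified: it creates a per-connection socket on the first segment addressed to a listening port and ignores the handshake entirely.

```python
class TcpDemux:
    """Directs each segment to the socket identified by its 4-tuple."""

    def __init__(self, local_ip):
        self.local_ip = local_ip
        self.listening = set()   # ports with a welcoming (listening) socket
        self.sockets = {}        # (src IP, src port, dst IP, dst port) -> receive queue

    def listen(self, port):
        self.listening.add(port)

    def deliver(self, src_ip, src_port, dst_port, payload):
        key = (src_ip, src_port, self.local_ip, dst_port)
        if key not in self.sockets:
            if dst_port not in self.listening:
                return None                   # nobody listening on that port
            self.sockets[key] = []            # new socket for this client
        self.sockets[key].append(payload)
        return key


server = TcpDemux("C")
server.listen(80)
k1 = server.deliver("A", 9157, 80, b"GET /a")
k2 = server.deliver("B", 5775, 80, b"GET /b")   # same dest port, different socket
k3 = server.deliver("A", 9157, 80, b"more")     # same 4-tuple, same socket as k1
```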

Connection-oriented demux (cont)
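[Figure: clients at IP A and IP B connect to a server at IP C on port 80; segment pairs shown: (SP 9157, DP 80) answered by (SP 80, DP 9157), and (SP 5775, DP 80) answered by (SP 80, DP 5775); the server directs each segment to its own process/socket using the 4-tuple (S-IP, SP#, D-IP, DP#)]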

UDP: User Datagram Protocol [RFC 768]
 “no frills,” “bare bones”
Internet transport
protocol
 “best effort” service, UDP
segments may be:
 lost
 delivered out of order
to app
 connectionless:
 no handshaking
between UDP sender,
receiver
 each UDP segment
handled independently
of others
Why is there a UDP?
 no connection
establishment (which can
add delay)
 simple: no connection
state at sender, receiver
 small segment header
 no congestion control:
UDP can blast away as
fast as desired

UDP: more
 often used for streaming
multimedia apps
 loss tolerant
 rate sensitive
 other UDP uses
 DNS
 SNMP
 reliable transfer over
UDP: add reliability at
application layer
 application-specific
error recovery! (e.g,
FTP based on UDP but
with recovery)
UDP segment format (32 bits wide):
source port # | dest port #
length | checksum
application data (message)
Length: in bytes of UDP segment, including header
When the network is
stressed, you PRAY!

UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
 treat segment contents
as sequence of 16-bit
integers
 checksum: 1’s complement of the
1’s complement sum of
segment contents
 sender puts checksum
value into UDP
checksum field
Receiver:
 compute checksum of
received segment
 check if computed checksum
equals checksum field value:
 NO - error detected
 YES - no error detected.
But maybe errors
nonetheless? More later
….
e.g: 1+2+3 = 6. So is 0+3+3=6

Internet Checksum Example
 Note
 When adding numbers, a carryout from the
most significant bit needs to be added to
the result
 Example: add two 16-bit integers
               1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
               1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound   1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum            1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum       0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
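The wraparound addition and the final complement can be reproduced in a few lines. The function names are hypothetical; the two 16-bit words are the ones from the example above.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, folding any carry out of bit 15 back into the low bit."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)   # wraparound
    return total


def internet_checksum(words):
    """Checksum = 1's complement of the 1's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


def looks_valid(words, checksum):
    """Receiver check: sum of all words plus the checksum must be all ones."""
    return ones_complement_sum16(list(words) + [checksum]) == 0xFFFF


a = int("1110011001100110", 2)
b = int("1101010101010101", 2)
total = ones_complement_sum16([a, b])
checksum = internet_checksum([a, b])
```

As the slide warns, a passing check does not prove the segment is error-free: two errors that cancel in the sum go undetected.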

Principles of Reliable data transfer
 important in app., transport, link layers
 top-10 list of important networking topics!
abstraction
 characteristics of unreliable channel will determine complexity of reliable data
transfer protocol (rdt)!!!!!!!!
This picture sets the scenario

Reliable data transfer: getting started
rdt_send(): called from above, (e.g., by app.). Passed data to
deliver to receiver upper layer
udt_send(): called by rdt, to transfer packet over
unreliable channel to receiver
rdt_rcv(): called when packet arrives on rcv-side of channel
deliver_data(): called by rdt to deliver data to upper
[Figure: send side and receive side of the reliable data transfer protocol, connected by the unreliable channel]
** Let us now look at the guts of these modules. Any questions?

(DON’T FALL ASLEEP!!!)
Reliable data transfer: getting started
We’ll:
 incrementally develop sender, receiver
sides of reliable data transfer protocol
(rdt)
 consider only unidirectional data transfer
 but control info will flow on both directions!
 use finite state machines (FSM) to specify
sender, receiver
state: when in this “state” next state uniquely determined by next event
[Figure: FSM with state 1 and state 2; each transition is labeled with the event causing the state transition and the actions taken on the state transition]
Event: timer, receives message, …etc.
Action: executes a program, send message, …etc.

Rdt1.0: reliable transfer over a reliable channel
 underlying channel perfectly reliable
 no bit errors
 no loss of packets
In reality, this is an unrealistic assumption, but..
 separate FSMs for sender, receiver:
 sender sends data into underlying channel
 receiver reads data from underlying channel
sender (state: Wait for call from above)
  event: rdt_send(data)
  actions: packet = make_pkt(data); udt_send(packet)
receiver (state: Wait for call from below)
  event: rdt_rcv(packet)
  actions: extract (packet,data); deliver_data(data)

Rdt2.0: channel with bit errors
 underlying channel may flip bits in packet
 recall: UDP checksum to detect bit errors
 the question: how to recover from errors:
 acknowledgements (ACKs): receiver explicitly tells
sender that pkt received OK
 negative acknowledgements (NAKs): receiver
explicitly tells sender that pkt had errors
 sender retransmits pkt on receipt of NAK
 human scenarios using ACKs, NAKs?
 new mechanisms in rdt2.0 (beyond rdt1.0):
 error detection
 receiver feedback: control msgs (ACK,NAK) rcvr-
>sender
Ack: I love u, I love u 2.
Nak: I love u, I don’t love u

rdt2.0: FSM specification
3-
223
sender, state "Wait for call from above":
  rdt_send(data) -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK"
sender, state "Wait for ACK or NAK":
  rdt_rcv(rcvpkt) && isNAK(rcvpkt) -> udt_send(sndpkt)
  rdt_rcv(rcvpkt) && isACK(rcvpkt) -> Λ (no action); go to "Wait for call from above"
receiver, state "Wait for call from below":
  rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> udt_send(NAK)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
Note: while the sender waits, a buffer is needed to store data from the application layer, or the call must be blocked.

rdt2.0: operation with no errors
3-
224
[Figure: the rdt2.0 sender and receiver FSMs from the previous slide, tracing the error-free path: rdt_send(data), udt_send(sndpkt), receiver sees notcorrupt(rcvpkt), deliver_data(data), udt_send(ACK), sender returns to "Wait for call from above".]

rdt2.0: error scenario
3-
225
[Figure: the rdt2.0 sender and receiver FSMs from the previous slide, tracing the error path: rdt_send(data), udt_send(sndpkt), receiver sees corrupt(rcvpkt), udt_send(NAK), sender sees isNAK(rcvpkt) and calls udt_send(sndpkt) again.]
GOT IT ?

rdt2.0 has a fatal flaw!
3-
226
What happens if
ACK/NAK corrupted?
 sender doesn’t know what
happened at receiver!
 can’t just retransmit:
possible duplicate
What to do?
 sender ACKs/NAKs
receiver’s ACK/NAK? What
if sender ACK/NAK lost?
 retransmit, but this might
cause retransmission of
correctly received pkt!
Handling duplicates:
 sender adds sequence
number to each pkt
 sender retransmits
current pkt if ACK/NAK
garbled
 receiver discards (doesn’t
deliver up) duplicate pkt
stop and wait protocol
Sender sends one packet,
then waits for receiver
response

rdt2.1: sender, handles garbled ACK/NAKs
3-
227
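state "Wait for call 0 from above":
  rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 0"
state "Wait for ACK or NAK 0":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) -> udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) -> Λ (no action); go to "Wait for call 1 from above"
state "Wait for call 1 from above":
  rdt_send(data) -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 1"
state "Wait for ACK or NAK 1":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) -> udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) -> Λ (no action); go to "Wait for call 0 from above"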
THE FSM GETS MESSY!!!

rdt2.1: receiver, handles garbled ACK/NAKs
3-
228
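state "Wait for 0 from below":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) -> deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 1 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)   (duplicate: re-ACK, do not deliver)
state "Wait for 1 from below":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 0 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)   (duplicate: re-ACK, do not deliver)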

rdt2.1: discussion
3-
229
Sender:
 seq # added to pkt
 two seq. #’s (0,1) will
suffice. Why?
 must check if received
ACK/NAK corrupted
 twice as many states
 state must “remember”
whether “current” pkt
has 0 or 1 seq. #
Receiver:
 must check if received
packet is duplicate
 state indicates whether 0
or 1 is expected pkt seq #
 note: receiver can not
know if its last
ACK/NAK received OK
at sender
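The receiver-side duplicate check can be sketched in a few lines. This is a minimal illustration, not the slide's FSM verbatim: the class name `AltBitReceiver` and the `corrupt` flag (standing in for a checksum test) are assumptions made for the example. It also hints at why two sequence numbers suffice: with stop-and-wait there is only ever one unacknowledged packet, so the receiver only has to tell "new" from "repeat of the previous one".

```python
class AltBitReceiver:
    """Stop-and-wait receiver that expects sequence numbers 0, 1, 0, 1, ..."""

    def __init__(self):
        self.expected = 0      # state: which seq # we are waiting for
        self.delivered = []    # data handed to the upper layer

    def rdt_rcv(self, seq, data, corrupt=False):
        """Handle one arriving packet and return the feedback to send."""
        if corrupt:
            return "NAK"
        if seq == self.expected:
            self.delivered.append(data)   # new packet: deliver it
            self.expected ^= 1            # flip 0 <-> 1
        # A duplicate is ACKed again but never delivered twice.
        return "ACK"


rx = AltBitReceiver()
replies = [
    rx.rdt_rcv(0, "a"),
    rx.rdt_rcv(0, "a"),                 # retransmission after a garbled ACK
    rx.rdt_rcv(1, "b", corrupt=True),   # bit errors: ask for a resend
    rx.rdt_rcv(1, "b"),
]
print(replies, rx.delivered)
```

The second copy of packet 0 is acknowledged so the sender can move on, but the upper layer sees "a" only once.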

rdt2.2: a NAK-free protocol
3-
230
 same functionality as rdt2.1, using ACKs only
 instead of NAK, receiver sends ACK for last pkt
received OK
 receiver must explicitly include seq # of pkt being ACKed
 duplicate ACK at sender results in same action as
NAK: retransmit current pkt
 This is important because TCP uses this
approach (no NAKs).

rdt2.2: sender, receiver fragments
3-
231
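sender FSM fragment
state "Wait for call 0 from above":
  rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK 0"
state "Wait for ACK 0":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ) -> udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0) -> Λ (no action)
receiver FSM fragment
transition into "Wait for 0 from below":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
state "Wait for 0 from below":
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)) -> udt_send(sndpkt)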

rdt3.0: channels with errors and loss
3-
232
New assumption:
underlying channel
can also lose packets
(data or ACKs)
 checksum, seq. #, ACKs,
retransmissions will be
of help, but not enough
Q: how to deal with
loss?
 sender waits until
certain data or ACK lost,
then retransmits
 yuck: drawbacks?
Approach: sender waits
“reasonable” amount of
time for ACK
 retransmits if no ACK received
in this time
 if pkt (or ACK) just delayed (not
lost):
 retransmission will be
duplicate, but use of seq. #’s
already handles this
 receiver must specify seq # of
pkt being ACKed
 requires countdown timer
What is the “right value” for timer? It depends on the flow and network condition!

3-
233
rdt3.0 sender
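state "Wait for call 0 from above":
  rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK0"
  rdt_rcv(rcvpkt) -> Λ (no action)
state "Wait for ACK0":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ) -> Λ (no action)
  timeout -> udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0) -> stop_timer; go to "Wait for call 1 from above"
state "Wait for call 1 from above":
  rdt_send(data) -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK1"
  rdt_rcv(rcvpkt) -> Λ (no action)
state "Wait for ACK1":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) ) -> Λ (no action)
  timeout -> udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1) -> stop_timer; go to "Wait for call 0 from above"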
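To see the timer and the sequence bit working together, here is a small simulation sketch of the rdt3.0 idea. It is an illustration under assumptions, not the protocol as specified on the slide: `rdt3_transfer`, `drop_data`, and `drop_ack` are hypothetical names, losses are scripted by transmission index so the run is repeatable, and a "timeout" is modeled simply as going around the loop again.

```python
def rdt3_transfer(messages, drop_data=(), drop_ack=()):
    """Simulate stop-and-wait with timeouts; return (delivered, transmissions).

    drop_data / drop_ack hold the 0-based indices of the data transmissions
    that the channel loses, or whose ACK it loses (hypothetical loss script).
    """
    delivered = []
    expected = 0   # receiver state: seq # it is waiting for
    seq = 0        # sender state: seq # of the current packet
    tx = 0         # number of data transmissions so far
    for data in messages:
        while True:
            this_tx = tx
            tx += 1
            if this_tx in drop_data:
                continue              # data lost: timer expires, retransmit
            if seq == expected:       # receiver: new packet
                delivered.append(data)
                expected ^= 1
            # otherwise it is a duplicate: re-ACKed but not delivered
            if this_tx in drop_ack:
                continue              # ACK lost: timer expires, retransmit
            break                     # ACK arrived: stop timer
        seq ^= 1                      # move on to the other sequence number
    return delivered, tx


result = rdt3_transfer(["a", "b", "c"], drop_data={1}, drop_ack={3})
print(result)
```

In this run the second transmission (packet "b") is lost and the ACK for "c" is lost, so five transmissions are needed, yet each message is delivered exactly once.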

rdt3.0 in action
3-
234
Timer: tick,tick,…

rdt3.0 in action
3-
235
Is it
necessary
to send
Ack1
again?

Performance of rdt3.0
3-
236
 rdt3.0 works, but performance stinks
 example: 1 Gbps link, 15 ms e-e prop. delay (RTT = 30 ms), 1KB packet:
T_transmit = L (packet length in bits) / R (transmission rate, bps) = 8 kb/pkt / 10**9 b/sec = 8 microseconds
U_sender = (L / R) / (RTT + L / R) = 0.008 / 30.008 = 0.00027   (times in milliseconds)
 U sender: utilization – fraction of time sender busy sending
 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps
link
 network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation
3-
237
sender receiver
first packet bit transmitted, t = 0
last packet bit transmitted, t = L / R
RTT
first packet bit arrives
last packet bit arrives, send ACK
ACK arrives, send next
packet, t = RTT + L / R
U_sender = (L / R) / (RTT + L / R) = 0.008 / 30.008 = 0.00027   (times in milliseconds)

Pipelined protocols
238
Pipelining: sender allows multiple, “in-flight”,
yet-to-be-acknowledged pkts
 range of sequence numbers must be increased
 buffering at sender and/or receiver
 Two generic forms of pipelined protocols: go-
Back-N, selective repeat

Pipelining: increased utilization
239
first packet bit transmitted, t = 0
sender receiver
last bit transmitted, t = L / R
RTT
first packet bit arrives
last packet bit arrives, send ACK
ACK arrives, send next
packet, t = RTT + L / R
last bit of 2nd packet arrives, send ACK
last bit of 3rd packet arrives, send ACK
U_sender = (3 * L / R) / (RTT + L / R) = 0.024 / 30.008 = 0.0008   (times in milliseconds)
Increase utilization
by a factor of 3!
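The utilization arithmetic for both stop-and-wait and the 3-packet pipeline can be checked with a short calculation. The function name `sender_utilization` and its `in_flight` parameter are assumptions made for this sketch; the numbers (8000-bit packet, 1 Gbps link, 30 ms RTT) are the ones from the slides.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, in_flight=1):
    """Fraction of time the sender is busy transmitting.

    in_flight is the number of packets sent back-to-back before the sender
    has to wait for the first ACK (1 = stop-and-wait).
    """
    transmit = packet_bits / rate_bps            # L / R, in seconds
    return min(1.0, in_flight * transmit / (rtt_s + transmit))


u1 = sender_utilization(8000, 1e9, 0.030)               # stop-and-wait
u3 = sender_utilization(8000, 1e9, 0.030, in_flight=3)  # 3 packets pipelined
print(f"{u1:.5f} {u3:.5f}")
```

The cap at 1.0 reflects that a link cannot be more than fully busy; with these numbers the sender is nowhere near that cap.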

DON’T FALL ASLEEP !!!!!
Go-Back-N (sliding window protocol)
3-
240
Sender:
(For now, treat seq # as unlimited)
 k-bit seq # in pkt header
 “window” of up to N, consecutive unack’ed pkts allowed
 ACK(n): ACKs all pkts up to, including seq # n - “cumulative
ACK”
 Sender may receive duplicate ACKs (see receiver)
 timer for each in-flight pkt
 timeout(n): retransmit pkt n and all higher seq # pkts in window
Q: What happens when the receiver is totally disconnected? A: The sender gives up after a maximum number of retries (MAX RETRY).

GBN: sender extended FSM
3-
241
initially: base=1; nextseqnum=1
state "Wait":
  rdt_send(data) ->
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)      (buffer data or block higher app.)
  timeout ->
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) ->
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer             (no pkt in pipe)
    else
      start_timer            (reset timer)
  rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> Λ (no action)

GBN: receiver extended FSM
242
initially: expectedseqnum=1; sndpkt = make_pkt(expectedseqnum,ACK,chksum)
state "Wait":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum) ->
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++
  default -> udt_send(sndpkt)
ACK-only: always send ACK for correctly-received
pkt with highest in-order seq #
 may generate duplicate ACKs
 need only remember expectedseqnum
 out-of-order pkt:
 discard (don’t buffer) -> no receiver buffering!
 Re-ACK pkt with highest in-order seq #
L
If in order pkt is
received, deliver
to app and ack!
Else, just drop it!
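The sender and receiver rules above can be put together in a compact sketch. It is a simplified illustration under stated assumptions: the class names and `on_ack` / `on_timeout` methods are made up for the example, the timer itself is not modeled (the caller decides when a timeout happens), checksums are omitted, and `max()` in `on_ack` is a small guard against stale ACKs that the slide's FSM does not show.

```python
class GBNReceiver:
    def __init__(self):
        self.expected = 1        # expectedseqnum
        self.delivered = []

    def rdt_rcv(self, seq, data):
        """Deliver only the in-order packet; always return a cumulative ACK."""
        if seq == self.expected:
            self.delivered.append(data)
            self.expected += 1
        return self.expected - 1   # highest in-order seq # received so far


class GBNSender:
    def __init__(self, window):
        self.N = window
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}           # copies kept for retransmission

    def rdt_send(self, data):
        """Return the (seq, data) packet to transmit, or None if refused."""
        if self.nextseqnum >= self.base + self.N:
            return None            # window full: refuse_data
        seq = self.nextseqnum
        self.sndpkt[seq] = data
        self.nextseqnum += 1
        return (seq, data)

    def on_ack(self, acknum):
        self.base = max(self.base, acknum + 1)   # cumulative ACK

    def on_timeout(self):
        """Go back N: resend every packet from base up to nextseqnum-1."""
        return [(s, self.sndpkt[s]) for s in range(self.base, self.nextseqnum)]


sender, receiver = GBNSender(window=4), GBNReceiver()
pkts = [sender.rdt_send(d) for d in "abcde"]     # 'e' is refused
acks = [receiver.rdt_rcv(*p) for p in pkts[:4] if p[0] != 2]  # pkt 2 is lost
for a in acks:
    sender.on_ack(a)
resent = sender.on_timeout()
for p in resent:
    sender.on_ack(receiver.rdt_rcv(*p))
print(acks, resent, receiver.delivered, sender.base)
```

Packets 3 and 4 arrive intact the first time but are discarded because packet 2 is missing, which is why all three are sent again after the timeout.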

GBN
in action
3-
243
Window size=N=4
What determines the
size of the window?
1. RTT
2. Buffer at the
receiver(flow
control)
3. Network congestion
Q: GBN can perform poorly. Why?
Sender sends pkt 1,2,3,4,5,6,7,8,9..
If pkt 1 is lost, the receiver still gets pkt 2,3,4,5,… but discards them all!

Selective Repeat (improvement of the
GBN Protocol)
244
 receiver individually acknowledges all
correctly received pkts
 buffers pkts, as needed, for eventual in-order
delivery to upper layer
 E.g., sender: pkt 1,2,3,4,….,10; receiver got
2,4,6,8,10. Sender resends 1,3,5,7,9.
 sender only resends pkts for which ACK not
received
 sender timer for EACH unACKed pkt
 sender window
 N consecutive seq #’s
 again limits seq #s of sent, unACKed pkts

Selective repeat: sender, receiver windows
245
Q: why do we have this?
The ACK was lost, or the ACK
is still on its way

Selective repeat
3-
246
sender
data from above :
 if next available seq # in
window, send pkt
timeout(n) for pkt n:
 resend pkt n, restart timer
ACK(n) in
[sendbase,sendbase+N-1]:
 mark pkt n as received
 if n smallest unACKed pkt,
advance window base to
next unACKed seq #
receiver
pkt n in [rcvbase, rcvbase+N-1]
 send ACK(n)
 out-of-order: buffer
 in-order: deliver (also
deliver buffered, in-order
pkts), advance window to
next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
 ACK(n)
otherwise:
 ignore
(slide the window)
Q: why do we need this?
The ACK got lost.
The sender may time out
and resend the pkt,
so we need to ACK it again
Selective repeat in action (N=4)
3-
247
Under GBN, this
pkt will be
dropped.

3-
248
Selective repeat:
dilemma
In real life, we use k-bits to
implement seq #. Practical issue:
Example:
 seq #’s: 0, 1, 2, 3
 window size (N)=3
 receiver sees no
difference in two
scenarios!
 incorrectly passes
duplicate data as new in
(a)
Q: what is the relationship
between seq # size and
window size?
A: N <= 2^k / 2 (the window may cover at most half of the k-bit sequence-number space)
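The dilemma can be checked mechanically for the worst case, in which the receiver has accepted a full window but every ACK was lost, so the sender retransmits its old window. The helper name `receiver_confuses_old_with_new` is hypothetical, and the sketch assumes the window is no larger than the sequence-number space.

```python
def receiver_confuses_old_with_new(k, window):
    """True if a retransmitted old packet can land in the receiver's new window."""
    space = 2 ** k
    # Sender's first window: seq #s 0 .. window-1, all possibly retransmitted.
    old = {i % space for i in range(window)}
    # Receiver got them all, so its window has advanced by `window` positions.
    new = {i % space for i in range(window, 2 * window)}
    return bool(old & new)


print(receiver_confuses_old_with_new(k=2, window=3))  # slide example: 0,1,2,3 with N=3
print(receiver_confuses_old_with_new(k=2, window=2))  # N = 2^k / 2
```

With four sequence numbers and a window of 3, the receiver's new window is {3, 0, 1}, so a retransmitted packet 0 looks new; with a window of 2 the old and new windows never overlap.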

Why bother studying reliable data transfer?
3-
249
 We know it is provided by TCP, so why bother to
study?
 Sometimes, we may need to implement “some
form” of reliable transfer without the heavy duty
TCP.
 A good example is multimedia streaming. Even
though the application is loss tolerant, if too
many packets get lost, visual quality suffers.
So we may want to implement some form of reliable
transfer.
 At the very least, appreciate the “good services”
provided by some Internet gurus.

TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581
(The 800 lbs gorilla in the transport stack! PAY ATTENTION!!)
3-
250
 full duplex data:
 bi-directional data flow in
same connection
 MSS: maximum segment
size
 connection-oriented:
 handshaking (exchange of
control msgs) init’s
sender, receiver state
(e.g., buffer size) before
data exchange
 flow controlled:
 sender will not overwhelm
receiver
 point-to-point:
 one sender, one receiver
(not multicast)
 reliable, in-order byte
stream:
 no “message boundaries”
 In App layer, we need
delimiters.
 pipelined:
 TCP congestion and flow
control set window size
 send & receive buffers
socket
door
TCP
send buffer
TCP
receive buffer
socket
door
segment
application
writes data
application
reads data

TCP segment structure
3-
251
Header fields (each row is 32 bits wide):
sequence number: counting by “bytes” of data (not segments!)
acknowledgement number
head len | not used | flags U A P R S F | Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP) | Urg data pnter
Options (variable length): due to this field we have a variable length header
application data (variable length)
Flags:
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)

Negotiate
during 3-way
handshake
TCP seq. #’s and ACKs
3-
252
Seq. #’s:
 byte stream
“number” of first
byte in segment’s
data
ACKs:
 seq # of next byte
expected from other
side
 cumulative ACK
Q: how receiver handles
out-of-order segments
 A: TCP spec doesn’t
say, - up to
implementor
Host A Host B
User
types
‘C’
host ACKs
receipt
of echoed
‘C’
host ACKs
receipt of
‘C’, echoes
back ‘C’
time
simple telnet scenario

TCP Round Trip Time and Timeout
3-
253
Q: how to set TCP
timeout value?
 longer than RTT
 but RTT varies
 too short: premature
timeout
 unnecessary
retransmissions
 too long: slow reaction
to segment loss, poor
performance.
Q: how to estimate RTT?
 SampleRTT: measured time from
segment transmission until ACK
receipt
 ignore retransmissions
 SampleRTT will vary, want
estimated RTT “smoother”
 average several recent
measurements, not just current
SampleRTT
tx
retx
ack
Estimated
RTT
tx
retx
ack
Estimated
RTT
Too long
Too short

3-
254
EstimatedRTT = (1- )*EstimatedRTT + *SampleRTT
 Exponential weighted moving average
 influence of past sample decreases exponentially
fast
 typical value:  = 0.125
ERTT(0) = 0
ERTT(1) = (1- )ERTT(0) + SRTT(1)= SRTT(1)
ERTT(2) =(1- ) SRTT(1) + SRTT(2)
ERTT(3) = (1- )(1- ) SRTT(1) + (1- ) SRTT(2) + SRTT(3)

Example RTT estimation:
3-
255
[Figure: SampleRTT and EstimatedRTT for gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350.]

3-
256
Setting the timeout (by Jacobson/Karels)
 EstimatedRTT plus “safety margin”
 large variation in EstimatedRTT -> larger safety margin
 first estimate of how much SampleRTT deviates from
EstimatedRTT:
DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|
(typically, β = 0.25)
Then set timeout interval:
TimeoutInterval = EstimatedRTT + 4*DevRTT
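The two moving averages and the timeout formula fit in a few lines. This is a sketch under assumptions, not a real TCP stack: the class name `RttEstimator` is hypothetical, the starting values are passed in by the caller, and DevRTT is updated before EstimatedRTT (using the old estimate), which is one reasonable ordering that the slides do not specify.

```python
class RttEstimator:
    def __init__(self, estimated_rtt, dev_rtt=0.0, alpha=0.125, beta=0.25):
        self.estimated_rtt = estimated_rtt   # assumed starting estimate (ms)
        self.dev_rtt = dev_rtt               # assumed starting deviation (ms)
        self.alpha = alpha
        self.beta = beta

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Deviation first, measured against the estimate before it changes.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(estimated_rtt=100.0)
timeout = est.update(120.0)
print(est.estimated_rtt, est.dev_rtt, timeout)
```

Starting from an estimate of 100 ms, one 120 ms sample moves the estimate only to 102.5 ms, while the 5 ms deviation adds a 20 ms safety margin to the timeout.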

TCP reliable data transfer
3-
257
 TCP creates rdt
service on top of IP’s
unreliable service
 Pipelined segments
(for performance)
 Cumulative acks
 TCP uses single
retransmission timer
 Retransmissions are
triggered by:
 timeout events
 duplicate ack ( for
performance reason)
 Initially consider
simplified TCP
sender:
 ignore duplicate acks
 ignore flow control,
congestion control

TCP sender events:
3-
258
data rcvd from app:
 Create segment with
seq #
 seq # is byte-stream
number of first data
byte in segment
 start timer if not
already running (think
of timer as for oldest
unacked segment)
 expiration interval:
TimeOutInterval
timeout:
 retransmit segment
that caused timeout
 restart timer
Ack rcvd:
 If acknowledges
previously unacked
segments
 update what is known
to be acked
 start timer if there are
outstanding segments

3-
259
TCP
sender
(simplified)
NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum
loop (forever) {
  switch(event)
    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)
    event: timer timeout
      retransmit not-yet-acknowledged segment with
        smallest sequence number
      start timer
    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
} /* end of loop forever */
Comment:
• SendBase-1: last
cumulatively
ack’ed byte
Example:
• SendBase-1 = 71;
y= 73, so the rcvr
wants 73+ ;
y > SendBase, so
that new data is
acked

TCP: retransmission scenarios
3-
260
Host A
time
Host B
premature timeout
Seq=92 timeout
Host A
X
loss
timeout
Host B
lost ACK scenario
time
Seq=92 timeout
SendBase
= 100
Sendbase
= 100
SendBase
= 120
SendBase
= 120

TCP retransmission scenarios (more)
3-
261
Host A
X
loss
timeout
Host B
Cumulative ACK scenario
time
SendBase
= 120
Room for improvement

TCP ACK generation [RFC 1122, RFC 2581]
3-
262
Event at Receiver -> TCP Receiver action
1. Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
   -> Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
2. Arrival of in-order segment with expected seq #. One other segment has ACK pending
   -> Immediately send single cumulative ACK, ACKing both in-order segments
3. Arrival of out-of-order segment with higher-than-expected seq. #. Gap detected
   -> Immediately send duplicate ACK, indicating seq. # of next expected byte
      (annotation: ACK the “largest in-order byte” seq #)
4. Arrival of segment that partially or completely fills gap
   -> Immediately send ACK, provided that segment starts at lower end of gap

Fast Retransmit
3-
263
 Time-out period
often relatively long:
 long delay before
resending lost packet
 Detect lost segments
via duplicate ACKs.
 Sender often sends
many segments back-to-
back
 If segment is lost,
there will likely be
many duplicate ACKs.
 If sender receives 3
ACKs for the same
data, it supposes that
segment after ACKed
data was lost:
 fast retransmit: resend
segment before timer
expires
timeout

Fast retransmit algorithm:
3-
264
event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {    /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3) {
      resend segment with sequence number y    /* fast retransmit */
    }
  }
Q: why resend pkt
with seq # y?
A: That is what the
receiver expects!
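The ACK-handling branch above can be written as a small state tracker. This is a sketch only: `AckTracker` is a hypothetical name, the timer is left out, and resetting the duplicate counter when new data is ACKed is an assumption the pseudocode implies but does not spell out.

```python
class AckTracker:
    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0

    def on_ack(self, y):
        """Return the seq # to fast-retransmit, or None if nothing to resend."""
        if y > self.send_base:
            self.send_base = y     # new data acknowledged
            self.dup_acks = 0      # assumed: start counting afresh
            return None
        self.dup_acks += 1         # duplicate ACK for already ACKed data
        if self.dup_acks == 3:
            return y               # resend the segment the receiver expects
        return None


tracker = AckTracker(send_base=92)
actions = [tracker.on_ack(y) for y in (100, 100, 100, 100)]
print(actions)
```

The first ACK of 100 advances SendBase; the next three are duplicates, and the third one triggers a resend of the segment starting at byte 100 without waiting for the timer.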

TCP Flow Control
3-
265
 receive side of TCP
connection has a
receive buffer:
flow control
 speed-matching
service: matching
the send rate to the
receiving app’s drain
rate
 app process may be
slow at reading from
buffer
sender won’t overflow
receiver’s buffer by
transmitting too
much,
too fast

TCP Flow control: how it works
3-
266
(Suppose TCP receiver
discards out-of-order
segments)
 spare room in buffer
= RcvWindow
= RcvBuffer-[LastByteRcvd -
LastByteRead]
 Rcvr advertises spare
room by including
value of RcvWindow in
segments
 Sender limits
unACKed data to
RcvWindow
 guarantees receive
buffer doesn’t overflow
This goes to show that the design
process of header is important!!
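The window bookkeeping is simple arithmetic on byte counters. In this sketch, `rcv_window` follows the formula on the slide, while `can_send` and its `last_byte_sent` / `last_byte_acked` parameters are assumed names for the sender-side counters, which the slide does not name.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def can_send(last_byte_sent, last_byte_acked, nbytes, window):
    """True if sending nbytes more keeps unACKed data within the window."""
    return (last_byte_sent - last_byte_acked) + nbytes <= window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)
print(can_send(5000, 4000, 1000, window))   # 2000 bytes unACKed after sending
print(can_send(5000, 4000, 1200, window))   # 2200 bytes would exceed the window
```

Here the application has read only 1000 of the 3000 bytes received, so 2096 bytes of the 4096-byte buffer remain free and the sender must keep its unacknowledged data at or below that.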

TCP Connection Management
3-
267
Recall: TCP sender,
receiver establish
“connection” before
exchanging data segments
 initialize TCP variables:
 seq. #s
 buffers, flow control
info (e.g. RcvWindow)
 client: connection initiator
Socket clientSocket = new Socket("hostname","port number");
 server: contacted by client
Socket connectionSocket = welcomeSocket.accept();
Three way handshake:
Step 1: client host sends TCP SYN
segment to server
 specifies initial seq #
 no data
Step 2: server host receives SYN,
replies with SYN-ACK segment
 server allocates buffers
 specifies server initial seq. #
Step 3: client receives SYN-ACK,
replies with ACK segment,
which may contain data
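The sequence and acknowledgement numbers exchanged in the three steps can be traced with a tiny model. This is a sketch, not real socket code: the function name and dictionary layout are made up for the example, and the initial sequence numbers are arbitrary values chosen by the caller. It relies on the fact that a SYN consumes one sequence number, so each side ACKs the other's initial seq # plus one.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three segments of a TCP connection setup, in order."""
    syn = {"flags": "SYN", "seq": client_isn}
    syn_ack = {"flags": "SYN+ACK", "seq": server_isn, "ack": client_isn + 1}
    ack = {"flags": "ACK", "seq": client_isn + 1, "ack": server_isn + 1}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

After the third segment both sides know the other's initial sequence number and have confirmed their own, which is the state the connection needs before data flows.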

3-268
TCP three-way handshake
Connection request
Connection
granted
ACK

TCP Connection Management (cont.)
3-
269
Closing a connection:
client closes socket:
clientSocket.close();
Step 1: client end system
sends TCP FIN control
segment to server
Step 2: server receives FIN,
replies with ACK. Closes
connection, sends FIN.
client server
close
close
timed wait
closed
Q: why don’t we
combine ACK and
FIN?
Sender may have
some data in the
pipeline!

TCP Connection Management (cont.)
3-
270
Step 3: client receives FIN,
replies with ACK.
 Enters “timed wait” - will
respond with ACK to
received FINs
Step 4: server, receives ACK.
Connection closed.
Note: with small modification,
can handle simultaneous FINs.
client server
closing
closing
timed wait
closed
closed

TCP Connection Management (cont)
3-
271
TCP client
lifecycle
TCP server
lifecycle
