Chapter 4

Computer Networks

Network Layer

Network Layer (2-89-90) 4-1

Chapter 4 Outline

4.1 Introduction and Network Service Models
4.2 Routing Principles
4.3 Hierarchical Routing
4.4 Routing in the Internet
4.5 The Internet (IP) Protocol
4.6 What’s Inside a Router
4.7 IPv6
4.8 Multicast Routing
4.9 Mobility


Network Layer Functions
 transport packet from sending to receiving hosts
 network layer protocols in every host, router

three important functions:
 path determination: route taken by packets from source to dest. (Routing Algorithms)
 forwarding: move packets from router’s input to appropriate router output
 call setup: some network architectures require router call setup along path before data flows

[Figure: end hosts with full protocol stacks (application, transport, network, data link, physical); routers along the path with network, data link, and physical layers.]

Network Service Model

Q: What service model for “channel” transporting packets from sender to receiver?

Services
 guaranteed bandwidth?
 preservation of inter-packet timing (no jitter)?
 loss-free delivery?
 in-order delivery?
 congestion feedback to sender?

The most important abstraction provided by network layer: virtual circuit or datagram?


Virtual circuits

“source-to-destination path behaves much like
telephone circuit”
 performance-wise
 network actions along source-to-destination path

 call setup, teardown for each call before data can flow
 each packet carries VC identifier (not destination host ID)
 every router on source-destination path maintains “state”
for each passing connection
 transport-layer connection involves only the two end systems
 link and router resources (bandwidth, buffers) may be allocated to VC (dedicated resources = predictable service) to get circuit-like performance.


VC implementation

 A VC consists of:
1. path from source to destination
2. VC numbers, one number for each link along path
3. entries in forwarding tables in routers along path

 Packet belonging to VC carries VC number (rather
than destination address)

 VC number can be changed on each link.
 New VC number comes from forwarding table


Forwarding table

[Figure: a VC path through a router whose interfaces are numbered 1, 2, 3; successive links along the path carry VC numbers 12, 22, 32.]

Routers maintain connection state information!

Forwarding table in router a
Incoming interface Incoming VC # Outgoing interface Outgoing VC #
1 12 3 22
2 63 1 18
3 7 2 17
1 97 3 87
… … … …
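
The table lookup above can be sketched in a few lines of Python. This is a minimal illustration, not how a real router stores its state: the dictionary layout and the name `forward_vc` are assumptions made for the example, while the four entries are the ones shown for router a.

```python
# Forwarding table of router a, copied from the slide above.
# Key:   (incoming interface, incoming VC number)
# Value: (outgoing interface, outgoing VC number)
VC_TABLE = {
    (1, 12): (3, 22),
    (2, 63): (1, 18),
    (3, 7): (2, 17),
    (1, 97): (3, 87),
}


def forward_vc(in_iface, in_vc, table=VC_TABLE):
    """Return (out_iface, out_vc), or None if no VC has been set up."""
    return table.get((in_iface, in_vc))


# A packet arriving on interface 1 with VC number 12 leaves on
# interface 3, and its VC number is rewritten to 22.
print(forward_vc(1, 12))
```

The lookup key is the pair (interface, VC number), never the destination address, which is why every router on the path must hold per-connection state.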

Virtual Circuits: Signaling Protocols

 used to set up, maintain, and tear down VCs
 used in ATM, frame-relay, X.25
 not used in today’s Internet

Call setup between two end systems (each with application, transport, network, data link, physical layers):
1. Initiate call
2. Incoming call
3. Accept call
4. Call connected
5. Data flow begins
6. Receive data


Datagram networks: the Internet model

 no call setup at network layer
 routers: no state about end-to-end connections
 no network-level concept of “connection”
 packets forwarded using destination host address
 packets between same source-destination pair may take different paths

[Figure: two end systems with full protocol stacks: 1. Send Data, 2. Receive Data.]


Network Layer Service Models:

Network       Service      Guarantees?                                Congestion
Architecture  Model        Bandwidth           Loss  Order  Timing    feedback

Internet      best effort  none                no    no     no        no (inferred via loss)
ATM           CBR          constant rate       yes   yes    yes       no congestion
ATM           VBR          guaranteed rate     yes   yes    yes       no congestion
ATM           ABR          guaranteed minimum  no    yes    no        yes
ATM           UBR          none                no    yes    no        no

CBR: Constant bit rate
VBR: Variable bit rate
ABR: Available bit rate
UBR: Unspecified bit rate

 Internet model being extended: Integrated services, Differentiated Services
 Chapter 6


Datagram or VC Network: why?

Internet (Datagram)
 data exchange among computers
 “elastic” service, no strict timing req.
 “smart” end systems (computers)
 can adapt, perform control, error recovery
 simple inside “network”, complexity at “edge”
 many link types
 different characteristics
 uniform service is difficult

ATM (Virtual Circuit)
 evolved from telephony
 human conversation:
 strict timing, reliability requirements
 need for guaranteed service
 “dumb” end systems
 telephones
 complexity inside network


Buffering in IP routers

[Figure: two routers connected to the Internet through their network interfaces.]

 Buffer size
 Space for bursts of packets
 Latency
 Dropping packets
 When?
 What?

FIFO Queueing in the Router (Drop Tail)

[Figure: a single queue between two network interfaces.]

 Single queue maintained
 Dequeue from head
 Enqueue at tail
 When full drop arriving packet (drop-tail)
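
The four rules above fit in a short Python class. This is only a sketch under stated assumptions: the class name `DropTailQueue`, the capacity counted in packets, and the `dropped` counter are all choices made for the example.

```python
from collections import deque


class DropTailQueue:
    """Single FIFO queue that drops arriving packets when full."""

    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of buffered packets
        self.buffer = deque()
        self.dropped = 0

    def enqueue(self, packet):
        """Enqueue at the tail; return False if the packet was dropped."""
        if len(self.buffer) >= self.capacity:
            self.dropped += 1  # buffer full: drop the arriving packet
            return False
        self.buffer.append(packet)
        return True

    def dequeue(self):
        """Dequeue from the head; return None if the queue is empty."""
        return self.buffer.popleft() if self.buffer else None


q = DropTailQueue(capacity=3)
results = [q.enqueue(p) for p in ["p1", "p2", "p3", "p4"]]
print(results, q.dropped)
```

Only the fourth packet is refused: the sender gets no signal at all until the buffer is completely full.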


Slow Feedback from Drop Tail

 Feedback comes when buffer is completely full
 … even though the buffer has been filling for a while

 Plus, the filling buffer is increasing RTT
 … and the variance in the RTT

 Might be better to give early feedback
 Get one or two flows to slow down, not all of them
 Get these flows to slow down before it is too late


Queue Management

 Performance Degradation in current TCP
Congestion Control
 Multiple packet loss
 Low link utilization
 Congestion collapse

 The role of the router (i.e., network)
 Control congestion effectively with a network
 Allocate bandwidth fairly


Active Queue Management

 Goals:

 Better congestion notification for responsive flows
(i.e. TCP)

 Maintain shorter queues

 Fairness in drops (proportional)


Random Early Detection (RED)

 Invented by Sally Floyd and Van Jacobson in the
early 1990s, differs from the DECbit in two major
ways

 Notification is implicit
 just drop the packet (TCP will timeout)
 could make explicit by marking the packet
 Early random drop
 rather than wait for queue to become full, drop each
arriving packet with some drop probability whenever the
queue length exceeds some drop level


Random Early Detection (RED).

 Basic idea of RED
 Router notices that the queue is building up.
 Randomly drops or marks arriving packets (before
queue gets full).
 Packet drop signals a congestion to the source.

 Packet drop probability
 Drop probability increases as queue length increases
 If buffer is below some level, don’t drop anything
 … otherwise, set drop probability as a function of queue length


RED Details
 Compute average queue length (Geometric Moving Average)

$$\mathrm{AvgLen}_{n+1} = (1-\alpha)\,\mathrm{AvgLen}_n + \alpha\,\mathrm{SampleLen}_n = \sum_{i=1}^{n} \alpha\,(1-\alpha)^{n-i}\,\mathrm{SampleLen}_i$$

0 < α < 1 (usually 0.002)
SampleLen is queue length each time a packet arrives.
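
A small Python sketch of the averaging step shows why a short burst barely moves AvgLen. The function name `update_avg_len` and the sample values are assumptions for the example; the update rule and the default weight 0.002 come from the slide.

```python
def update_avg_len(avg_len, sample_len, alpha=0.002):
    """One step of the geometric moving average of the queue length."""
    return (1 - alpha) * avg_len + alpha * sample_len


# Hypothetical burst: ten packets in a row each see 100 packets queued.
avg = 0.0
for sample in [100] * 10:
    avg = update_avg_len(avg, sample)

# The average has only moved by about 2% of the way toward 100.
print(round(avg, 3))
```

With such a small weight the average follows long-lived congestion rather than momentary bursts.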

[Figure: SampleLen and AvgLen over time, with the levels MinThreshold (MinTh) and MaxThreshold (MaxTh) marked on the queue.]

RED Details.

 On arrival of a packet:

calculate AvgLen
if AvgLen <= MinTh then
    enqueue arriving packet
else if MinTh < AvgLen < MaxTh then
    calculate probability P
    drop arriving packet with probability P, otherwise enqueue it
else if AvgLen >= MaxTh then
    drop arriving packet


RED Details..

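 Computing probability P
if minTh < AvgLen < maxTh:

$$p_{\mathrm{AvgLen}} = \frac{\mathrm{maxP} \times (\mathrm{AvgLen} - \mathrm{minTh})}{\mathrm{maxTh} - \mathrm{minTh}}$$

$$P = \frac{p_{\mathrm{AvgLen}}}{1 - \mathrm{count} \times p_{\mathrm{AvgLen}}}$$

[Figure: p_AvgLen plotted against AvgLen: zero up to minTh, rising linearly to maxP at maxTh, then jumping to 1.]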

count counts how long we have been in minTh < AvgLen < maxTh since we
last dropped a packet. As count grows, P grows, so drops are spaced out
in time, reducing the likelihood of re-entering slow-start.
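
The two formulas can be combined into a short Python sketch. The function names, the thresholds used in the example calls (minTh = 5, maxTh = 15), and the guard for a non-positive denominator are assumptions added for illustration; only the formulas themselves come from the slide.

```python
import random


def red_drop_probability(avg_len, min_th, max_th, max_p, count):
    """Drop probability P for an arriving packet under RED."""
    if avg_len <= min_th:
        return 0.0  # below the lower threshold: never drop
    if avg_len >= max_th:
        return 1.0  # above the upper threshold: always drop
    p_avg = max_p * (avg_len - min_th) / (max_th - min_th)
    denom = 1 - count * p_avg
    if denom <= 0:
        return 1.0  # assumption: a drop is overdue, so force it
    return min(1.0, p_avg / denom)


def red_should_drop(avg_len, min_th, max_th, max_p, count, rng=random):
    """Randomly decide whether to drop, using the probability above."""
    return rng.random() < red_drop_probability(
        avg_len, min_th, max_th, max_p, count)


# Halfway between the thresholds with maxP = 0.02:
print(red_drop_probability(10, 5, 15, 0.02, count=0))
# After 50 packets without a drop, the probability has doubled:
print(red_drop_probability(10, 5, 15, 0.02, count=50))
```

The growing denominator term is what spaces the drops out: the longer the router goes without dropping, the more likely the next arrival is dropped.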


RED Detail…

 Weighted Running Average Queue Length

[Figure: average queue length over time, next to the drop probability. Below Min Threshold: no drops. Between Min Threshold and Max Threshold: probabilistic drops. Between Max Threshold and Max Queue Size: forced drop.]


Properties of RED

 Drops packets before queue is full
 In the hope of reducing the rates of some flows

 Drops packet in proportion to each flow’s rate
 High-rate flows have more packets
 … and, hence, a higher chance of being selected

 Drops are spaced out in time
 Which should help desynchronize the TCP senders

 Tolerant of burstiness in the traffic
 By basing the decisions on average queue length

Tuning RED

 MaxP is typically set to 0.02, meaning that when the average
queue size is halfway between the two thresholds, the gateway
drops roughly one out of 100 packets.

 If traffic is bursty, then MinThreshold should be sufficiently
large to allow link utilization to be maintained at an acceptably
high level.

 Difference between two thresholds should be larger than the
typical increase in the calculated average queue length in one
RTT;
setting MaxThreshold to twice MinThreshold is reasonable for
traffic on today’s Internet.


Problems With RED

 Hard to get the tunable parameters just right
 How early to start dropping packets?
 What slope for the increase in drop probability?
 What time scale for averaging the queue length?
 Sometimes RED helps but sometimes not
 If the parameters aren’t set right, RED doesn’t help
 And it is hard to know how to set the parameters
 RED is implemented in practice
 But, often not used due to the challenges of tuning right
 Many variations
 With cute names like “Blue” and “FRED”…


Explicit Congestion Notification

 Early dropping of packets
 Good: gives early feedback
 Bad: has to drop the packet to give the feedback

 Explicit Congestion Notification
 Router marks the packet with an ECN bit
 … and sending host interprets as a sign of congestion
 Surmounting the challenges
 Must be supported by the end hosts and the routers
 Requires two bits in the IP header (one for the ECN
mark, and one to indicate the ECN capability)
 Solution: borrow two of the Type-Of-Service bits in the
IPv4 packet header
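
As a rough illustration, the two borrowed bits can be read and set as below. The sketch assumes the codepoint layout standardized in RFC 3168 (the two low-order bits of the former Type-Of-Service byte, with 00 = not ECN-capable and 11 = congestion experienced), which is more specific than what the slide states. The function names are made up for the example.

```python
# ECN codepoints in the two low-order bits of the former TOS byte
# (layout assumed from RFC 3168, not from the slide).
ECN_CODEPOINTS = {
    0b00: "Not-ECT",  # sender is not ECN-capable
    0b01: "ECT(1)",   # ECN-capable transport
    0b10: "ECT(0)",   # ECN-capable transport
    0b11: "CE",       # congestion experienced (set by a router)
}


def ecn_of(tos_byte):
    """Name of the ECN codepoint carried in a TOS byte."""
    return ECN_CODEPOINTS[tos_byte & 0b11]


def mark_congestion(tos_byte):
    """Router side: set CE if the sender declared ECN capability.

    A packet that is not ECN-capable is returned unchanged; the router
    would have to fall back to dropping it.
    """
    if tos_byte & 0b11 == 0b00:
        return tos_byte
    return tos_byte | 0b11


print(ecn_of(mark_congestion(0b10)))
```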

Chapter 4 Outline

 Distance vector routing
 Link state routing
4.7 IPv6
4.9 Mobility


The Problem
“A” “B”

R

How does R choose a
next-hop on the path
towards host B?

Network Layer (2-89-90) CS244a Handout 5 4-38

Interplay between routing, forwarding

routing algorithm: fills in the local forwarding table

local forwarding table
dest. net. addr.    Output port
65/8                3
128.9/16            2
128.9.16/20         2
128.9.19/24         1

dest. IP addr. in arriving packet’s header: 128.9.16.14

[Figure: router with output ports 1, 2, 3.]
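
The lookup for 128.9.16.14 can be sketched with Python's standard `ipaddress` module. The sketch assumes longest-prefix matching, the usual rule when several table entries cover the same address; the slide itself does not name the rule. The prefixes are the slide's entries written out in full, and `lookup` is a name chosen for the example.

```python
import ipaddress

# (destination prefix, output port), from the forwarding table above.
FORWARDING_TABLE = [
    (ipaddress.ip_network("65.0.0.0/8"), 3),
    (ipaddress.ip_network("128.9.0.0/16"), 2),
    (ipaddress.ip_network("128.9.16.0/20"), 2),
    (ipaddress.ip_network("128.9.19.0/24"), 1),
]


def lookup(dest, table=FORWARDING_TABLE):
    """Return the output port of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(net.prefixlen, port) for net, port in table if addr in net]
    if not matches:
        return None
    return max(matches)[1]  # the longest prefix wins


print(lookup("128.9.16.14"))
```

The address 128.9.16.14 matches both 128.9/16 and 128.9.16/20 but not 128.9.19/24, so the /20 entry decides the output port.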


Graph abstraction

[Figure: graph with nodes u, v, w, x, y, z joined by weighted links.]

Graph: G = (N,E)

N = set of routers = { u, v, w, x, y, z }

E = set of links ={ (u,v), (u,x), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }

Remark: Graph abstraction is useful in other network contexts

Example: P2P, where N is set of peers and E is set of TCP connections


Graph abstraction: costs

 c(x,x’) = cost of link (x,x’)
 e.g., c(w,z) = 5
 cost could always be 1, or inversely related to bandwidth, or inversely related to congestion

[Figure: the graph of nodes u, v, w, x, y, z with its link costs.]

Cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp)

Question: What’s the least-cost path between u and z ?

Routing algorithm: algorithm that finds least-cost path


Routing
Routing protocol
Goal: determine “good” path (sequence of routers) thru network from source to dest.

Graph abstraction for routing algorithms:
 graph nodes are routers
 graph edges are physical links
 link cost:
 Delay (Make high speed links attractive, but closeness counts),
 $ cost,
 Inverse of bandwidth,
 Path utilization (congestion level & queue length),
 Stability (Is path up or down?)

 “good” path:
 typically means minimum cost path
 other definitions possible

[Figure: abstract model of a network with nodes A, B, C, D, E, F and weighted links.]

Technique 1: Naïve Approach
Flood!: Routers forward packets to all ports except
the input port.

R

Advantages:
 Simple.
 Every destination in the network is reachable.
Disadvantages:
 Some routers receive a packet multiple times.
 Packets can go round in loops forever.
 Inefficient.


Spanning Trees

Objective: Find the lowest cost route from each of
(R1, …, R7) to R8.

[Figure: hosts “A” and “B” connected through routers R1 to R8 over links with costs between 1 and 4.]


A Spanning Tree

[Figure: the same network of routers R1 to R8, showing only the links of the spanning tree rooted at R8.]

 The solution is a spanning tree with R8 as the root of the tree.
 Tree: There are no loops.
 Spanning: All nodes included.
 We’ll see two algorithms that build spanning trees automatically:
 The distributed Bellman-Ford algorithm ( Distance Vector ).

 Dijkstra’s shortest path first algorithm ( Link State ).

Routing protocol requirements
 Minimizing route table space: Node memory related issue
 Minimizing control message: Overhead in bandwidth
 Robustness
 Retain its correctness in dynamic situation. Should be free
of loops, black holes.
 Using optimal paths (optimality)
 Choosing the best path ( in terms of some metrics)
 Stability: Free of oscillations
 Fairness
 Should take the complete topology into account while computing the path
 Efficiency: Convergence time
 Correctness


Design Choices - 1
 Centralized versus Distributed routing
 Centralized: One node collects information (node has
complete topology and link cost) and then installs the
routing information in all nodes (Link state algorithm).

 Distributed: All nodes co-operate to form the routing table.

 Source based versus hop by hop
 Source routing: data packet contains the hop list.

 Hop by hop: Each hop takes decision based on its routing
table about the next hop (Distance vector algorithm)


Design Choices - 2
 Stochastic versus deterministic
 Stochastic: Routing table contains multiple path
information. Next hop is chosen randomly.
Advantage: load distribution.
 Deterministic: always follow same path.
 Single versus multiple path
 Router can use multiple paths for a single destination

 Dynamic versus Static
 Dynamic: Routing depends on the current network state; routes update more quickly
 periodic update
 in response to link cost changes
 Static: Routes update slowly over time.


Assumptions About Router

 Router knows address of each neighbor.
 Router can communicate the information with its
neighbors.
 Router tells its neighbors its best idea of distance
to every other router in the network.
 Router receives these distance vectors from its
neighbors.
 Router updates its notion of best path to each
destination, and the next hop for this destination.


Distance Table Inside Router
 Distance Table data structure
 row for each possible destination.
 column for each directly-attached neighbor router.
 example: in router x, for dest. y via neighbor z.
 This table is made based on exchanged information about
distance metric and calculation.
Distance table in X: D^x(), cost to dest. via neighbor

destination   via z   via z’   via z’’   routing table entry
y             1       14       5         z,1
y’            7       5        8         z’,5
y’’           6       9        4         z’’,4
y’’’          4       2        11        z’,2

Distance Vector = Routing table (for each destination, the neighbor with the smallest cost and that cost)

Routers Information Exchange

 Routers periodically exchange what they know:
 distance metric (costs)
 routing table (distance vector)

 Exchange timing:
 whenever a link fails
 Whenever a routing table entry changes.


Distance Vector Routing Algorithm

Iterative:
 continues until no nodes exchange info.
 self-terminating: no “signal” to stop

Asynchronous:
 nodes need not exchange info/iterate in lock step!

Distributed:
 each node communicates only with directly-attached neighbors

Distance Table data structure
 each node has its own:
 row for each possible destination
 column for each directly-attached neighbor to node
 example: in node X, for dest. Y via neighbor Z:

$$D^X(Y,Z) = c(X,Z) + \min_w \{ D^Z(Y,w) \}$$

(distance from X to Y, via Z as next hop)

Distance Table: example
[Figure: network with nodes A, B, C, D, E; source E has neighbors A, B, and D.]

Distance table of node E: D^E(i, j), cost to destination i via neighbor j

destination i   via A   via B   via D
A               1       14      5
B               7       8       5
C               6       9       4
D               4       11      2

Example:

$$D^E(A,B) = c(E,B) + \min_w \{ D^B(A,w) \} = 8 + 6 = 14$$


Distance table gives routing table

Distance table of node E: D^E(), cost to destination via

destination   via A   via B   via D
A             1       14      5
B             7       8       5
C             6       9       4
D             4       11      2

Routing table of node E: outgoing link to use, cost

destination   outgoing link, cost
A             A,1
B             D,5
C             D,4
D             D,2

Meaning of Distance Vector

 A router using distance vector routing protocols knows 2 things:
 Distance to final destination
 Vector, or direction, traffic should be directed.

Routing table of node E: D^E(), outgoing link to use, cost

destination   outgoing link, cost
A             A,1
B             D,5
C             D,4
D             D,2

[Figure: network with nodes A, B, C, D, E; E is the source.]


Distance Vector Routing: overview

Iterative, asynchronous: each local iteration caused by:
 local link cost change
 message from neighbor: its least cost path change from neighbor

Distributed:
 each node notifies neighbors only when its least cost path to any destination changes
 neighbors then notify their neighbors if necessary

Each node:
wait for (change in local link cost or message from neighbor)
recompute distance table
if least cost path to any destination has changed, notify neighbors
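
The loop each node runs can be imitated in a small simulation. This sketch makes several simplifying assumptions: all nodes update in synchronized rounds (the real algorithm is asynchronous), link costs never change, and the link costs of the five-node example network (A-B 7, A-E 1, B-C 1, B-E 8, C-D 2, D-E 2) are inferred from the distance table of node E rather than stated in the text. The names `neighbors` and `distance_vector` are illustrative.

```python
INF = float("inf")

# Assumed link costs of the five-node example network.
LINKS = {("A", "B"): 7, ("A", "E"): 1, ("B", "C"): 1,
         ("B", "E"): 8, ("C", "D"): 2, ("D", "E"): 2}


def neighbors(links):
    """Build a symmetric adjacency map: node -> {neighbor: link cost}."""
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, {})[v] = cost
        adj.setdefault(v, {})[u] = cost
    return adj


def distance_vector(links):
    """Iterate the Bellman-Ford update until no estimate changes."""
    adj = neighbors(links)
    nodes = sorted(adj)
    # Initially each node knows only itself and its direct neighbors.
    dist = {x: {y: 0 if x == y else adj[x].get(y, INF) for y in nodes}
            for x in nodes}
    changed = True
    while changed:
        changed = False
        # Vectors advertised to neighbors at the start of this round.
        advertised = {x: dict(dist[x]) for x in nodes}
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                best = min(adj[x][z] + advertised[z][y] for z in adj[x])
                if best < dist[x][y]:
                    dist[x][y] = best
                    changed = True
    return dist


dist = distance_vector(LINKS)
print(dist["E"])
```

The loop stops by itself once a full round passes with no change, which mirrors the self-terminating behaviour of the distributed algorithm.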

A Link-State Routing Algorithm

Dijkstra’s algorithm (global)
 net topology, link costs known to all nodes
 accomplished via “link state broadcast”
 all nodes have same information
 computes least cost paths from one node (“source”) to all other nodes
 gives routing table for that node
 iterative: after k iterations, know least cost path to k destinations.


Notation:
 N: set of nodes whose least cost path definitively known
 c(i,j): link cost from node i to j. cost infinite if not direct neighbors
 p(v): nodes along path from source to v
 D(v): current value of cost of path from source to destination v.

Example (Source=A):
N: A, B, C, D, E, F
C(A,C)=5; C(C,A)=5
C(B,D)=2; C(D,B)=3
…
p(F): A-D-E-F
D(F)=4

[Figure: network with nodes A, B, C, D, E, F and weighted links.]


Dijkstra’s Algorithm

n = number of nodes (except the source)

Initialization:
    N = {A}
    for all nodes v
        if v adjacent to A
            then D(v) = c(A,v)
            else D(v) = infinity

Loop
    find w not in N such that D(w) is a minimum
    add w to N
    update D(v) for all v adjacent to w and not in N:
        D(v) = min( D(v), D(w) + c(w,v) )
    /* new cost to v is either old cost to v or known
       shortest path cost to w plus cost from w to v */
until all nodes in N

Over all iterations, finding the minimum examines n(n+1)/2 nodes.

[Figure: source A, a node w at distance D(w), and a neighbor v of w at distance D(v) reached over a link of cost c(w,v).]


Dijkstra’s Algorithm: example

computes least cost paths from node A to all other nodes

Step start N D(B),p(B) D(C),p(C) D(D),p(D) D(E),p(E) D(F),p(F)
0 A 2,A-B 5,A-C 1,A-D infinity infinity
1 AD 2,A-B 4,A-D-C 1,A-D 2,A-D-E infinity
2 ADE 2,A-B 3,A-D-E-C 1,A-D 2,A-D-E 4,A-D-E-F
3 ADEB 2,A-B 3,A-D-E-C 1,A-D 2,A-D-E 4,A-D-E-F
4 ADEBC 2,A-B 3,A-D-E-C 1,A-D 2,A-D-E 4,A-D-E-F
5 ADEBCF 2,A-B 3,A-D-E-C 1,A-D 2,A-D-E 4,A-D-E-F

D(v): Distance (cost) from A to v.
p(v): nodes along path from A to v.

[Figure: network with nodes A, B, C, D, E, F and weighted links.]
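
The example can be reproduced with a compact Python sketch. Note two assumptions: the edge costs below are inferred from the table (for instance D(C) dropping from 5 to 4 to 3) and treated as symmetric, and the sketch uses a heap-based variant of Dijkstra's algorithm rather than the linear scan for the minimum described on the slides. The names `dijkstra` and `path` are illustrative.

```python
import heapq

# Assumed symmetric link costs of the six-node example network.
EDGES = [("A", "B", 2), ("A", "C", 5), ("A", "D", 1), ("B", "C", 3),
         ("B", "D", 2), ("C", "D", 3), ("C", "E", 1), ("C", "F", 5),
         ("D", "E", 1), ("E", "F", 2)]

GRAPH = {}
for u, v, cost in EDGES:
    GRAPH.setdefault(u, {})[v] = cost
    GRAPH.setdefault(v, {})[u] = cost


def dijkstra(graph, source):
    """Return (dist, prev): least costs and predecessors from source."""
    dist = {source: 0}
    prev = {}
    done = set()  # the set N of nodes whose least cost is final
    heap = [(0, source)]
    while heap:
        d, w = heapq.heappop(heap)
        if w in done:
            continue  # stale heap entry
        done.add(w)
        for v, cost in graph[w].items():
            new_cost = d + cost
            if v not in done and new_cost < dist.get(v, float("inf")):
                dist[v] = new_cost
                prev[v] = w
                heapq.heappush(heap, (new_cost, v))
    return dist, prev


def path(prev, source, target):
    """Rebuild the node sequence p(target) from the predecessor map."""
    nodes = [target]
    while nodes[-1] != source:
        nodes.append(prev[nodes[-1]])
    return "-".join(reversed(nodes))


dist, prev = dijkstra(GRAPH, "A")
for node in "BCDEF":
    print(node, dist[node], path(prev, "A", node))
```

The final costs and paths agree with the last row of the table.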

Dijkstra’s Algorithm: discussion2

Oscillations possibility:
 Suppose link costs are equal to the load carried on the link, or the delay experienced.
 Link costs are not symmetric: c(A,B) equals c(B,A) only if the load in both directions on the AB link is the same.
 Nodes B and D each originate a unit of traffic destined for A.
 Node C originates e units for A.

[Fig. a: Initial routing on a ring of nodes A, B, C, D.]


Discussion2 (cont.)

… oscillations possible:
 Algorithm is run: C determines (Fig. a) that the clockwise path to A has a cost of 1, while the counterclockwise path to A has a cost of 1 + e. Hence C’s least-cost path to A is now clockwise.
 Similarly, B determines that its new least-cost path to A is also clockwise, resulting in costs shown in Fig. b.

[Fig. a: Initial routing.]
[Fig. b: B, C find better path to A is clockwise.]


Discussion2 (cont.)

… oscillations possible:
 When algorithm is run next, nodes B, C, and D all detect a zero-cost path to A in the counterclockwise direction, and all route their traffic to the counterclockwise routes.
 The next time the LS algorithm is run, B, C, and D all then route their traffic to the clockwise routes.

[Fig. c: B, C, D find better path to A is counterclockwise.]
[Fig. d: B, C, D find better path to A is clockwise.]

Dijkstra’s Algorithm: discussion2

To prevent such oscillations:
 Solution 1: make link costs independent of the amount of traffic carried. This is
unacceptable, since one goal of routing is to avoid highly congested
(for example, high-delay) links.

 Solution 2: ensure that not all routers run the LS algorithm at the same time
(a reasonable solution).
 Even if routers run the LS algorithm with the same periodicity, the
execution instants of the algorithm need not be the same at
each node.
 Researchers have noted: Routers in the Internet can self-synchronize
among themselves. That is, even though they
initially execute the algorithm with the same period but at
different instants of time, the algorithm execution instants can
eventually become, and remain, synchronized at the routers.
 To avoid such self-synchronization: Introduce randomization into
the period between execution instants of the algorithm at each
node.


Comparison of the DV and the LS

 Distance vector:
 Each router sends its distance vector, but only to its neighbours
 The distance-vector contains the estimated distance to
all other nodes
 Older method.
 Link-state:
 Each router sends its link-state information to all others
 The link-state information contains the distance to the neighbours only
 The distance value to the neighbour (called link-state) is
accurate
 Recent method.


Chapter 4 Outline

4.7 IPv6
4.9 Mobility


Hierarchical Routing

The routing study thus far was idealized
 all routers identical
 network “flat”

… not true in practice

scale: with 200 million destinations (hosts):
 can’t store all dest’s in routing tables (memory limitation)!
 routing table exchange would leave no bandwidth for sending data packets!
 a DV algorithm iterating among such a large number of routers would never converge!

administrative autonomy:
 internet = network of networks
 each network admin may want to control routing in its own network

Hierarchical Routing

 aggregate routers into regions, “autonomous systems” (AS)
 routers in same AS run same routing protocol
 “intra-AS” routing protocol
 routers in different AS can run different intra-AS routing protocol

gateway routers
 special routers in AS
 run intra-AS routing protocol with all other routers in AS
 also responsible for routing to destinations outside AS
 run inter-AS routing protocol with other gateway routers


Routing in the Internet

 The Internet is split into Autonomous Systems
(AS’s)
 Examples of AS’s: Stanford (32), HP (71), MCI Worldcom (17373)
 Try: “MCI Worldcom” in http://ws.arin.net/whois/

 Within an AS, the administrator chooses an Interior
Gateway Protocol (IGP) (Intra AS)
 Examples of IGPs: RIP (rfc 1058), OSPF (rfc 1247).

 Between AS’s, the Internet uses an Exterior
Gateway Protocol (Inter ASs)
 AS’s today use the Border Gateway Protocol, BGP-4 (rfc 1771).


Intra-AS and Inter-AS routing
Gateways:
• perform inter-AS routing amongst themselves
• perform intra-AS routing with other routers in their AS
▪ Routers in an AS have information about routing paths within that AS.

[Figure: autonomous systems A, B, and C joined through gateway routers such as A.a, A.c, B.a, and C.b. Inset: inter/intra-AS routing in gateway A.c, where the intra-AS and inter-AS routing algorithms both feed the routing table, above data link (DL) and physical (PHL) interfaces to/from A.b, A.d, B.a and A.a.]

Intra-AS and Inter-AS routing

[Figure: Host1 in AS A sends to Host2 in AS B. Intra-AS routing within AS A, inter-AS routing between A and B, intra-AS routing within AS B.]


Forwarding Tables

 Forwarding table is configured by both the intra-AS and inter-AS routing algorithms.
 Intra-AS sets entries for internal destinations.
 Inter-AS and intra-AS together set entries for external destinations.


Inter-AS Tasks

 Suppose router in AS1 receives datagram destined outside of AS1:
 router should forward packet to gateway router, but which one?

AS1 must:
1. learn which dests are reachable through AS2, which through AS3
2. propagate this reachability info to all routers in AS1
Job of inter-AS routing!

[Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), and AS3 (3a, 3b) interconnected through gateway routers.]


Example: Setting forwarding table in router 1d

 Suppose AS1 learns (via inter-AS protocol) that subnet x
reachable via AS3 (gateway 1a) but not via AS2.
 Inter-AS protocol propagates reachability info to all internal
routers.
 router 1d determines from intra-AS routing info that its
interface I is on the least cost path to 1a.
 installs forwarding table entry (x,I)

[Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), and AS3 (3a, 3b); subnet x lies beyond AS3.]


Example: Choosing among multiple ASes

 Now suppose AS1 learns from inter-AS protocol that
subnet x is reachable from AS3 and from AS2.

 To configure forwarding table, router 1d must determine
towards which gateway it should forward packets for dest
x.
 this is also job of inter-AS routing protocol!

[Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), and AS3 (3a, 3b); subnet x is reachable through both AS3 and AS2.]


Example: Choosing among multiple ASes
 hot potato routing: send packet towards closest of two routers.

1. Learn from inter-AS protocol that subnet x is reachable via multiple gateways.
2. Use routing info from intra-AS protocol to determine costs of least-cost paths to each of the gateways.
3. Hot potato routing: choose the gateway that has the smallest least cost.
4. Determine from the forwarding table the interface I that leads to least-cost gateway. Enter (x,I) in forwarding table.
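
The hot potato steps can be sketched as a few lines of Python. Everything concrete here is hypothetical: the gateway names `gw_AS2` and `gw_AS3`, the intra-AS costs, and the interface names `I1` and `I2` are invented for the example, and `choose_gateway` is not part of any real routing software.

```python
def choose_gateway(reachable_via, cost_to_gateway, interface_to_gateway):
    """Pick the gateway with the smallest intra-AS least cost.

    Returns (gateway, interface leading toward that gateway).
    """
    gateway = min(reachable_via, key=lambda g: cost_to_gateway[g])
    return gateway, interface_to_gateway[gateway]


# Hypothetical view from one internal router:
# least-cost path costs learned from the intra-AS protocol ...
cost_to_gateway = {"gw_AS2": 4, "gw_AS3": 2}
# ... and the local interface on the least-cost path to each gateway.
interface_to_gateway = {"gw_AS2": "I2", "gw_AS3": "I1"}

forwarding_table = {}
# The inter-AS protocol reports subnet x reachable via both gateways.
gateway, interface = choose_gateway(
    ["gw_AS2", "gw_AS3"], cost_to_gateway, interface_to_gateway)
forwarding_table["x"] = interface
print(gateway, forwarding_table)
```

The choice looks only at the cost inside the local AS; what happens to the packet after it leaves the AS plays no role.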


Chapter 4 Outline

 4.5.1 Intra-AS routing: RIP and OSPF

 4.5.2 Inter-AS routing: BGP

4.6 What’s Inside a Router?
4.7 IPv6
4.9 Mobility

Routing in the Internet
(RFC 1812) Requirements for IP Version 4 Routers

 The Global Internet consists of Autonomous Systems
(AS) interconnected with each other:
 Stub AS: small corporation: one connection to other AS’s
 Multihomed AS: large corporation (no transit): multiple
connections to other AS’s
 Transit AS: provider, hooking many AS’s together

 Two-level routing:
 Intra-AS: administrator responsible for choice of routing
algorithm within network
 Inter-AS: unique standard for inter-AS routing.


TCP/IP protocol stack

mime
ftp http smtp telnet snmp tftp rtp dns …

Transmission Control Pr. (TCP) User Datagram Pr. (UDP)

… igmp icmp rip ospf bgp …
Internet Protocol (IP)

arp rarp

Ethernet, Wireless, token ring, FDDI, ATM, Frame relay, SNA, X25


Routing Protocols in the Internet-1
Transport layer: TCP, UDP

Network layer:
 Routing protocols
• path selection: RIP, OSPF, BGP, IGRP
 forwarding table
 Control protocols: ICMP, IGMP, …
 IP protocol
• addressing conventions
• datagram format
• packet handling conventions

Link layer
physical layer

ICMP: Internet Control Message Protocol, RFC792
IGMP: Internet Group Management Protocol, RFC 2236


Internet Routing Protocol
 Intra-AS: administrator responsible for choice of
routing algorithm within network
 Also known as Interior Gateway Protocols (IGP)
 Most common Intra-AS routing protocols:
 RIP: Routing Information Protocol (RFCs 1058, 2453)
– It is a distance vector protocol.
– Routing updates are exchanged between neighbors approximately every 30 sec.

 OSPF: Open Shortest Path First (RFC 2328) (Open Spec.)
– It is a link-state protocol that uses flooding of link information and a
Dijkstra least-cost path algorithm.
 IGRP: Interior Gateway Routing Protocol (Cisco proprietary)
– It is a distance vector protocol.
 Inter-AS: unique standard for inter-AS routing:
BGP (RFC 1771)


Border Gateway Protocol (BGP-4)

 BGP is not a link-state or distance-vector routing
protocol.
 Instead, BGP uses “Path vector”
 BGP advertises complete paths (a list of AS’s).
 Also called AS_PATH (this is the path vector)
 Example of path advertisement:
“The network 171.64/16 can be reached via the path {AS1, AS5, AS13}”.
 Paths with loops are detected locally and ignored.
 Local policies pick the preferred path among
options.
 When a link/router fails, the path is “withdrawn”.
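
Local loop detection amounts to checking whether a router's own AS number already appears in the advertised path. The following Python sketch is a strong simplification: AS numbers are plain integers, policy is ignored, and the function name `accept_advertisement` is made up for the example. Prepending the local AS number before passing the path on is how a path vector grows.

```python
def accept_advertisement(my_as, prefix, as_path):
    """Process one path advertisement at AS `my_as`.

    Returns None if the path already contains my_as (a loop), else the
    (prefix, path) pair this AS would advertise onward.
    """
    if my_as in as_path:
        return None  # loop detected locally: ignore the advertisement
    return (prefix, [my_as] + as_path)


# "The network 171.64/16 can be reached via the path {AS1, AS5, AS13}"
print(accept_advertisement(7, "171.64.0.0/16", [1, 5, 13]))
# The same advertisement arriving back at AS5 is ignored.
print(accept_advertisement(5, "171.64.0.0/16", [1, 5, 13]))
```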


Internet AS Hierarchy

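Inter-AS border (exterior gateway) routers

[Figure: autonomous systems A, B, and C; border routers such as A.a, A.c, B.a, and C.b connect the ASes, while the remaining routers are interior.]

Intra-AS (interior gateway) routers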

Why different Intra- and Inter-AS routing ?

Policy:
 Inter-AS: admin wants control over how its traffic
routed, who routes through its net.
 Intra-AS: single admin, so no policy decisions needed
Scale:
 hierarchical routing saves table size, reduced update
traffic
Performance:
 Intra-AS: can focus on performance
 Inter-AS: policy may dominate over performance


Chapter 4 outline
 4.4.1 IPv4 addressing
 4.4.2 Moving a datagram from source to destination
 4.4.3 IP address,
 4.4.4 Address depletion
 4.4.5 NAT: Network Address Translation
 4.4.6 Datagram format
 4.4.7 IP fragmentation
 4.4.8 IP Services
 4.4.9 ICMP: Internet Control Message Protocol
 4.4.10 DHCP: Dynamic Host Configuration Protocol

4.7 IPv6
4.9 Mobility

IP Addressing: Introduction

 IP address: 32-bit identifier for host, router interface
 interface: connection between host/router and physical link
 routers typically have multiple interfaces
 host may have multiple interfaces
 IP addresses associated with each interface

223.1.1.1 = 11011111 00000001 00000001 00000001
            223      1        1        1

[Figure: one router with interfaces 223.1.1.4, 223.1.2.9, and 223.1.3.27 joining three LANs; hosts 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.2.1, 223.1.2.2, 223.1.3.1, 223.1.3.2.]
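
The dotted-decimal notation is just the four bytes of the 32-bit value written in decimal. A short Python sketch of the conversion (the function names `to_binary` and `to_int` are chosen for the example):

```python
def to_binary(dotted):
    """Dotted-decimal IPv4 address as four 8-bit binary groups."""
    return " ".join(f"{int(octet):08b}" for octet in dotted.split("."))


def to_int(dotted):
    """Dotted-decimal IPv4 address as a single 32-bit integer."""
    value = 0
    for octet in dotted.split("."):
        value = (value << 8) | int(octet)
    return value


print(to_binary("223.1.1.1"))
print(to_int("223.1.1.1"))
```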


IP Addressing

 IP address:
 network part (high order bits)
 host part (low order bits)
 What’s a network? (from IP address perspective)
 device interfaces with same network part of IP address
 can physically reach each other without intervening router

[Figure: network consisting of 3 IP networks (LANs): 223.1.1.x, 223.1.2.x, and 223.1.3.x, joined by one router.]


IP Addressing

How to find the networks?
 Detach each interface from router, host
 create “islands” of isolated networks

Interconnected system consisting of six networks.

[Figure: three routers and six networks, with interface addresses 223.1.1.1 to 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.6, 223.1.3.1, 223.1.3.2, 223.1.3.27, 223.1.7.0, 223.1.7.1, 223.1.8.0, 223.1.8.1, 223.1.9.1, 223.1.9.2.]


Getting a datagram from source to dest.

forwarding table in A
Dest. Net.   Next Router   Nhops
223.1.1                    1
223.1.2      223.1.1.4     2
223.1.3      223.1.1.4     2

IP datagram: | misc fields | source IP addr | dest IP addr | data |

 datagram remains unchanged, as it travels source to destination
 addr fields of interest here

[Figure: hosts A (223.1.1.1), B, and E on three networks joined by a router with interfaces 223.1.1.4, 223.1.2.9, and 223.1.3.27.]



IP datagram: | misc fields | 223.1.1.1 | 223.1.1.3 | data |

Starting at A, send IP datagram addressed to B:
 look up net. address of B in forwarding table
 find B is on same net. as A
 link layer will send datagram directly to B inside link-layer frame
 B and A are directly connected



IP datagram: | misc fields | 223.1.1.1 | 223.1.2.2 | data |

Starting at A, dest. E:
 look up network address of E in forwarding table
 E on different network
 A, E not directly attached
 routing table: next hop router to E is 223.1.1.4
 link layer sends datagram to router 223.1.1.4 inside link-layer frame
 datagram arrives at 223.1.1.4
 continued…..


forwarding table in router
Dest. Net.   Router   Nhops   Interface
223.1.1      -        1       223.1.1.4
223.1.2      -        1       223.1.2.9
223.1.3      -        1       223.1.3.27

IP datagram: | misc fields | 223.1.1.1 | 223.1.2.2 | data |

Arriving at 223.1.1.4, destined for 223.1.2.2
 look up network address of E in router’s forwarding table
 E on same network as router’s interface 223.1.2.9
 router, E directly attached
 link layer sends datagram to 223.1.2.2 inside link-layer frame via interface 223.1.2.9
 datagram arrives at 223.1.2.2


IP Addresses: Classful
given notion of “network”, let’s re-examine IP addresses:
“classful” addressing (32-bit addresses):

class  leading bits  format               range
A      0             N.H.H.H              1.0.0.0 to 126.255.255.255
B      10            N.N.H.H              128.0.0.0 to 191.255.255.255
C      110           N.N.N.H              192.0.0.0 to 223.255.255.255
D      1110          multicast address    224.0.0.0 to 239.255.255.255
E      11110         experimentation      240.0.0.0 to 247.255.255.255

(N = network octet, H = host octet)
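
As a small illustration of the table above, the class can be read off the leading bits of the first octet. The function name address_class is an assumption of this sketch, and the "E" case follows the 11110 prefix given on the slide.

```python
def address_class(addr):
    """Return the classful address class from the leading bits of the first octet."""
    first = int(addr.split(".")[0])
    if first >> 7 == 0b0:
        return "A"      # note: first octets 0 and 127 also match, but are reserved
    if first >> 6 == 0b10:
        return "B"
    if first >> 5 == 0b110:
        return "C"
    if first >> 4 == 0b1110:
        return "D"
    if first >> 3 == 0b11110:
        return "E"
    return "unassigned"


for a in ("10.1.2.3", "137.138.28.228", "223.1.1.1", "224.0.0.1", "240.0.0.1"):
    print(a, address_class(a))
```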

Classful Summary
The Class System

Address  Application     Network  Host     Decimal Address  Number of  Number of
Class                    Bits     Bits     Range            Networks   Possible Hosts
Class A  Large Networks  8 bits   24 bits  1 - 126          126        16,777,214
Class B  Medium-sized    16 bits  16 bits  128 - 191        16,384     65,534
         Networks
Class C  Small Networks  24 bits  8 bits   192 - 223        2,097,152  254
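
The counts in this summary follow from the bit widths: the class prefix (1, 2 or 3 leading bits) is fixed, so the number of networks is 2^(network bits - prefix bits), and the number of hosts is 2^(host bits) - 2 because the all-0s and all-1s host values are reserved. A minimal check, with class_sizes as a made-up helper name:

```python
def class_sizes(network_bits, class_prefix_bits):
    """Return (networks, hosts per network) for a classful address class."""
    networks = 2 ** (network_bits - class_prefix_bits)
    hosts = 2 ** (32 - network_bits) - 2   # all-0s and all-1s host parts are reserved
    return networks, hosts


print("A", class_sizes(8, 1))    # 128 raw networks; 126 usable, since 0 and 127 are reserved
print("B", class_sizes(16, 2))
print("C", class_sizes(24, 3))
```

For class B this gives 2^14 = 16,384 networks of 65,534 hosts each.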


Private or Non-Routable addresses

 Some addresses are reserved for use on local networks that are
not connected to the Internet
 Routers do not consider these addresses to be valid Internet
addresses, and will not route a packet to any of them
 These addresses may be used on private networks (not directly
connected to the Internet).
 10.0.0.0/8 —» 10.0.0.0 to 10.255.255.255 (a single class A net)
 172.16.0.0/12 —» 172.16.0.0 to 172.31.255.255 (contiguous
class Bs)
 192.168.0.0/16 —» 192.168.0.0 to 192.168.255.255 (contiguous
class Cs)
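
A quick way to test whether an address falls in one of these three blocks, sketched with Python's standard ipaddress module. The names PRIVATE_BLOCKS and is_rfc1918 are chosen for this example only.

```python
import ipaddress

# The three private-use blocks listed above.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr):
    """True if addr lies inside one of the private-use blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in block for block in PRIVATE_BLOCKS)


for block in PRIVATE_BLOCKS:
    print(block, "covers", block[0], "to", block[-1])
print(is_rfc1918("10.0.0.1"), is_rfc1918("138.76.29.7"))
```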


Special Purpose IP Addresses

 Several addresses within the classes are
reserved for special use.
 0.0.0.0: source IP addr. used by a host just after boot,
before it knows its own address.
 network part of dest. addr. = 0: source and
destination are in the same network.
 dest. addr. = 255.255.255.255: broadcast in
sender’s network.
 host part of dest. addr. = all 1s: broadcast in
destination network.
 dest. addr. = 127.anything: loopback
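
These rules can be checked mechanically once the length of the network part is known. The sketch below assumes the caller supplies that length as prefix_len; the function name special_use and the returned labels are illustrative only.

```python
import ipaddress


def special_use(dest, prefix_len):
    """Classify a destination address using the special-purpose rules above.

    prefix_len is the assumed number of network bits (must be below 32).
    """
    ip = ipaddress.ip_address(dest)
    value = int(ip)
    host_bits = 32 - prefix_len
    host_mask = (1 << host_bits) - 1
    if value == 0:
        return "this host"
    if value == 0xFFFFFFFF:
        return "broadcast in sender's network"
    if ip.is_loopback:                     # 127.anything
        return "loopback"
    if value >> host_bits == 0:            # network part is all 0s
        return "host in this network"
    if host_bits > 0 and value & host_mask == host_mask:   # host part is all 1s
        return "broadcast in destination network"
    return "ordinary"


print(special_use("127.0.0.1", 8))
print(special_use("223.1.2.255", 24))
```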


Special Purpose Addresses-List
Address Block    Present Use                                    Reference
0.0.0.0/8        "This" Network                                 [RFC1700, page 4]
10.0.0.0/8       Private-Use Networks                           [RFC1918]
14.0.0.0/8       Public-Data Networks                           [RFC1700, page 181]
24.0.0.0/8       Cable Television Networks                      -
39.0.0.0/8       Reserved, subject to allocation                [RFC1797]
127.0.0.0/8      Loopback                                       [RFC1700, page 5]
128.0.0.0/16     Reserved but subject to allocation             -
169.254.0.0/16   Link Local                                     -
191.255.0.0/16   Reserved but subject to allocation             -
192.0.2.0/24     Test-Net                                       -
192.88.99.0/24   6to4 Relay Anycast                             [RFC3068]
198.18.0.0/15    Network Interconnect Device Benchmark Testing  [RFC2544]
224.0.0.0/4      Multicast                                      [RFC3171]
240.0.0.0/4      Reserved for Future Use                        [RFC1700]


Address depletion

 In 1991 IAB identified 3 dangers
 Running out of class B addresses
 Increase in nets has resulted in routing table explosion
 Increase in nets/hosts exhausting 32-bit address space

 Four strategies to address these dangers
 Creative address space allocation {RFC 2050}
 Private addresses {RFC 1918}, Network Address
Translation (NAT) {RFC 1631}
 Classless Inter-Domain Routing (CIDR) {RFC 1519}
 IP version 6 (IPv6) {RFC 1883}


Creative IP address allocation

 Class A addresses 64 – 127 reserved
 Handle on individual basis
 Class B only assigned given a demonstrated need
 Class C
 divided up into 8 blocks allocated to regional authorities
 208-223 remains unassigned and unallocated
 Three main registries handle assignments
 APNIC – Asia & Pacific www.apnic.net
 ARIN – N. & S. America, Caribbean & sub-Saharan Africa
www.arin.net
 RIPE – Europe and surrounding areas www.ripe.net


NAT: Network Address Translation-1

 Motivation: local network uses just one IP address
as far as outside world is concerned:
 no need to be allocated range of addresses
from ISP: just one IP address is used for all
devices
 can change addresses of devices in local
network without notifying outside world
 can change ISP without changing addresses of
devices in local network
 devices inside local net not explicitly
addressable, visible by outside world (a security
plus).


Private IP Network

 Private IP network is an IP network that is not
directly connected to the Internet.
 IP addresses in a private network can be assigned
arbitrarily.
 Not registered and not guaranteed to be globally unique.

 Generally, private networks use addresses from
the following reserved private address ranges
(non-routable addresses):
 10.0.0.0 – 10.255.255.255
 172.16.0.0 – 172.31.255.255
 192.168.0.0 – 192.168.255.255



Implementation: NAT router must:
 outgoing datagrams: replace (source IP address, port #)
of every outgoing datagram with (NAT IP address, new
port #)
. . . remote clients/servers will respond using (NAT IP
address, new port #) as destination addr.

 remember (in NAT translation table) every (source IP
address, port #) to (NAT IP address, new port #)
translation pair

 incoming datagrams: replace (NAT IP address, new port
#) in dest fields of every incoming datagram with
corresponding (source IP address, port #) stored in NAT
table
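
The three duties above (rewrite outgoing, remember the pair, rewrite incoming) fit in a small table-driven sketch. This is an illustration under assumptions, not how any particular NAT product works: the class name NatRouter, the sequential port allocation starting at 5001, and dropping unmatched incoming datagrams by returning None are all choices made for the example. Datagrams are reduced to (source, destination) pairs of (IP address, port).

```python
class NatRouter:
    """Toy NAT: maps (LAN IP, LAN port) to a port on one public IP address."""

    def __init__(self, public_ip, first_port=5001):
        self.public_ip = public_ip
        self.next_port = first_port   # assumed allocation policy: count upward
        self.lan_to_wan = {}          # (lan_ip, lan_port) -> wan_port
        self.wan_to_lan = {}          # wan_port -> (lan_ip, lan_port)

    def outgoing(self, src, dst):
        """Replace the source of an outgoing datagram, creating a table entry if needed."""
        if src not in self.lan_to_wan:
            self.lan_to_wan[src] = self.next_port
            self.wan_to_lan[self.next_port] = src
            self.next_port += 1
        return (self.public_ip, self.lan_to_wan[src]), dst

    def incoming(self, src, dst):
        """Replace the destination of an incoming datagram using the table."""
        ip, port = dst
        if ip != self.public_ip or port not in self.wan_to_lan:
            return None               # no translation entry: drop
        return src, self.wan_to_lan[port]


nat = NatRouter("138.76.29.7")
server = ("128.119.40.186", 80)
print(nat.outgoing(("10.0.0.1", 3345), server))
print(nat.incoming(server, ("138.76.29.7", 5001)))
```

The unmatched-port case is the reason hosts behind a NAT cannot be reached by connections initiated from outside unless a mapping already exists.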



[Figure: local network (e.g., home network) 10.0.0/24 with hosts 10.0.0.1, 10.0.0.2 and 10.0.0.3, connected to the rest of the Internet through a NAT router (10.0.0.4 on the LAN side, 138.76.29.7 on the Internet side)]

All datagrams leaving local network have same single source NAT IP public address: 138.76.29.7, different source port numbers

Datagrams with source or destination in this network have 10.0.0/24 address (private address)



NAT translation table
WAN side addr        LAN side addr
138.76.29.7, 5001    10.0.0.1, 3345
……                   ……

1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
   S: 10.0.0.1, 3345   D: 128.119.40.186, 80
2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
   S: 138.76.29.7, 5001   D: 128.119.40.186, 80
3: Reply arrives, dest. address: 138.76.29.7, 5001
   S: 128.119.40.186, 80   D: 138.76.29.7, 5001
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
   S: 128.119.40.186, 80   D: 10.0.0.1, 3345

[Figure: hosts 10.0.0.1, 10.0.0.2, 10.0.0.3 behind NAT router (10.0.0.4 inside, 138.76.29.7 outside), with steps 1 to 4 marked on the path]



 16-bit port-number field:
 2^16 = 65,536 simultaneous connections with a
single LAN-side address!

 NAT is controversial:
 routers should only process up to layer 3
 violates end-to-end argument
 NAT possibility must be taken into account by app
designers, e.g., P2P applications
 address shortage should instead be solved by IPv6


NAT & Applications

 IP address in application data:
 Applications that carry IP addresses in the
payload of the application data generally do not
work across a private-public network boundary.

 Some NAT devices inspect the payload of
widely used application-layer protocols and, if
an IP address is detected in the application-layer
header or the application payload,
translate the address according to the address
translation table.


Example: NAT & FTP (No NAT Device)

Public Network
FTP Server: Public Address 147.202.71.22
FTP Client: Public Address 207.3.18.98

Client -> Server: PORT 207.3.18.98, 1107
Server -> Client: 200 Port Command Successful
Client -> Server: RETR file1
Server -> Client: 150 Opening Data Connection
Server -> Client: Establish Data Connection

Client gives its IP address and port number for data connection.
Server starts data connection.

Example: NAT & FTP (NAT Device with FTP Support)

Public Network: FTP Server, Public Address 147.202.71.22
NAT Device with FTP Support: Public Address 207.3.18.98
Private Network: FTP Client, Private Address 10.0.1.1

PORT command in IP packet must be modified.

Client -> NAT: PORT 10.0.1.1, 1107
NAT -> Server: PORT 207.3.18.98, 1107
Server -> NAT -> Client: 200 Port Command Successful
Client -> NAT -> Server: RETR file1
Server -> NAT -> Client: 150 Opening Data Connection
Server -> NAT -> Client: Establish Data Connection
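
The payload rewrite performed by an FTP-aware NAT can be sketched as follows. The slides write the command as "PORT 10.0.1.1, 1107" for readability; on the wire, FTP encodes it as six comma-separated numbers, "PORT h1,h2,h3,h4,p1,p2", where the port is p1*256 + p2. The function names below are made up for this sketch, and a real device would have further work to do that is not shown (for example adjusting TCP sequence numbers when the rewritten command changes length).

```python
def port_of(fields):
    """Decode the port from the last two PORT fields (p1*256 + p2)."""
    return int(fields[4]) * 256 + int(fields[5])


def rewrite_port_command(line, private_ip, public_ip):
    """Replace the private address in an FTP PORT command with the public one.

    line is one control-channel command without its line terminator.
    Commands other than PORT, or PORT for another address, pass through unchanged.
    """
    verb, _, arg = line.strip().partition(" ")
    if verb.upper() != "PORT":
        return line
    fields = arg.split(",")
    if len(fields) != 6 or ".".join(fields[:4]) != private_ip:
        return line
    return "PORT " + ",".join(public_ip.split(".") + fields[4:])


cmd = "PORT 10,0,1,1,4,83"          # 10.0.1.1, port 4*256 + 83 = 1107
print(rewrite_port_command(cmd, "10.0.1.1", "207.3.18.98"))
print(rewrite_port_command("RETR file1", "10.0.1.1", "207.3.18.98"))
```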


NAT Traversal Problem

 Client wants to connect to server with address 10.0.0.1
 server address 10.0.0.1 local to LAN (client can’t use it as
destination addr)
 only one externally visible NATted address: 138.76.29.7
 Solution 1: statically configure NAT to forward incoming
connection requests at given port to server
 e.g., connection request at (138.76.29.7, port 80) always
forwarded to (10.0.0.1, port 1405)

[Figure: external client, NAT router (138.76.29.7 outside, 10.0.0.4 inside), server 10.0.0.1 on the LAN]

NAT traversal problem

 Solution 2: Universal Plug and Play (UPnP) Internet Gateway
Device (IGD) Protocol allows NATted host to:
 learn public IP address (138.76.29.7)
 add/remove port mappings (with lease times)
 i.e., automate static NAT port map configuration

[Figure: NATted host 10.0.0.1 talks IGD to the NAT router (10.0.0.4 inside, 138.76.29.7 outside)]


NAT traversal problem

 Solution 3: relaying (used in Skype)
 NATted client establishes connection to relay
 External client connects to relay
 relay bridges packets between the two connections

1. connection to relay initiated by NATted host (10.0.0.1)
2. connection to relay initiated by client
3. relaying established

[Figure: NATted host 10.0.0.1 behind NAT router (10.0.0.4 inside, 138.76.29.7 outside), external client, and relay]


IP addressing: CIDR

 Classful addressing:
 inefficient use of address space, address space exhaustion
 e.g., class B net allocated enough addresses for 65K hosts,
even if only 2K hosts in that network
 CIDR: Classless Inter-Domain Routing (RFC 1519)
 network portion of address of arbitrary length
 address format: a.b.c.d/x, where x is # bits in network
portion of address

network part (23 bits)       | host part (9 bits)
11001000 00010111 0001000    | 0 00000000
200.23.16.0/23
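
To see the a.b.c.d/x split concretely, the example prefix can be taken apart with Python's standard ipaddress module. The helper name split_prefix is an assumption of this sketch.

```python
import ipaddress


def split_prefix(cidr):
    """Return (network bits, host bits, number of addresses) for a.b.c.d/x."""
    net = ipaddress.ip_network(cidr)
    bits = format(int(net.network_address), "032b")   # 32-bit binary string
    return bits[:net.prefixlen], bits[net.prefixlen:], net.num_addresses


network_bits, host_bits, size = split_prefix("200.23.16.0/23")
print(network_bits, "|", host_bits, "|", size, "addresses")

# Membership test: any address sharing the 23 network bits belongs to the block.
block = ipaddress.ip_network("200.23.16.0/23")
print(ipaddress.ip_address("200.23.17.255") in block)
print(ipaddress.ip_address("200.23.18.0") in block)
```

With x = 23 the block holds 2^9 = 512 addresses, a size that classful addressing could only approximate with two class C nets or a mostly empty class B.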

Subnet Mask-1

 A subnet mask is applied to the IP address (bitwise AND) to
separate the network/subnet bits from the host bits,
 e.g. if the host is 137.138.28.228 and the subnet mask
is 255.255.255.0, then the left-hand 24 bits (137.138.28) identify
the subnet and the right-hand 8 bits (228) are for the
host (255 is decimal for all bits set in an octet)
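
The mask operation in this example is a bitwise AND. A minimal sketch, with apply_mask as a made-up helper name:

```python
import ipaddress


def apply_mask(addr, mask):
    """Return (subnet address, host number) for a dotted address and mask."""
    a = int(ipaddress.ip_address(addr))
    m = int(ipaddress.ip_address(mask))
    subnet = ipaddress.ip_address(a & m)     # keep the bits where the mask is 1
    host = a & ~m & 0xFFFFFFFF               # keep the bits where the mask is 0
    return str(subnet), host


print(apply_mask("137.138.28.228", "255.255.255.0"))
```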


Bit Masks and Subnet Masks
In a production environment this prefix typically
varies in length from 8 to 30 bits

/8  = 255.0.0.0      /16 = 255.255.0.0      /24 = 255.255.255.0
/9  = 255.128.0.0    /17 = 255.255.128.0    /25 = 255.255.255.128
/10 = 255.192.0.0    /18 = 255.255.192.0    /26 = 255.255.255.192
/11 = 255.224.0.0    /19 = 255.255.224.0    /27 = 255.255.255.224
/12 = 255.240.0.0    /20 = 255.255.240.0    /28 = 255.255.255.240
/13 = 255.248.0.0    /21 = 255.255.248.0    /29 = 255.255.255.248
/14 = 255.252.0.0    /22 = 255.255.252.0    /30 = 255.255.255.252
/15 = 255.254.0.0    /23 = 255.255.254.0    /31 = not usable
                                            /32 = not usable

/30 yields two usable hosts and is used for WAN connections
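
The table can be regenerated from the prefix length alone: set the top x bits of a 32-bit word and print it octet by octet. The sketch below also counts usable hosts the way the slide does (total addresses minus the network and broadcast addresses); prefix_to_mask and usable_hosts are names invented for this example.

```python
def prefix_to_mask(prefix_len):
    """Dotted-decimal subnet mask for a prefix length between 0 and 32."""
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return ".".join(str((mask >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def usable_hosts(prefix_len):
    """Host addresses left after reserving the all-0s and all-1s host values."""
    return max(2 ** (32 - prefix_len) - 2, 0)


for x in (8, 23, 24, 30, 31):
    print("/%d = %s, %d usable hosts" % (x, prefix_to_mask(x), usable_hosts(x)))
```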

