3. Interdomain routing
l Goals
l Allow IP packets to be transmitted along the best
path towards their destination through
several transit domains while taking into
account the routing policies of each
domain without knowing their detailed
topology
l From an interdomain viewpoint, best path
often means cheapest path
u Each domain is free to specify inside its
routing policy the domains for which it
agrees to provide a transit service and
the method it uses to select the best path
to reach each destination
4. Routing policies
l A domain specifies its routing policy by
defining on each BGP router two sets of filters
for each peer
l Import filter
u Specifies which routes can be accepted by
the router among all the received routes
from a given peer
l Export filter
u Specifies which routes can be advertised by
the router to a given peer
5. Agenda
• Routing in IP networks
• Interdomain routing
• Peering links
• BGP basics
• BGP convergence
6. Border Gateway Protocol
l Path vector protocol
u BGP router advertises its best route to each
destination
AS1 AS2
AS4
2001:db8:1/48
AS5
l prefix: 2001:db8:1/48
l ASPath: AS1
l prefix: 2001:db8:1/48
l ASPath: AS2:AS4:AS1
l prefix: 2001:db8:1/48
l ASPath: AS4:AS1
l prefix: 2001:db8:1/48
l ASPath: AS1
l ... with incremental updates
7. BGP : Principles of operation
l Principles
l BGP relies on the
incremental exchange of path vectors
BGP session established
over
TCP connection between
peers
Each peer sends all its
active routes
As long as the BGP session
remains up
Incrementally update BGP routing
tables
AS3
R1
R2
AS4
BGP
session
BGP Msgs
8. BGP basics
(2)
l 2 types of BGP messages
l UPDATE (path vector)
u advertises a route towards one prefix
u Destination address/prefix
u Interdomain path (AS-Path)
u Nexthop
l WITHDRAW
u a previously announced route is not
reachable anymore
u Unreachable destination address/prefix
9. BGP router
BGP Loc-RIB
Peer[N]
All
BGP Msgs
from Peer[N] BGP Msgs
Peer[1]
Import filter
Attribute
manipulation
Peer[N]
Peer[1]
Export filter
Attribute
manipulation
acceptable
routes
BGP Decision
Process
BGP Routing Information Base
Contains all the acceptable routes
learned from all Peers + internal routes
l BGP decision process selects
the best route towards each destination
BGP Msgs
from Peer[1]
to Peer[N]
BGP Msgs
to Peer[1]
Import filter(Peer[i])
Determines which BGP Msgs
are acceptable from Peer[i]
Export filter(Peer[i])
Determines which
routes can be sent to Peer[i]
One best
route to each
destination
BGP Adj-RIB-In
BGP Adj-RIB-Out
10. Example
AS20
R2
AS30
AS10
UPDATE
lprefix: 2001:db8:12/48,
lNextHop:R1
lASPath: AS10
UPDATE
lprefix: 2001:db8:12/48,
lNextHop:R2
lASPath: AS20:AS10
R1 R3
2001:db8:12/48
BGP
R4
AS40
BGP
BGP
UPDATE
lprefix: 2001:db8:12/48,
lNextHop:R1
lASPath: AS10
UPDATE
lprefix: 2001:db8:12/48,
lNextHop:R4
lASPath: AS40:AS10
l What happens if link AS10-AS20 goes down ?
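The question above can be explored with a small sketch. It assumes, as in the figure, that AS20 has learned 2001:db8:12/48 both from AS10 (ASPath AS10) and from AS40 (ASPath AS40:AS10), and that AS20 simply prefers the shortest ASPath. The names `rib_as20`, `shortest_as_path` and `advertised_to_as30` are illustrative only and are not part of BGP.

```python
def shortest_as_path(routes):
    """Return the route with the shortest ASPath, or None if there is none."""
    return min(routes, key=len) if routes else None


def advertised_to_as30(rib):
    """ASPath that AS20 would announce to AS30: its own AS number
    prepended to the ASPath of its current best route."""
    best = shortest_as_path(list(rib.values()))
    return ["AS20"] + best if best is not None else None


# Routes known by AS20, indexed by the peer they were learned from
# (hypothetical data structure, filled in from the figure).
rib_as20 = {"AS10": ["AS10"], "AS40": ["AS40", "AS10"]}

before = advertised_to_as30(rib_as20)

# Link AS10-AS20 fails: the route learned over that link is removed.
del rib_as20["AS10"]
after = advertised_to_as30(rib_as20)

print("before failure:", before)
print("after failure: ", after)
```

Under these assumptions AS20 falls back to the route learned from AS40 and has to send a new UPDATE to AS30, because its best route has changed.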
11. How to prefer some
routes over others ?
RA RB
R1
Backup: 2Mbps
Primary: 34Mbps
AS1
AS2
12. How to prefer some
routes over others
• Limitations
RA
R1 R2
R3
RB
Cheap
Expensive
AS1
AS2
AS3
AS4
R5 AS5
13. BGP router
BGP RIB
Peer[N]
Peer[1]
Import filter
Attribute
manipulation
Peer[N]
Peer[1]
Export filter
Attribute
manipulation
BGP Msgs
from Peer[N]
BGP Msgs
from Peer[1]
BGP Msgs
to Peer[N]
BGP Msgs
to Peer[1]
All
acceptable
routes
BGP Decision
Process
One best
route to each
destination
Import filter
l Selection of acceptable routes
l Addition of local-pref attribute
inside received BGP Msg
l Normal quality route : local-pref=100
l Better than normal route : local-pref=200
l Worse than normal route : local-pref=50
Simplified BGP Decision Process
l Select routes with highest
local-pref
l If there are several such routes,
choose the routes with the
shortest ASPath
l If there are still several routes,
apply a tie-breaking rule
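The simplified decision process listed above can be sketched as follows. The `Route` type and its field names are assumptions made for this example, and the final tie-break (lowest next-hop name) is an arbitrary placeholder, since the slide does not say which tie-breaking rule is used.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Route(NamedTuple):
    prefix: str
    as_path: Tuple[str, ...]
    next_hop: str
    local_pref: int = 100      # value set by the import filter


def decision_process(routes: Sequence[Route]) -> Optional[Route]:
    """Select one best route among the acceptable routes for a prefix."""
    if not routes:
        return None
    # Step 1: keep the routes with the highest local-pref.
    highest = max(r.local_pref for r in routes)
    candidates = [r for r in routes if r.local_pref == highest]
    # Step 2: among those, keep the routes with the shortest ASPath.
    shortest = min(len(r.as_path) for r in candidates)
    candidates = [r for r in candidates if len(r.as_path) == shortest]
    # Step 3: tie-break (assumption for this sketch: lowest next-hop name).
    return min(candidates, key=lambda r: r.next_hop)


routes = [
    Route("2001:db8:1/48", ("AS1",), "R1", local_pref=100),
    Route("2001:db8:1/48", ("AS2", "AS4", "AS1"), "R2", local_pref=200),
]
best = decision_process(routes)
print(best)
```

Note that the route through R2 is selected although its ASPath is longer: local-pref is compared before the ASPath length.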
14. BGP session
• Session establishment
def initialize_BGP_session(RemoteAS, RemoteIP):
    # Initialize and start BGP session
    # Send BGP OPEN Message to RemoteIP on port 179
    # Follow BGP state machine
    # Advertise local routes and routes learned from peers
    for d in BGPLocRIB:
        B = build_BGP_Update(d)
        S = apply_export_filter(RemoteAS, B)
        if S is not None:
            send_UPDATE(S, RemoteAS, RemoteIP)
    # Entire RIB has been sent
    # New UPDATEs will be sent to reflect local or distant
    # changes in routers
15. Simple export filter
def apply_export_filter(RemoteAS, BGPMsg):
    # Check if RemoteAS already received the route
    if RemoteAS in BGPMsg.ASPath:
        BGPMsg = None
    # Many additional export policies can be configured:
    # Accept or refuse the BGPMsg
    # Modify selected attributes inside BGPMsg
    return BGPMsg
16. Simple import filter
def apply_import_filter(RemoteAS, BGPMsg):
    # Refuse routes that already went through this AS (loop prevention)
    if MyAS in BGPMsg.ASPath:
        BGPMsg = None
    # Many additional import policies can be configured:
    # Accept or refuse the BGPMsg
    # Modify selected attributes inside BGPMsg
    return BGPMsg
17. Processing UPDATE
def Recvd_BGPMsg(Msg, RemoteAS, RemoteIP):
    B = apply_import_filter(RemoteAS, Msg)
    if B is None:  # Msg not acceptable
        return
    if IsUPDATE(Msg):
        Old_Route = BestRoute(Msg.prefix)
        Insert_in_RIB(Msg)
        Run_Decision_Process(RIB)
        if BestRoute(Msg.prefix) != Old_Route:
            # best route changed
            B = build_BGP_Message(Msg.prefix)
            S = apply_export_filter(RemoteAS, B)
            if S is not None:  # announce best route
                send_UPDATE(S, RemoteAS, RemoteIP)
            elif Old_Route is not None:
                send_WITHDRAW(Msg.prefix, RemoteAS, RemoteIP)
18. Processing WITHDRAW
    else:  # Msg is WITHDRAW (continuation of Recvd_BGPMsg)
        Old_Route = BestRoute(Msg.prefix)
        Remove_from_RIB(Msg)
        Run_Decision_Process(RIB)
        if BestRoute(Msg.prefix) != Old_Route:
            # best route changed
            B = build_BGP_Message(Msg.prefix)
            S = apply_export_filter(RemoteAS, B)
            if S is not None:  # still one best route towards Msg.prefix
                send_UPDATE(S, RemoteAS, RemoteIP)
            elif Old_Route is not None:  # no best route anymore
                send_WITHDRAW(Msg.prefix, RemoteAS, RemoteIP)
19. How to prefer some
routes over others ? (3)
RA RB
R1
Backup: 2Mbps
Primary: 34Mbps
AS1
AS2
RPSL-like policy for AS1
aut-num: AS1
import: from AS2 RA at R1 set localpref=100;
from AS2 RB at R1 set localpref=200;
accept ANY
export: to AS2 RA at R1 announce AS1
to AS2 RB at R1 announce AS1
RPSL-like policy for AS2
aut-num: AS2
import: from AS1 R1 at RA set localpref=100;
from AS1 R1 at RB set localpref=200;
accept AS1
export: to AS1 R1 at RA announce ANY
to AS1 R1 at RB announce ANY
20. How to prefer some
routes over others ? (4)
RA
R1 R2
R3
RB
Cheap
Expensive
AS1
AS2
AS3
AS4
R5 AS5
RPSL policy for AS1
aut-num: AS1
import: from AS2 RA at R1 set localpref=100;
from AS4 R2 at R1 set localpref=200;
accept ANY
export: to AS2 RA at R1 announce AS1
to AS4 R2 at R1 announce AS1
u AS1 will prefer to send over cheap link
u But the flow of the packets destined to
AS1 will depend on the routing policy of
the other domains
22. Limitations of local-pref
l In theory
u Each domain is free to define its order of
preference for the routes learned from
external peers
2001:db8:1/48
AS1
Preferred paths for AS4
Preferred paths for AS3
AS3 AS4
1. AS3:AS1
2. AS1
1. AS4:AS1
2. AS1
u How to reach 2001:db8:1/48 from AS3 and AS4 ?
23. Limitations of local-pref
l AS1 sends its UPDATE messages ...
2001:db8:1/48
AS1
lP: 2001:db8:1/48
lASPath: AS1
AS3 AS4
UPDATE
Preferred paths for AS3
1. AS4:AS1
2. AS1
Routing table for AS3
2001:db8:1/48 ASPath: AS1 (best)
Preferred paths for AS4
1. AS3:AS1
2. AS1
UPDATE
lP: 2001:db8:1/48
lASPath: AS1
Routing table for AS4
2001:db8:1/48 ASPath: AS1 (best)
24. Limitations of local-pref
l First possibility
l AS3 sends its UPDATE first...
AS1
AS3 AS4
Preferred paths for AS3
1. AS4:AS1
2. AS1
2001:db8:1/48
Routing table for AS3
2001:db8:1/48 ASPath: AS1 (best) UPDATE
lP: 2001:db8:1/48
lASPath: AS3:AS1
Preferred paths for AS4
1. AS3:AS1
2. AS1
Routing table for AS4
2001:db8:1/48 ASPath: AS1
2001:db8:1/48 ASPath:AS3:AS1 (best)
u Stable route assignment
25. Limitations of local-pref
l AS4 sends its UPDATE first...
l2001:db8:1/48
AS1
Preferred paths for AS4
1. AS3:AS1
2. AS1
AS3 AS4
Routing table for AS4
2001:db8:1/48 ASPath: AS1 (best)
Preferred paths for AS3
1. AS4:AS1
2. AS1
UPDATE
lPrefix: 2001:db8:1/48
lASPath: AS4:AS1
Routing table for AS3
2001:db8:1/48 ASPath: AS1
2001:db8:1/48 ASPath: AS4:AS1 (best)
u Another (but different) stable route assignment
26. Limitations of local-pref
l AS3 and AS4 send their UPDATE together...
AS1
AS3 AS4
Preferred paths for AS3
1. AS4:AS1
2. AS1
2001:db8:1/48
UPDATE
lP: 2001:db8:1/48
lASPath: AS3:AS1
Preferred paths for AS4
1. AS3:AS1
2. AS1
UPDATE
lP: 2001:db8:1/48
ASPath: AS4:AS1
u AS3 prefers indirect path -> withdraw
u AS4 prefers indirect path -> withdraw
27. Limitations of local-pref
l AS3 and AS4 send their UPDATE together...
AS1
AS3 AS4
Preferred paths for AS3
1. AS4:AS1
2. AS1
2001:db8:1/48
Preferred paths for AS4
1. AS3:AS1
2. AS1
WITHDRAW
lP: 2001:db8:1/48
WITHDRAW
lP: 2001:db8:1/48
u AS3 : indirect route is not available anymore
u AS3 will reannounce its direct route...
u AS4 : indirect route is not available anymore
u AS4 will reannounce its direct route...
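The three scenarios above can be replayed with a small simulation. It is only a sketch under simplifying assumptions: each AS stores a single selected path, an AS never uses a path that already contains its own AS number, and "sending UPDATEs together" is modelled as both ASes choosing at the same instant from the other's previous choice. The names `PREFS`, `choose` and `simultaneous_round` are made up for this example.

```python
# Order of preference of each AS, taken from the slides.
PREFS = {
    "AS3": [("AS4", "AS1"), ("AS1",)],
    "AS4": [("AS3", "AS1"), ("AS1",)],
}
NEIGHBOUR = {"AS3": "AS4", "AS4": "AS3"}


def choose(asn, state):
    """Preferred path of `asn`, given the paths currently selected in `state`."""
    available = {("AS1",)}                 # direct route, always announced by AS1
    other = NEIGHBOUR[asn]
    path = state[other]
    if asn not in path:                    # a path through `asn` itself is unusable
        available.add((other,) + path)
    return next(p for p in PREFS[asn] if p in available)


def simultaneous_round(state):
    """Both ASes react at the same time to each other's last announcement."""
    return {asn: choose(asn, state) for asn in state}


start = {"AS3": ("AS1",), "AS4": ("AS1",)}

# AS3 and AS4 send their UPDATE together: the selection oscillates.
round1 = simultaneous_round(start)
round2 = simultaneous_round(round1)

# AS3 sends its UPDATE first: AS4 reacts, then AS3 has nothing better to pick.
sequential = dict(start)
sequential["AS4"] = choose("AS4", sequential)
sequential["AS3"] = choose("AS3", sequential)

print(round1)
print(round2)
print(sequential)
```

In this model the simultaneous case returns to its starting point after two rounds and would repeat forever, whereas the sequential case settles on a stable assignment that no further round changes.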
28. More limitations
local pref
l Unfortunately, interdomain routing may not
converge at all in some cases...
AS1
AS0
Preferred paths for AS1
1. AS3:AS0
2. other paths
Preferred paths for AS4
1. AS1:AS0
2. other paths
AS3 AS4
Preferred paths for AS3
1. AS4:AS0
2. other paths
u How to reach a destination inside AS0 in
this case ?
29. local-pref and
economical relationships
l In practice, local-pref is often combined
with filters to enforce economical
relationships
Prov1 Prov2
$ $
AS1
Peer1
Peer2
Peer3
Peer4
Cust1 Cust2
$ Customer-provider
$
Shared-cost
$
Local-pref values used by AS1
> 1000 for the routes received from a Customer
500 – 999 for the routes learned from a Peer
< 500 for the routes learned from a Provider
30. local-pref
l Which route will be used by AS1 to reach AS5 ?
AS1
$
$
AS4
AS2
AS3
Shared-cost
$
AS5 $ Customer-provider
$
l and how will AS5 reach AS1 ?
$
$
AS8
AS6
AS7
$
$
Internet paths are often asymmetrical
31. Internet 1990s
• NSFNet
• American backbone
• AUP : no commercial
traffic
• Some regional
networks
• US regions
• national networks in
Europe
• Universities/research
labs
• connected to regional
42. Ethernet/802.3
l Most widely used LAN
l First developed by Digital, Intel and Xerox
l Standardised by IEEE and ISO
• Medium Access Control
• CSMA/CD with exponential backoff
• Bandwidth: 10 Mbps
• Two-way delay
• 51.2 microsec on Ethernet/802.3
• => minimum frame size : 512 bits
• Cabling
• 10Base5 : (thick) coaxial cable
44. Ethernet Addresses
l Each Ethernet adapter has a unique
address
l Ethernet Address format
l 48 bits addresses
u Source Address
24 bits OUI
(adapter manufacturer)
00
u Destination address
24 bits
(identifier of adapter)
u If high order bit is 0, host unicast
address
• If high order bit is 1, host multicast address
45. Ethernet Frames
l DIX Format
• proposed by Digital, Intel and Xerox
Preamble
[8 bytes]
Destination
address
Source address
Type
[2 bytes]
Data
[46-1500 bytes]
CRC [32 bits]
Used to mark the beginning of the frame
Allows the receiver to synchronise its
clock to the sender’s clock
Indication of the type of packet contained
inside the frame
Upper layer protocol must ensure that
the payload of the Ethernet frame is
at least 46 bytes and at most 1500 bytes
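The 46-byte lower bound follows from the 51.2 microsec two-way delay and the 10 Mbps bandwidth quoted earlier for Ethernet/802.3. The arithmetic can be checked with the short sketch below, which uses integer nanoseconds to avoid rounding and assumes the DIX layout shown above (two 48-bit addresses, 2-byte Type, 32-bit CRC, preamble not counted in the frame size).

```python
BANDWIDTH_BPS = 10_000_000        # 10 Mbps
TWO_WAY_DELAY_NS = 51_200         # 51.2 microseconds, in nanoseconds

# Number of bits a sender emits during one two-way delay: a frame must last
# at least this long so that a collision is detected while still sending.
min_frame_bits = BANDWIDTH_BPS * TWO_WAY_DELAY_NS // 1_000_000_000
min_frame_bytes = min_frame_bits // 8

# Destination address + source address + Type + CRC, in bytes.
header_and_crc_bytes = 6 + 6 + 2 + 4
min_payload_bytes = min_frame_bytes - header_and_crc_bytes

print(min_frame_bits, min_frame_bytes, min_payload_bytes)
```

This yields a minimum frame of 512 bits (64 bytes), which leaves 46 bytes for the Data field.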
46. Ethernet
Service
l An Ethernet network provides a
connectionless unreliable service
l Transmission modes
l unicast, multicast, broadcast
l Even if in theory the Ethernet service is
unreliable, a good Ethernet network should
l deliver frames to their destination with a very
high probability of delivery
l not reorder the transmitted frames
• reordering is obviously impossible on a
bus
47. Ethernet with
structured cabling
• How to perform CSMA/CD in a star-shaped
network ?
Collision domain : set of stations that could be in collision
48. Ethernet hub
l A hub is a relay operating in the physical
layer
Datalink Datalink
Physical
Physical
Host A Hub Host B
49. Larger Ethernet
Hub
l With hubs ?
l Interconnect hubs together
Hub
Issue : maximum 51.2 microsec
delay between any pair of stations
Collision domain : entire network
Hub
Hub
50. Ethernet Switch
• Ethernet switch
• understands MAC addresses and filters
frames based on their addresses
Address Port
A West
B South
C East
Eth : A
Eth : B
Eth : C
Src:A Dst:B
51. Ethernet switch
l A switch is a relay that operates in the
datalink layer
Network Network
Datalink
Physical Phys. Phys.
Datalink
Physical
Host A Switch Host B
52. Frame processing
Arrival of frame F on port P
src=F.Source_Address;
dst=F.Destination_Address;
UpdateTable(src, P); // src heard on port P
if ((dst==broadcast) || (dst is multicast))
{
  for(Port p!=P) // forward on all ports except P
    ForwardFrame(F,p);
}
else
{
  if(dst isin AddressPortTable)
  { ForwardFrame(F,AddressPortTable(dst)); } // known destination
  else
  {
    for(Port p!=P) // unknown destination: forward on all ports except P
      ForwardFrame(F,p);
  }
}
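The frame processing described above can also be written as a small runnable sketch. The `LearningSwitch` class, the port names and the MAC addresses are hypothetical; the check that a frame is never sent back on its arrival port is an addition that is not in the pseudocode.

```python
def is_group_address(address):
    """True for multicast and broadcast addresses: the first bit sent on the
    wire, i.e. the 0x01 bit of the first byte, is set."""
    return (int(address.split(":")[0], 16) & 1) == 1


class LearningSwitch:
    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}                      # address -> port

    def process(self, src, dst, in_port):
        """Return the list of ports on which the frame is forwarded."""
        self.table[src] = in_port            # src heard on in_port
        others = [p for p in self.ports if p != in_port]
        if is_group_address(dst):
            return others                    # broadcast/multicast: all other ports
        out = self.table.get(dst)
        if out is None:
            return others                    # unknown destination: all other ports
        # Addition to the pseudocode: do not send a frame back where it came from.
        return [] if out == in_port else [out]


A, B = "00:00:00:00:00:0a", "00:00:00:00:00:0b"   # made-up unicast addresses
switch = LearningSwitch(["West", "South", "East"])

first = switch.process(A, B, "West")     # B not yet known
reply = switch.process(B, A, "South")    # A was learned on West
again = switch.process(A, B, "West")     # B was learned on South
print(first, reply, again)
```

The first frame is sent on every port except West because B is unknown; once both stations have transmitted, their frames use a single port.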
53. Network redundancy
• How to design networks that survive link
and node failures ?
• Add redundant switches
Address Port
A West
B South
C East
Eth : A
Eth : B
Eth : C
Src:A Dst:C
Address Port
A North
B South
C East
Address Port
A West
B South
C North
54. Network redundancy
(2)
Address Port
Eth : A
Eth : B
Eth : C
Src:A Dst:C
Address Port
Address Port
55. Spanning tree
• Each switch has a unique identifier
• The switch with the lowest id is the root
• Disable all links that do not belong to the
spanning tree
Switch 1
Switch 7
Switch 9
Switch 2
Switch 22
Switch 44
56. Building the spanning tree
l 802.1d protocol
l 802.1d uses Bridge PDUs (BPDUs)
containing
u Root ID : identifier of the current root
switch
u Cost : Cost of the shortest path between
the switch transmitting the BPDU and the
root switch
u Transmitting ID : identifier of the switch
that transmits the BPDU
l The BPDUs are sent by switches over their
attached LANs as multicast frames but
they are never forwarded
• switches that implement 802.1d listen to a
57. Ordering of BPDUs
l BPDUs can be strictly ordered
l BPDU1 [R=R1,C=C1,T=T1] is better than
BPDU2 [R=R2,C=C2,T=T2] if
u R1<R2, or
u R1=R2 and C1<C2, or
u R1=R2 and C1=C2 and T1<T2
l Example
BPDU1           BPDU2
R1  C1  T1      R2  C2  T2
29  15  35      31  12  32
35  80  39      35  80  40
35  15  80      35  18  38
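The ordering above is a lexicographic comparison on (R, C, T), which maps directly onto tuple comparison. The sketch below checks the three example rows; the `BPDU` type and its field names are chosen for this example only.

```python
from typing import NamedTuple


class BPDU(NamedTuple):
    root: int          # R : identifier of the current root
    cost: int          # C : cost of the shortest path to the root
    transmitter: int   # T : identifier of the transmitting switch


def is_better(a: BPDU, b: BPDU) -> bool:
    """a is better than b if it is smaller on R, then on C, then on T."""
    return a < b       # named tuples compare field by field, left to right


examples = [
    (BPDU(29, 15, 35), BPDU(31, 12, 32)),   # decided by R
    (BPDU(35, 80, 39), BPDU(35, 80, 40)),   # decided by T
    (BPDU(35, 15, 80), BPDU(35, 18, 38)),   # decided by C
]
results = [is_better(b1, b2) for b1, b2 in examples]
print(results)
```

In each of the three rows BPDU1 is the better BPDU.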
58. 802.1d port states
l 802.1d port state based on received BPDUs
l Root port
u port on which best 802.1d BPDU received
l Designated port
u a port is designated if the switch’s BPDU
is better than the best BPDU received on
this port
u Switch’s BPDU is
u current root, cost to root, switch identifier
l Blocked port (only receives 802.1d
BPDUs)
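The three port states can be derived from the best BPDU received on each port, as in the sketch below. It is a simplification and not the full 802.1d algorithm: every link is assumed to cost 1, port identifiers are ignored, and the function and parameter names are invented for this example.

```python
from typing import Dict, NamedTuple, Optional


class BPDU(NamedTuple):
    root: int
    cost: int
    transmitter: int


def port_states(switch_id: int,
                received: Dict[str, Optional[BPDU]],
                link_cost: int = 1) -> Dict[str, str]:
    """Map each port to 'root', 'designated' or 'blocked'.
    `received` holds the best BPDU heard on each port (None if nothing heard)."""
    heard = {p: b for p, b in received.items() if b is not None}
    own = BPDU(switch_id, 0, switch_id)      # BPDU of a switch that believes it is root
    root_port = None
    if heard:
        candidate = min(heard, key=lambda p: heard[p])
        if heard[candidate].root < switch_id:
            # Another switch is a better root: adopt it through this port.
            root_port = candidate
            best = heard[candidate]
            own = BPDU(best.root, best.cost + link_cost, switch_id)
    states = {}
    for port, bpdu in received.items():
        if port == root_port:
            states[port] = "root"
        elif bpdu is None or own < bpdu:
            states[port] = "designated"      # the switch's BPDU is the best on this port
        else:
            states[port] = "blocked"
    return states


# Hypothetical situation for switch 7, with switch 1 as root.
states = port_states(7, {
    "north": BPDU(1, 0, 1),
    "east": BPDU(1, 1, 9),
    "south": BPDU(1, 1, 2),
    "west": None,
})
print(states)
```

Here switch 7 reaches the root through its north port, stays designated towards switch 9 (which has a larger identifier at the same cost) and blocks the port facing switch 2.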
59. Port states and activity
Port state    Receive BPDUs    Transmit BPDUs
Blocked       yes              no
Root          yes              no
Designated    yes              yes

Port activity    Learn Addresses    Forward Data Frames
Inactive         no                 no
Active           yes                yes
Editor's Notes
RFC 2622 Routing Policy Specification Language (RPSL). C. Alaettinoglu, C.
Villamizar, E. Gerich, D. Kessens, D. Meyer, T. Bates, D. Karrenberg,
M. Terpstra. June 1999.
RFC 2650 Using RPSL in Practice. D. Meyer, J. Schmitz, C. Orange, M.
Prior, C. Alaettinoglu. August 1999.
Internet Routing Registries contain the routing policies of various ISPs, see :
http://www.ripe.net/ripencc/pub-services/whois.html
http://www.arin.net/whois/index.html
http://www.apnic.net/apnic-bin/whois.pl
If link AS10-AS20 goes down, AS20 will no longer consider the path learned from AS10. It will thus remove this path from its routing table and will instead select the path learned from AS40. This will force AS20 to send the following UPDATE to AS30 :
Note that in RPSL, the set localpref construct does not exist. It is replaced with action preference=x. Unfortunately, in RPSL the routes with the lowest preference are preferred. RPSL thus uses the opposite of local-pref.
In practice, the exchange of BGP UPDATE messages will cease due to the utilization of timers by BGP routers and the routing will stabilize on one of the two stable route assignments.
These local-pref settings correspond to the economical relationships between the various ASes.
Since AS1 is paid to carry packets towards Cust1 and Cust2, it will select a route towards those networks whenever possible.
Since AS1 does not need to pay to carry packets towards Peer1-4, AS1 will select a route towards those networks whenever possible.
AS1 will only utilize the routes received from its providers when there is no other choice.
It is shown in the following papers that this way of utilizing the local-pref attribute leads to stable BGP routes :
Lixin Gao, Timothy G. Griffin, and Jennifer Rexford, "Inherently safe backup
routing with BGP," Proc. IEEE INFOCOM, April 2001
Lixin Gao and Jennifer Rexford, "Stable Internet routing without global
coordination," IEEE/ACM Transactions on Networking, December 2001, pp.
681-692
The RPSL policy of AS1 could be as follows :
RPSL policy for AS1
aut-num: AS1
import: from Cust1 action set localpref=200; accept Cust1
from Cust2 action set localpref=200; accept Cust2
from Peer1 action set localpref=150; accept Peer1
from Peer2 action set localpref=160; accept Peer2
from Peer3 action set localpref=170; accept Peer3
from Peer4 action set localpref=180; accept Peer4
from Prov1 action set localpref=100; accept ANY
from Prov2 action set localpref=100; accept ANY
Due to the utilization of the local-pref attribute, some paths on the Internet are longer than their optimum length, see :
Lixin Gao and Feng Wang , The Extent of AS Path Inflation by Routing Policies, GlobalInternet 2002
See :
L. Subramanian, S. Agarwal, J. Rexford, and RH Katz. Characterizing the Internet hierarchy from multiple vantage points. In IEEE INFOCOM, 2002
LAN/MAN Standards Committee of the IEEE Computer Society. IEEE Standard for Information Technology - Telecommunications and information exchange between systems - local and metropolitan area networks - specific requirements - Part 3 : Carrier Sense multiple access with collision detection (CSMA/CD) access method and physical layer specification. IEEE, 2000. available from http://standards.ieee.org/getieee802/802.3.html
Ethernet addresses are usually printed as hexadecimal numbers, e.g.
alpha.infonet.fundp.ac.be at 00:80:C8:FB:21:2B [ether] on eth0
cr1.info.fundp.ac.be at 00:50:BD:D0:E0:00 [ether] on eth0
backus.info.fundp.ac.be at 08:00:20:A6:62:8A [ether] on eth0
inspiron.infonet.fundp.ac.be at 00:50:04:8C:83:70 [ether] on eth0
corneille.info.fundp.ac.be at 00:20:AF:52:44:4B [ether] on eth0
See http://standards.ieee.org/regauth/oui/oui.txt for the list of allocations
This is the most widely used format, it is notably used to carry IP packets.
A good reference on Ethernet switches is
R. Seifert, J. Edwards, The All-New Switch Book, Wiley, 2008