  • One traceroute data consistent with minimum over first week of June.
  • By ‘independent’, we mean parts of the AS paths don’t overlap with any other. This is caused by the BGP import and export policies of the ASes involved, and can be explained by the economic incentives of inter-connecting networks. Data unstable, but all show TIVs – so we use a particular measurement as illustration.

    1. 1. COMS W4995-1 Lecture 6
    2. 2. Dynamic routing protocols II <ul><li>Dynamic Routing Protocols: Link State Routing </li></ul><ul><li>Intra-Domain and Inter-Domain Routing Protocols: OSPF & BGP </li></ul>
    3. 3. Dynamic Routing Protocols Link State Routing
    4. 4. The Gang of Four. IGP: OSPF and IS-IS (link state), RIP (vectoring). EGP: BGP (vectoring).
    5. 5. Link State Routing <ul><li>Based on Dijkstra’s Shortest-Path-First algorithm. </li></ul><ul><li>Each router starts by knowing: </li></ul><ul><ul><li>Prefixes of its attached networks. </li></ul></ul><ul><ul><li>Links to its neighbors. </li></ul></ul><ul><li>Each router advertises to the entire network (flooding): </li></ul><ul><ul><li>Prefixes of its directly connected networks. </li></ul></ul><ul><ul><li>Active links to its neighbors. </li></ul></ul><ul><li>Each router learns: </li></ul><ul><ul><li>A complete topology of the network (routers, links). </li></ul></ul><ul><li>Each router computes shortest path to each destination. </li></ul><ul><li>In a stable situation, all routers have the same graph, and compute the same paths. </li></ul>
    6. 6. Dijkstra’s Shortest Path Algorithm for a Graph
       Input: Graph (N, E) with N the set of nodes and E the set of edges; c_vw link cost (c_vw = ∞ if (v,w) ∉ E, c_vv = 0); s source node.
       Output: D_n, the cost of the least-cost path from node s to node n.
       M = {s};
       for each n ∉ M
         D_n = c_sn;
       end for
       while (M ≠ all nodes) do
         Find w ∉ M for which D_w = min{D_j ; j ∉ M};
         Add w to M;
         for each neighbor n of w and n ∉ M
           D_n = min[D_n, D_w + c_wn];
           Update route;
         end for
       end while
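The algorithm on this slide can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: it uses a priority queue instead of the linear "find w" scan, and the four-node topology is a hypothetical one loosely modelled on the a, b, c, d illustration (the edge-to-cost assignment is an assumption).

```python
import heapq
import math


def dijkstra(costs, source):
    """Return (D, first_hop) for a graph given as {node: {neighbor: cost}}.

    D[n] is the least cost from source to n; first_hop[n] is the neighbor
    of source that the least-cost path to n starts with (the "route").
    """
    D = {n: math.inf for n in costs}
    D[source] = 0
    first_hop = {}
    M = set()                      # nodes whose least cost is final
    heap = [(0, source, None)]     # (cost so far, node, first hop used)
    while heap:
        d, w, hop = heapq.heappop(heap)
        if w in M:
            continue               # stale queue entry, w already finalized
        M.add(w)
        if hop is not None:
            first_hop[w] = hop
        for n, c in costs[w].items():
            if n not in M and d + c < D[n]:
                D[n] = d + c       # D_n = min[D_n, D_w + c_wn]
                next_hop = n if w == source else hop
                heapq.heappush(heap, (D[n], n, next_hop))
    return D, first_hop


# Hypothetical topology (assumed costs): a-b 3, a-c 6, b-c 1, c-d 2.
TOPOLOGY = {
    "a": {"b": 3, "c": 6},
    "b": {"a": 3, "c": 1},
    "c": {"a": 6, "b": 1, "d": 2},
    "d": {"c": 2},
}

distances, routes = dijkstra(TOPOLOGY, "a")
```

With this assumed topology, node a reaches c through b (cost 4) rather than over the direct link of cost 6.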
    7. 7. Link state routing: graphical illustration. [Figure: a four-node network (a, b, c, d) with link costs 3, 1, 6 and 2, showing each node's local view next to the global view.] Collecting all views yields a global & complete view of the network!
    8. 8. Operation of a Link State Routing protocol: Received LSAs → Link State Database → Dijkstra’s Algorithm → IP Routing Table. LSAs are flooded to other interfaces.
    9. 9. Link State Routing: Properties <ul><li>Each node requires complete topology information </li></ul><ul><li>Link state information must be flooded to all nodes </li></ul><ul><li>Guaranteed to converge </li></ul>
    10. 10. Distance Vector vs. Link State Routing <ul><li>With distance vector routing, each node has information only about the next hop: </li></ul><ul><ul><ul><li>Node A: to reach F go to B </li></ul></ul></ul><ul><ul><ul><li>Node B: to reach F go to D </li></ul></ul></ul><ul><ul><ul><li>Node D: to reach F go to E </li></ul></ul></ul><ul><ul><ul><li>Node E: go directly to F </li></ul></ul></ul><ul><li>Distance vector routing makes poor routing decisions if directions are not completely correct (e.g., because a node is down). </li></ul><ul><li>If parts of the directions are incorrect, the routing may be incorrect until the routing algorithm has re-converged. </li></ul>
    11. 11. Distance Vector vs. Link State Routing <ul><li>In link state routing, each node has a complete map of the topology </li></ul><ul><li>If a node fails, each node can calculate the new route </li></ul><ul><li>Difficulty: All nodes need to have a consistent view of the network </li></ul>[Figure: every node A to F holds its own copy of the full A B C D E F topology map.]
    12. 12. Distance Vector vs. Link State Routing. Link State: <ul><li>Topology information is flooded within the routing domain </li></ul><ul><li>Best end-to-end paths are computed locally at each router. </li></ul><ul><li>Best end-to-end paths determine next-hops. </li></ul><ul><li>Based on minimizing some notion of distance </li></ul><ul><li>Works only if policy is shared and uniform </li></ul><ul><li>Examples: OSPF, IS-IS </li></ul> Vectoring: <ul><li>Each router knows little about network topology </li></ul><ul><li>Only best next-hops are chosen by each router for each destination network. </li></ul><ul><li>Best end-to-end paths result from composition of all next-hop choices </li></ul><ul><li>Does not require any notion of distance </li></ul><ul><li>Does not require uniform policies at all routers </li></ul><ul><li>Examples: RIP, BGP </li></ul>
    13. 13. Dynamic Routing Protocols: Open Shortest Path First
    14. 14. OSPF <ul><li>OSPF = Open Shortest Path First </li></ul><ul><li>The OSPF routing protocol is the most important link state routing protocol on the Internet (another link state routing protocol is IS-IS, intermediate system to intermediate system) </li></ul><ul><li>The complexity of OSPF is significant </li></ul><ul><ul><li>RIP (RFC 2453 ~ 40 pages) </li></ul></ul><ul><ul><li>OSPF (RFC 2328 ~ 250 pages) </li></ul></ul><ul><li>History: </li></ul><ul><ul><li>1989: RFC 1131 OSPF Version 1 </li></ul></ul><ul><ul><li>1991: RFC 1247 OSPF Version 2 </li></ul></ul><ul><ul><li>1994: RFC 1583 OSPF Version 2 (revised) </li></ul></ul><ul><ul><li>1997: RFC 2178 OSPF Version 2 (revised) </li></ul></ul><ul><ul><li>1998: RFC 2328 OSPF Version 2 (current version) </li></ul></ul>
    15. 15. Features of OSPF <ul><li>Provides authentication of routing messages </li></ul><ul><li>Enables load balancing by allowing traffic to be split evenly across routes with equal cost </li></ul><ul><li>Type-of-Service routing allows different routes to be set up depending on the TOS field </li></ul><ul><li>Supports subnetting </li></ul><ul><li>Supports multicasting </li></ul><ul><li>Allows hierarchical routing </li></ul>
    16. 16. Hierarchical OSPF
    17. 17. Hierarchical OSPF <ul><li>Two-level hierarchy: local area, backbone. </li></ul><ul><ul><li>Link-state advertisements are flooded only within an area </li></ul></ul><ul><ul><li>Each node has the detailed topology of its own area; it only knows the direction (shortest path) to nets in other areas. </li></ul></ul><ul><li>Area border routers: “summarize” distances to nets in own area, advertise to other Area Border routers. </li></ul><ul><li>Backbone routers: run OSPF routing limited to backbone. </li></ul>
    18. 18. Example Network. Router IDs can be selected independent of interface addresses, but are usually chosen to be the smallest interface address <ul><li>Link costs are called Metric </li></ul><ul><li>Metric is in the range [0, 2^16 − 1] (a 16-bit field) </li></ul><ul><li>Metric can be asymmetric </li></ul>[Figure: example network of routers joined by /24 subnets, annotated with link metrics.]
    19. 19. Link State Advertisement (LSA) <ul><li>The LSA of router is as follows: </li></ul><ul><li>Link State ID : = Router ID </li></ul><ul><li>Advertising Router: = Router ID </li></ul><ul><li>Number of links: 3 = 2 links plus router itself </li></ul><ul><li>Description of Link 1: Link ID =, Metric = 4 </li></ul><ul><li>Description of Link 2: Link ID =, Metric = 3 </li></ul><ul><li>Description of Link 3: Link ID =, Metric = 0 </li></ul>[Figure: the example network, with the advertising router's links of metric 4 and 3 highlighted.]
    20. 20. Network and Link State Database. Each router has a database which contains the LSAs from all other routers.
        LS Type     Checksum  LS SeqNo    LS Age
        Router-LSA  0x9b47    0x80000006  0
        Router-LSA  0x219e    0x80000007  1618
        Router-LSA  0x6b53    0x80000003  1712
        Router-LSA  0xe39a    0x8000003a  20
        Router-LSA  0xd2a6    0x80000038  18
        Router-LSA  0x05c3    0x80000005  1680
        [The Link State ID and Adv. Router columns are not recoverable.]
    21. 21. Link State Database <ul><li>The collection of all LSAs is called the link-state database </li></ul><ul><li>Each router has an identical link-state database </li></ul><ul><ul><ul><ul><li>Useful for debugging: Each router has a complete description of the network </li></ul></ul></ul></ul><ul><li>If neighboring routers discover each other for the first time, they will exchange their link-state databases </li></ul><ul><li>The link-state databases are synchronized using reliable flooding </li></ul>
    22. 22. OSPF Packet Format. Destination IP: the neighbor’s IP address, or the AllSPFRouters or AllDRouters multicast address. TTL: set to 1 (in most cases). OSPF packets are not carried as UDP payload! OSPF has its own IP protocol number: 89.
    23. 23. OSPF Packet Format (header fields). Version: 2 (current version is OSPF V2). Message type: 1: Hello (tests reachability); 2: Database description; 3: Link state request; 4: Link state update; 5: Link state acknowledgement. Area ID: ID of the Area from which the packet originated. Checksum: standard IP checksum taken over the entire packet, excluding the authentication field. Authentication type: 0: no authentication; 1: cleartext password; 2: MD5 checksum (added to end of packet). Authentication field when type = 1: 64-bit cleartext password. Authentication field when type = 2: 0x0000 (16 bits), KeyID (8 bits), length of MD5 checksum (8 bits), nondecreasing sequence number (32 bits), which prevents replay attacks.
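As a rough illustration of the header fields listed on this slide, the sketch below unpacks the fixed 24-byte OSPFv2 header with Python's `struct` module. The field layout follows RFC 2328; the function name, the dictionary keys and the sample packet (router ID 192.0.2.1, backbone area) are assumptions made for the example.

```python
import struct
from ipaddress import IPv4Address

# version, type, packet length, router ID, area ID, checksum, AuType, authentication
OSPF_HEADER = struct.Struct("!BBHIIHH8s")

PACKET_TYPES = {
    1: "Hello",
    2: "Database Description",
    3: "Link State Request",
    4: "Link State Update",
    5: "Link State Acknowledgment",
}


def parse_ospf_header(data):
    """Decode the 24-byte OSPFv2 header at the start of `data`."""
    (version, ptype, length, router_id, area_id,
     checksum, autype, auth) = OSPF_HEADER.unpack_from(data)
    return {
        "version": version,
        "type": PACKET_TYPES.get(ptype, "unknown"),
        "length": length,
        "router_id": str(IPv4Address(router_id)),
        "area_id": str(IPv4Address(area_id)),
        "checksum": checksum,
        "autype": autype,          # 0 = none, 1 = cleartext, 2 = MD5
        "authentication": auth,
    }


# Hypothetical Hello header: checksum left at 0, no authentication.
sample = OSPF_HEADER.pack(2, 1, 44, int(IPv4Address("192.0.2.1")), 0, 0, 0, bytes(8))
header = parse_ospf_header(sample)
```

The sketch does not verify the checksum or the MD5 digest; it only shows where each field sits.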
    24. 24. OSPF LSA Format: LSA Header, followed by Link 1, Link 2, and so on.
    25. 25. Discovery of Neighbors <ul><li>Routers multicast OSPF Hello packets on all OSPF-enabled interfaces. </li></ul><ul><li>If two routers share a link, they can become neighbors, and establish an adjacency </li></ul><ul><li>After becoming neighbors, routers exchange their link state databases </li></ul>Scenario: a router restarts
    26. 26. Neighbor discovery and database synchronization. Scenario: a router restarts. Step 1, discovery of adjacency. Step 2, after neighbors are discovered, the nodes exchange their databases: one side sends an empty database description; the other sends its database description (the description only contains LSA headers); the first side sends its own database description; the other acknowledges receipt of the description.
    27. 27. Regular LSA exchanges. One router explicitly requests each LSA from its neighbor in Link State Request packets (here, six Router-LSAs). The neighbor sends the requested LSAs in a Link State Update packet; the six Router-LSAs carry sequence numbers 0x80000006, 0x80000007, 0x80000003, 0x8000003a, 0x80000038 and 0x80000005.
    28. 28. Dissemination of LSA-Update <ul><li>A router sends and refloods LSA-Updates whenever the topology or a link cost changes. (If a received LSA does not contain new information, the router will not flood the packet.) </li></ul><ul><li>Exception: Infrequently (every 30 minutes), a router will flood LSAs even if there are no new changes. </li></ul><ul><li>Acknowledgements of LSA-updates: </li></ul><ul><ul><ul><li>explicit ACK, or </li></ul></ul></ul><ul><ul><ul><li>implicit via reception of an LSA-Update </li></ul></ul></ul><ul><li>Question: If a new node comes up, it could build the database from regular LSA-Updates (rather than exchange of database description). What role do the database description packets play? </li></ul>
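The flooding rule on this slide (reflood only when an LSA carries new information) might be sketched as below. This is a simplification under stated assumptions: freshness is judged by sequence number alone, whereas real OSPF also compares checksum and age; acknowledgements and the 30-minute refresh are left out; the `Router` class and interface names are hypothetical.

```python
def to_signed32(value):
    """Interpret a 32-bit value as signed, as OSPF does for LS sequence numbers."""
    return value - (1 << 32) if value & 0x80000000 else value


class Router:
    def __init__(self, interfaces):
        self.interfaces = list(interfaces)
        self.lsdb = {}  # advertising router -> newest sequence number seen

    def receive_lsa(self, adv_router, seqno, in_iface):
        """Install the LSA if it is newer; return the interfaces to flood it on."""
        known = self.lsdb.get(adv_router)
        if known is not None and to_signed32(seqno) <= to_signed32(known):
            return []  # no new information, do not flood
        self.lsdb[adv_router] = seqno
        # Flood on every interface except the one the LSA arrived on.
        return [iface for iface in self.interfaces if iface != in_iface]


router = Router(["eth0", "eth1", "eth2"])
first = router.receive_lsa("r1", 0x80000006, "eth0")      # new LSA
duplicate = router.receive_lsa("r1", 0x80000006, "eth1")  # same LSA again
newer = router.receive_lsa("r1", 0x80000007, "eth1")      # updated LSA
```

Dropping duplicates is what stops a flooded LSA from circulating forever.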
    29. 29. Dynamic Routing Protocols (Inter-domain): Border Gateway Protocol
    30. 30. BGP Quick View <ul><li>BGP = Border Gateway Protocol . Currently in version 4, specified in RFC 1771. (~ 60 pages) </li></ul><ul><li>Note: In the context of BGP, a gateway is nothing else but an IP router that connects autonomous systems. </li></ul><ul><li>Interdomain routing protocol for routing between autonomous systems </li></ul><ul><li>Uses TCP to establish a BGP session and to send routing messages over the BGP session </li></ul><ul><li>BGP is a path vector protocol. Routing messages in BGP contain complete routes. </li></ul><ul><li>Network administrators can specify routing policies </li></ul>
    31. 31. BGP Policy-based Routing <ul><li>Each node is assigned an AS number (ASN) </li></ul><ul><li>BGP’s goal is to find any AS-path (not an optimal one). Since the internals of the AS are never revealed, finding an optimal path is not feasible. </li></ul><ul><li>Network administrator sets BGP’s policies to determine the best path to reach a destination network. </li></ul>
    32. 32. How Many ASNs are there today? 20,570 ASNs, of which 14,588 are origin only (no transit). Thanks to Geoff Huston. http://bgp.potaroo.net on October 9, 2005
    33. 33. Autonomous Routing Domains Don’t Always Need BGP or an ASN. Example: Yale University and Qwest. Yale: nail up default routes pointing to Qwest. Qwest: nail up routes pointing to Yale. Static routing is the most common way of connecting an autonomous routing domain to the Internet. This helps explain why BGP is a mystery to many … ARDs versus ASes
    34. 34. ASNs Can Be “Shared” (RFC 2270). ASN 7046 is assigned to UUNet (AS 701). It is used by customers single-homed to UUNet but needing BGP for some reason (load balancing, etc.) [RFC 2270]. Customers shown as AS 7046: Crestar Bank, NJIT, Hood College.
    35. 35. ARDs and ASes: Summary <ul><li>Most ARDs have no ASN (statically routed at Internet edge) </li></ul><ul><li>Some unrelated ARDs share the same ASN (RFC 2270) </li></ul><ul><li>Some ARDs are implemented with multiple ASNs (example: Worldcom) </li></ul>ASes are just an implementation detail of Inter-domain routing
    36. 36. How many prefixes today? Thanks to Geoff Huston. http://bgp.potaroo.net on October 9, 2005 IPv4 Address space covered 221,002 33.3% 23%
    37. 37. Policy-Based vs. Distance-Based Routing? Minimizing “hop count” can violate commercial relationships that constrain inter-domain routing. [Figure: Host 1 and Host 2 connected through Cust1, Cust2, Cust3 and ISP1, ISP2, ISP3, with one path marked YES and another marked NO.] Thanks to Tim Griffin http://www.cl.cam.ac.uk/users/tgg22
    38. 38. Customer versus Provider. Customer pays provider for access to the Internet. [Figure: IP traffic flowing between provider and customer.]
    39. 39. Why not minimize “AS hop count”? Shortest path routing is not compatible with commercial relations. [Figure: Cust1, Cust2, Cust3 attached to Regional ISP1, ISP2, ISP3 under National ISP1 and ISP2, with one path marked YES and another marked NO.]
    40. 40. The “Peering” Relationship. Peers provide transit between their respective customers. Peers do not provide transit between peers. Peers (often) do not exchange $$$. [Figure: peer, customer and provider links, with allowed and NOT allowed traffic marked.]
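The transit rules on this slide can be read as an export filter. The sketch below is one common way to express them, not something the slides spell out: a route is passed on to a neighbor only if it was learned from a customer (or originated locally) or if the neighbor is a customer. The function name and the relationship labels are assumptions.

```python
def should_export(learned_from, send_to):
    """Decide whether a route learned from one kind of neighbor is announced to another.

    Both arguments are one of "customer", "peer", "provider";
    learned_from may also be "self" for locally originated routes.
    """
    if learned_from in ("customer", "self"):
        return True                  # customer routes go to everyone
    return send_to == "customer"     # peer/provider routes go only to customers


# Peers provide transit between their respective customers ...
customer_to_peer = should_export("customer", "peer")
peer_to_customer = should_export("peer", "customer")
# ... but not between peers, and not between a peer and a provider.
peer_to_peer = should_export("peer", "peer")
peer_to_provider = should_export("peer", "provider")
```

Under this rule an AS never carries traffic between two neighbors unless at least one of them pays it.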
    41. 41. Peering Provides Shortcuts. Peering also allows connectivity between the customers of “Tier 1” providers. [Figure: peer, customer and provider links.]
    42. 42. Peering Wars. Peer: <ul><li>Reduces upstream transit costs </li></ul><ul><li>Can increase end-to-end performance </li></ul><ul><li>May be the only way to connect your customers to some part of the Internet (“Tier 1”) </li></ul> Don’t Peer: <ul><li>You would rather have customers </li></ul><ul><li>Peers are usually your competition </li></ul><ul><li>Peering relationships may require periodic renegotiation </li></ul>Peering struggles are by far the most contentious issues in the ISP world! Peering agreements are often confidential.
    43. 43. The Border Gateway Protocol (BGP). BGP = RFC 1771 + “optional” extensions: RFC 1997 (communities), RFC 2439 (damping), RFC 2796 (reflection), RFC 3065 (confederation), … + routing policy configuration languages (vendor-specific) + Current Best Practices in management of Interdomain Routing. BGP was not DESIGNED. It EVOLVED.
    44. 44. BGP Route Processing: Receive BGP Updates → Apply Import Policies → Best Route Selection (based on attribute values) → Best Route Table → Apply Export Policies → Transmit BGP Updates. Forwarding entries for the best routes are installed in the IP Forwarding Table. Apply Policy = filter routes & tweak attributes. Open-ended programming, constrained only by vendor configuration language.
    45. 45. BGP Attributes. From IANA: http://www.iana.org/assignments/bgp-parameters
        Value  Code                     Reference
        -----  -----------------------  ---------
        1      ORIGIN                   [RFC1771]
        2      AS_PATH                  [RFC1771]
        3      NEXT_HOP                 [RFC1771]
        4      MULTI_EXIT_DISC          [RFC1771]
        5      LOCAL_PREF               [RFC1771]
        6      ATOMIC_AGGREGATE         [RFC1771]
        7      AGGREGATOR               [RFC1771]
        8      COMMUNITY                [RFC1997]
        9      ORIGINATOR_ID            [RFC2796]
        10     CLUSTER_LIST             [RFC2796]
        11     DPA                      [Chen]
        12     ADVERTISER               [RFC1863]
        13     RCID_PATH / CLUSTER_ID   [RFC1863]
        14     MP_REACH_NLRI            [RFC2283]
        15     MP_UNREACH_NLRI          [RFC2283]
        16     EXTENDED COMMUNITIES     [Rosen]
        ...
        255    reserved for development
        The slide highlights a subset of these as the most important attributes. Not all attributes need to be present in every announcement.
    46. 46. ASPATH Attribute. The prefix is originated by AS 6341 (AT&T Research). Other ASes shown: AS 7018 (AT&T), AS 1239 (Sprint), AS 1755 (Ebone), AS 3549 (Global Crossing), AS 1129 (Global Access), AS 12654 (RIPE NCC RIS project). AS paths seen as the announcement propagates: 6341; 7018 6341; 3549 7018 6341; 1239 7018 6341; 1755 1239 7018 6341; 1129 1755 1239 7018 6341.
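How the AS paths on this slide grow can be sketched in a few lines: each AS prepends its own number before passing a route on, and rejects any route whose path already contains its number. The helper names `announce` and `accept` are hypothetical; the AS numbers are the ones from the slide.

```python
def announce(local_asn, as_path):
    """Return the AS path a router in local_asn sends to an external neighbor."""
    return [local_asn] + list(as_path)


def accept(local_asn, as_path):
    """Loop prevention: reject a route that already passed through local_asn."""
    return local_asn not in as_path


# Follow the announcement from AT&T Research (AS 6341) outwards.
path = []
for asn in (6341, 7018, 1239, 1755, 1129):
    path = announce(asn, path)

ripe_accepts = accept(12654, path)   # AS 12654 is not on the path
att_accepts = accept(7018, path)     # AS 7018 is already on the path
```

Because the complete path travels with the route, loops are detected without any knowledge of the topology.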
    47. 47. Shorter Doesn’t Always Mean Shorter. [Figure: AS 1 to AS 4.] Mr. BGP says that path 4 1 is better than path 3 2 1. Duh! In fairness: could you do this “right” and still scale? Exporting internal state would dramatically increase global instability and the amount of routing state.
    48. 48. Routing Example 1 Thanks to Han Zheng
    49. 49. Routing Example 2 Thanks to Han Zheng
    50. 50. Tweak Tweak Tweak (TE) <ul><li>For inbound traffic </li></ul><ul><ul><li>Filter outbound routes </li></ul></ul><ul><ul><li>Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection </li></ul></ul><ul><li>For outbound traffic </li></ul><ul><ul><li>Filter inbound routes </li></ul></ul><ul><ul><li>Tweak attributes on inbound routes to influence best route selection </li></ul></ul>In general, an AS has more control over outbound traffic.
    51. 51. Backup Links with Local Preference (Outbound Traffic). AS 65000 connects to AS 1 over a primary link and a backup link. On the primary link: Set Local Pref = 100 for all routes from AS 1. On the backup link: Set Local Pref = 50 for all routes from AS 1. Forces outbound traffic to take primary link, unless link is down.
    52. 52. Multihomed Backups (Outbound Traffic). AS 2 has two providers: AS 1 on the primary link and AS 3 on the backup link. Set Local Pref = 100 for all routes from AS 1. Set Local Pref = 50 for all routes from AS 3. Forces outbound traffic to take primary link, unless link is down.
    53. 53. Shedding Inbound Traffic with ASPATH Prepending. Customer AS 2 announces ASPATH = 2 to provider AS 1 on the primary link and ASPATH = 2 2 2 on the backup link. Prepending will (usually) force inbound traffic from AS 1 to take primary link. Yes, this is a Glorious Hack …
    54. 54. … But Padding Does Not Always Work. Customer AS 2 has two providers, AS 1 and AS 3. It announces ASPATH = 2 on the primary link and the padded ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 on the backup link. AS 3 will send traffic on “backup” link because it prefers customer routes and local preference is considered before ASPATH length! Padding in this way is often used as a form of load balancing.
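Why padding fails here comes down to the order of the comparison. The sketch below is a deliberately reduced version of BGP best-route selection that looks only at local preference and then AS path length (the real decision process has more tie-breakers). The `Route` class, the neighbor labels and the local preference values of 100 and 90 are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    neighbor: str
    local_pref: int
    as_path: tuple


def best_route(routes):
    """Prefer the highest local preference, then the shortest AS path."""
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


# AS 3's two routes towards the customer AS 2 (values are assumptions):
via_backup = Route("customer AS 2 (backup link)", local_pref=100, as_path=(2,) * 13)
via_as1 = Route("AS 1", local_pref=90, as_path=(1, 2))

chosen = best_route([via_backup, via_as1])
```

The padded path is 13 hops long against 2, but it is never compared, because the customer route already wins on local preference.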
    55. 55. COMMUNITY Attribute to the Rescue! Customer AS 2 has two providers, AS 1 and AS 3. It announces ASPATH = 2 on the primary link, and ASPATH = 2 with COMMUNITY = 3:70 on the backup link. Customer import policy at AS 3: If 3:90 in COMMUNITY then set local preference to 90. If 3:80 in COMMUNITY then set local preference to 80. If 3:70 in COMMUNITY then set local preference to 70. AS 3: normal customer local pref is 100, peer local pref is 90.
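The import policy at AS 3 might be written as a small lookup, as sketched below. The community values and local preferences are the ones on the slide; the function name, the use of string tags, and the choice to apply the three rules in the listed order (so a later match overrides an earlier one) are assumptions.

```python
# (community tag, local preference) pairs, in the order listed on the slide.
CUSTOMER_IMPORT_POLICY = [("3:90", 90), ("3:80", 80), ("3:70", 70)]

CUSTOMER_DEFAULT_LOCAL_PREF = 100
PEER_LOCAL_PREF = 90


def customer_import_local_pref(communities):
    """Return the local preference AS 3 assigns to a customer route."""
    local_pref = CUSTOMER_DEFAULT_LOCAL_PREF
    for tag, value in CUSTOMER_IMPORT_POLICY:
        if tag in communities:
            local_pref = value
    return local_pref


untagged = customer_import_local_pref(set())        # ordinary customer route
tagged = customer_import_local_pref({"3:70"})       # route marked as backup
backup_loses_to_peer = tagged < PEER_LOCAL_PREF
```

Tagging the backup announcement with 3:70 drops it below the peer preference of 90, so AS 3 stops preferring the backup link while the primary path is available.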
    56. 56. BGP Issues - What is a BGP Wedgie? <ul><li>BGP policies make sense locally </li></ul><ul><li>Interaction of local policies allows multiple stable routings </li></ul><ul><li>Some routings are consistent with intended policies, and some are not </li></ul><ul><ul><li>If an unintended routing is installed (BGP is “wedged”), then manual intervention is needed to change to an intended routing </li></ul></ul><ul><li>When an unintended routing is installed, no single group of network operators has enough knowledge to debug the problem </li></ul>Full wedgie ¾ wedgie
    57. 57. Dynamic Routing Protocols: Summary <ul><li>Dynamic routing protocols: RIP, OSPF, BGP </li></ul><ul><li>RIP uses a distance vector algorithm and converges slowly (the count-to-infinity problem) </li></ul><ul><li>OSPF uses a link state algorithm and converges fast, but it is more complicated than RIP. </li></ul><ul><li>Both RIP and OSPF find the lowest-cost path. </li></ul><ul><li>BGP uses a path vector algorithm; its path selection algorithm is complicated and is influenced by policies. </li></ul><ul><li>BGP has its own problems; see WIDGI by Tim Griffin </li></ul>
    58. 58. More Readings (Optional) <ul><li>BGP Wedgies: Bad Routing Policy Interactions that Cannot be Debugged </li></ul><ul><li>JI’s Intro to interdomain routing. </li></ul><ul><li>"Interdomain Setting of PlanetLab Nodes." PlanetLab Meeting, May 14, 2004. </li></ul><ul><li>Understanding the Border Gateway Protocol (BGP) </li></ul><ul><li>ICNP 2002 Tutorial Session </li></ul>
    59. 59. References <ul><li>[VGE1996, VGE2000] Persistent Route Oscillations in Inter-Domain Routing. Kannan Varadhan, Ramesh Govindan, and Deborah Estrin. Computer Networks, Jan. 2000. (Also USC Tech Report, Feb. 1996) </li></ul><ul><li>[GW1999] An Analysis of BGP Convergence Properties. Timothy G. Griffin, Gordon Wilfong. SIGCOMM 1999 </li></ul><ul><li>[GSW1999] Policy Disputes in Path Vector Protocols. Timothy G. Griffin, F. Bruce Shepherd, Gordon Wilfong. ICNP 1999 </li></ul><ul><li>[GW2001] A Safe Path Vector Protocol. Timothy G. Griffin, Gordon Wilfong. INFOCOM 2001 </li></ul><ul><li>[GR2000] Stable Internet Routing without Global Coordination. Lixin Gao, Jennifer Rexford. SIGMETRICS 2000 </li></ul><ul><li>[GGR2001] Inherently Safe Backup Routing with BGP. Lixin Gao, Timothy G. Griffin, Jennifer Rexford. INFOCOM 2001 </li></ul><ul><li>[GW2002a] On the Correctness of IBGP Configurations. Timothy G. Griffin, Gordon Wilfong. SIGCOMM 2002 </li></ul><ul><li>[GW2002b] An Analysis of the MED Oscillation Problem. Timothy G. Griffin, Gordon Wilfong. ICNP 2002 </li></ul>