Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Case Studies in
Intra-Domain Routing Instability

              Zhang Shu
  National Institute of Information and
  Commun...
Overview
Intra-domain routing instability
Measurements of intra-domain
routing instability
• WIDE Internet and APAN Tokyo-...
Intra-Domain Routing Instability
Intra-domain routing instability
• Unexpected routing changes within an IGP
  routing dom...
Measurement Methodology
Data collection
• OSPF
• Tcpdump


  OSPF cloud

               Data collector



                ...
Measurement Methodology (Cont’d)
Data analysis
• Counting routing changes
   Changes in the content of an LSA
   LSA flush...
Case Study 1/2: WIDE Internet
WIDE Internet
• WIDE Project (http://www.wide.ad.jp)
• Connects hundreds of academic
  organ...
Measurement of the WIDE Internet
         Router-LSA




Period: August 2000 – May 2004
Measurement of the WIDE Internet (Cont’d)


Network-
  LSA



 Network-
Summary-
   LSA

  ASBR-
Summary-
   LSA

        ...
Example of a Typical LSA Oscillation




 Relatively frequent changes in
 short term
  • A router in Fukuoka (WIDE),
    5...
Example of Serious Oscillation




Frequent changes in short term
• An L3 switch, 6/12/03-6/13/03, lasted for
  about 18 h...
Long-Term Changes




Relatively frequent changes
• A router in SF, lasted for 5
  months (10/23/03-4/1/04)
   Considered ...
Long-Term Changes (Cont’d)




Slow changes
• A router in Kyoto, has persisted since
  this March
Some of them were caused...
The Case of OSPFv3




Period: July 2003 – January 2004
Case Study 2/2: APAN Tokyo-XP




 APAN Tokyo-XP network
  • A transit network located in Tokyo
  • Relatively small in sc...
Measurement of APAN Tokyo-XP Network
         (OSPFv2, Router-LSA)




               Problem of
               ATM link

...
Causes of Instability
Identified causes
• Congestion
   DDoS
• Link failure
• Software/Hardware bug
• Misconfiguration
Mos...
Analysis Results

Observed Routing Instability
• Instability observed on both the
  WIDE Internet and the APAN Tokyo-
  XP...
Analysis Results (Cont’d)
Changes is decreasing
• The change in router’s implementation
• Less network congestion because ...
Rtanaly: A Tool to Detect and Visualize
   Intra-Domain Routing Instability
Functions
• Detection of IGP change in real-ti...
Troubleshooting Routing Instability
Why is routing instability
troubleshooting difficult?
 • Problems occur intermittently...
Troubleshooting Routing Instability (Cont’d)
Data that should be collected
 •   Traffic volume
 •   Interface status
 •   ...
Conclusions
Routing instability measurements
• Intra-domain routing instability can
  occur frequently and persistently
• ...
Acknowledgements
My thanks to
• WIDE Project and Nara Institute of
  Science and Technology
• Operators of APAN Tokyo-XP n...
Intra-domain routing stability
measurement project
• http://pe0.koganei.wide.ad.jp/rtanaly
Please contact us if you are
in...
Upcoming SlideShare
Loading in …5
×

Case Studies on Intra-Domain Routing Instability

472 views

Published on

  • Be the first to comment

  • Be the first to like this

Case Studies on Intra-Domain Routing Instability

  1. 1. Case Studies in Intra-Domain Routing Instability Zhang Shu National Institute of Information and Communications Technology, Japan NANOG31 San Francisco, 2004/5/25
  2. 2. Overview Intra-domain routing instability Measurements of intra-domain routing instability • WIDE Internet and APAN Tokyo-XP network Dealing with intra-domain routing instability • Detection and troubleshooting Conclusions
  3. 3. Intra-Domain Routing Instability Intra-domain routing instability • Unexpected routing changes within an IGP routing domain • Causes packet loss, increased router load, and wasted bandwidth Why focus on intra-domain routing? • Compared with inter-domain routing, research on IGP behaviors is still poor • Help operators better understand intra- domain routing instability and learn how to deal with it
  4. 4. Measurement Methodology Data collection • OSPF • Tcpdump OSPF cloud Data collector Ethernet
  5. 5. Measurement Methodology (Cont’d) Data analysis • Counting routing changes Changes in the content of an LSA LSA flush Changes in AS-External LSAs were excluded • Refreshing LSAs were not counted
  6. 6. Case Study 1/2: WIDE Internet WIDE Internet • WIDE Project (http://www.wide.ad.jp) • Connects hundreds of academic organizations • About 50 routers in the OSPF backbone area Data collected at NARA-NOC • Located in Nara, Japan • Both OSPFv2 and OSPFv3 data collected
  7. 7. Measurement of the WIDE Internet Router-LSA Period: August 2000 – May 2004
  8. 8. Measurement of the WIDE Internet (Cont’d) Network- LSA Network- Summary- LSA ASBR- Summary- LSA Period: August 2000 – May 2004
  9. 9. Example of a Typical LSA Oscillation Relatively frequent changes in short term • A router in Fukuoka (WIDE), 5/7/2004, lasted for about 4 hours Usually caused by congestion
  10. 10. Example of Serious Oscillation Frequent changes in short term • An L3 switch, 6/12/03-6/13/03, lasted for about 18 hours Observed for several times • Most of them were caused by problems of p2p links or misconfiguration of using the same router ID on two routers
  11. 11. Long-Term Changes Relatively frequent changes • A router in SF, lasted for 5 months (10/23/03-4/1/04) Considered due to a switch problem
  12. 12. Long-Term Changes (Cont’d) Slow changes • A router in Kyoto, has persisted since this March Some of them were caused by interface problems
  13. 13. The Case of OSPFv3 Period: July 2003 – January 2004
  14. 14. Case Study 2/2: APAN Tokyo-XP APAN Tokyo-XP network • A transit network located in Tokyo • Relatively small in scale, with no more than ten routers in the backbone area
  15. 15. Measurement of APAN Tokyo-XP Network (OSPFv2, Router-LSA) Problem of ATM link Switch Misconfiguration problem Period: August 2003 – May 2004
  16. 16. Causes of Instability Identified causes • Congestion DDoS • Link failure • Software/Hardware bug • Misconfiguration Most instability is due to other reasons rather than routing protocol problems
  17. 17. Analysis Results Observed Routing Instability • Instability observed on both the WIDE Internet and the APAN Tokyo- XP network • The most typical changes are relatively frequent short-term ones Happen at intervals of 10 - 200s • Frequent short-term changes • Long-term changes
  18. 18. Analysis Results (Cont’d) Changes is decreasing • The change in router’s implementation • Less network congestion because of the increased bandwidth in recent years The causes of many changes are unknown
  19. 19. Rtanaly: A Tool to Detect and Visualize Intra-Domain Routing Instability Functions • Detection of IGP change in real-time and alert operators Can also be used for offline data analysis • Visualization • Accessible through the WWW interface Currently only supports OSPF • IS-IS support will be completed soon
  20. 20. Troubleshooting Routing Instability Why is routing instability troubleshooting difficult? • Problems occur intermittently, so it is difficult to get useful data for troubleshooting Event-driven data collection • Automatically obtain data for troubleshooting when detecting routing changes
  21. 21. Troubleshooting Routing Instability (Cont’d) Data that should be collected • Traffic volume • Interface status • Information on the routing protocols From where? • The router that originated the changing LSA • Network equipment connected to the router Switch How to collect the data? • SNMP
  22. 22. Conclusions Routing instability measurements • Intra-domain routing instability can occur frequently and persistently • Similar phenomenon may occur on other networks It is important to deploy a monitoring system on your own network Rtanaly Troubleshooting • Event-driven data collection
  23. 23. Acknowledgements My thanks to • WIDE Project and Nara Institute of Science and Technology • Operators of APAN Tokyo-XP network • Prof. Youki Kadobayashi for the idea on troubleshooting
  24. 24. Intra-domain routing stability measurement project • http://pe0.koganei.wide.ad.jp/rtanaly Please contact us if you are interested in conducting an IGP measurement on your network • zhang@koganei.wide.ad.jp Thank you!

×