Incident analysis using RIPE NCC tools - RIPE 61 and LINX 71

Presentation given at RIPE 61 and LINX 71

  1. Incident analysis with RIPE NCC tools: analysing the RIS/Duke BGP incident. Erik Romijn <eromijn@ripe.net>, Senior Software Engineer
  2. RIPE NCC information collection
     • Routing Information Service (RIS)
       - Listens to and stores all BGP updates
       - Receives data from 600 peers
  3. RIPE NCC information collection
     • DNS monitoring service (DNSMON)
       - Monitors critical DNS infrastructure
       - About 100 vantage points worldwide
  4. RIPE NCC information collection
     • Test Traffic Measurements (TTM)
       - One-way latency/jitter/loss and traceroutes
       - About 100 nodes in full mesh
  5. Case study: RIPE NCC / Duke University BGP experiment
  6. RIS experiments & announcements
     • RIS has a long tradition of supporting research
     • Second AS in the world to announce 4-byte AS numbers
     • Beacon prefixes from RIS available since 2002
       - Also a vital part of debogonizing
  7. Case study: RIPE NCC BGP experiment
     • RIPE NCC conducted an experiment on 27 August 2010
       - An optional BGP attribute was announced
       - This was an optional transitive attribute of 3000 bytes
       - The announcement was valid according to RFC 4271
     • Some routers corrupted the route and passed it on
       - Peers that received the corrupted route dropped the session
     • This caused disruption to some Internet traffic
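To make the attribute on slide 7 concrete, the following is a minimal sketch of how an optional transitive path attribute of 3000 bytes is laid out on the wire under RFC 4271 (flags, type code, length, value). The type code and the all-zero payload are placeholders chosen for illustration; the slides do not say which type code or content the experiment used.

```python
import struct

# Attribute flag bits from RFC 4271, section 4.3.
FLAG_OPTIONAL = 0x80
FLAG_TRANSITIVE = 0x40
FLAG_EXTENDED_LENGTH = 0x10

# Hypothetical type code, used here for illustration only.
EXAMPLE_TYPE_CODE = 255


def encode_path_attribute(type_code: int, value: bytes) -> bytes:
    """Encode one optional transitive path attribute."""
    flags = FLAG_OPTIONAL | FLAG_TRANSITIVE
    if len(value) > 255:
        # Values longer than 255 bytes need the two-octet length field.
        flags |= FLAG_EXTENDED_LENGTH
        header = struct.pack("!BBH", flags, type_code, len(value))
    else:
        header = struct.pack("!BBB", flags, type_code, len(value))
    return header + value


# Placeholder payload: 3000 zero bytes.
attribute = encode_path_attribute(EXAMPLE_TYPE_CODE, bytes(3000))
print(len(attribute), hex(attribute[0]))  # 3004 0xd0
```

With a four-byte header the attribute still fits inside the 4096-byte maximum BGP message size, which is consistent with the slide's statement that the announcement was valid.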
  8. Case study: 27 August 2010
     • Announcement active from 08:41 to 09:08 UTC, using 93.175.144.0/24
     • We later observed some negative impact
     • Immediately started an extensive investigation
       - This pointed towards a Cisco IOS XR bug
       - Sent out a very detailed private announcement
       - Also provided Cisco with all details
     • Cisco released cisco-sa-20100827-bgp
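When working through stored BGP updates, the window and prefix from slide 8 make a natural first filter. This is a rough sketch only: the function name and the idea of filtering on (timestamp, prefix) pairs are assumptions, not a description of how the RIPE NCC analysis was done.

```python
import ipaddress
from datetime import datetime, timezone

# Window and prefix as stated on the slide.
WINDOW_START = datetime(2010, 8, 27, 8, 41, tzinfo=timezone.utc)
WINDOW_END = datetime(2010, 8, 27, 9, 8, tzinfo=timezone.utc)
EXPERIMENT_PREFIX = ipaddress.ip_network("93.175.144.0/24")


def is_experiment_update(timestamp: datetime, prefix: str) -> bool:
    """True if an update for the experiment prefix falls inside the window."""
    same_prefix = ipaddress.ip_network(prefix) == EXPERIMENT_PREFIX
    return same_prefix and WINDOW_START <= timestamp <= WINDOW_END


window_minutes = (WINDOW_END - WINDOW_START).total_seconds() / 60
print(window_minutes)  # 27.0
```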
  9. Propagation of the announcement (diagram: other routers)
  10. Propagation of the announcement (diagram: RIS (AS65550), RIS @ AMS-IX (AS12654), other routers)
  11. Propagation of the announcement (diagram: RIS (AS65550), RIS @ AMS-IX (AS12654), other routers, one faulty router)
  12. Propagation of the announcement (diagram: RIS (AS65550), RIS @ AMS-IX (AS12654), other routers, one faulty router)
  13. Propagation of the announcement (diagram: RIS (AS65550), RIS @ AMS-IX (AS12654), other routers, one faulty router)
  14. Propagation of the announcement (diagram: other routers, one faulty router)
  15. Goal of the experiment
      • A research group from Duke University approached RIPE NCC for help
      • Their goal was to measure support for long optional transitive attributes
        - Intended to be used for certificates for secure routing
      • They did not have an AS number or addresses
      • They provided RIPE NCC with a patched Quagga
  16. Expected results
      A. The route propagates with the attribute intact
      B. The route propagates, with some AS in the path removing the attribute
      C. The route propagates, but takes a different path because some ASes drop the route
      A and B were seen in 4-byte AS number tests.
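The three expected outcomes on slide 16 could be told apart at a vantage point roughly as follows. This sketch assumes a control announcement without the attribute is available for path comparison, which the slides do not state; the `NOT_SEEN` case and all names are added here for illustration, and the AS numbers in the usage lines are documentation values.

```python
from enum import Enum
from typing import Optional, Sequence


class Outcome(Enum):
    INTACT = "A"       # attribute arrives unchanged
    STRIPPED = "B"     # route arrives, attribute removed on the way
    REROUTED = "C"     # route arrives over a different path
    NOT_SEEN = "none"  # added case: route never reaches this vantage point


def classify(observed_path: Optional[Sequence[int]],
             control_path: Sequence[int],
             attribute_present: bool) -> Outcome:
    """Map one vantage point's view of the route to an expected outcome."""
    if observed_path is None:
        return Outcome.NOT_SEEN
    if list(observed_path) != list(control_path):
        return Outcome.REROUTED
    return Outcome.INTACT if attribute_present else Outcome.STRIPPED


control = [64496, 64497, 64498]
print(classify(control, control, True))              # Outcome.INTACT
print(classify([64496, 64499, 64498], control, True))  # Outcome.REROUTED
```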
  17. Impact of the experiment on the Internet
  18. Unstable prefixes. Chart: percentage of total prefixes (320000) on a 0% to 100% scale, against time (UTC) from 8:00 to 9:50.
  19. E-mails per hour - 28-29 August. Chart: mails per hour (0 to 60) against time (UTC) over two days, showing traffic on AMS-IX, LINX & NANOG. Marked events: initial RIPE NCC announcement / first AMS-IX post, first LINX post, first NANOG post.
  20. Unstable prefixes. Chart: percentage of total prefixes (320000) on a 0% to 1.5% scale, against time (UTC) from 8:00 to 9:50.
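A metric like the one plotted on the "Unstable prefixes" slides could be computed along these lines. The slides do not define "unstable", so this sketch assumes it means a prefix with at least one update inside a time bin; the 10-minute bin width and the input shape are also assumptions.

```python
from collections import defaultdict

TOTAL_PREFIXES = 320_000  # total table size given on the slide


def unstable_percentage(updates, total_prefixes=TOTAL_PREFIXES, bin_minutes=10):
    """Percentage of prefixes with at least one update, per time bin.

    updates: iterable of (minute_of_day, prefix) pairs.
    Returns {bin_start_minute: percentage}.
    """
    bins = defaultdict(set)
    for minute, prefix in updates:
        bins[minute - minute % bin_minutes].add(prefix)
    return {start: 100.0 * len(prefixes) / total_prefixes
            for start, prefixes in sorted(bins.items())}


# Made-up sample: 521 is 08:41 UTC expressed as minutes since midnight.
sample = [(521, "192.0.2.0/24"), (522, "192.0.2.0/24"), (523, "198.51.100.0/24")]
print(unstable_percentage(sample, total_prefixes=1000))
```

Counting distinct prefixes rather than raw updates keeps a single flapping prefix from dominating the figure.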
  21. Length of invisibilities. Chart: percentage of total prefixes (320000) on a 0.00% to 0.15% scale, against duration of invisibility from 0 to 100 minutes. Series: 8-10 UTC, July 30, 2010 (total: 0.24%); 8-10 UTC, Aug 20, 2010 (total: 0.11%); 8-10 UTC, Aug 26, 2010 (total: 0.26%); 8-10 UTC, Aug 27, 2010 (total: 0.69%).
  22. Length of invisibilities. Chart: same axes and series as slide 21.
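The durations plotted on the "Length of invisibilities" slides could be derived from announce and withdraw events roughly as below. This is a simplified sketch that assumes a single merged, time-ordered event stream and treats a prefix as invisible from its first withdrawal until its next announcement; how the RIS data was actually aggregated across peers is not stated in the slides.

```python
def invisibility_durations(events):
    """Return (prefix, minutes_invisible) for each completed invisibility.

    events: time-ordered iterable of (minute, prefix, kind),
    where kind is "A" (announce) or "W" (withdraw).
    """
    withdrawn_at = {}
    durations = []
    for minute, prefix, kind in events:
        if kind == "W":
            # Keep the first withdrawal; repeats do not restart the clock.
            withdrawn_at.setdefault(prefix, minute)
        elif kind == "A" and prefix in withdrawn_at:
            durations.append((prefix, minute - withdrawn_at.pop(prefix)))
    return durations


# Made-up sample events.
sample = [(0, "192.0.2.0/24", "W"), (7, "192.0.2.0/24", "A")]
print(invisibility_durations(sample))  # [('192.0.2.0/24', 7)]
```

Prefixes that are still withdrawn when the event stream ends are left out, so the result only covers invisibilities that ended within the observed period.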
  23. Critical DNS infrastructure (from DNSMON)
      • Root servers unaffected
      • 57% of TLDs unaffected
      • Minor effects for 38% of the TLDs
        - Some dropped queries for one or two servers
      • More significant effects on 5% of the TLDs
  24. Critical DNS infrastructure (figure)
  25. View from a TTM probe in Prague, CZ (figure)
  26. Updates for IPv4 vs IPv6. Chart: updates per minute per 1000 prefixes (0 to 30) against time (UTC) from 8:00 to 9:50. Series: IPv4, IPv6.
  27. Updates for IPv4 vs IPv6. Chart: same axes and series as slide 26, with the annotation "Most affected BGP sessions did not carry IPv6 routes".
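The "per 1000 prefixes" normalisation on the IPv4 vs IPv6 slides lets two tables of very different size share one axis. A minimal sketch of that normalisation follows; the input shape and the table sizes in the usage example are made up for illustration (the slides give only the 320000 total used elsewhere in the deck).

```python
import ipaddress
from collections import Counter


def updates_per_1000(updates, table_sizes):
    """Updates per minute per 1000 prefixes, split by IP version.

    updates: iterable of (minute, prefix) pairs.
    table_sizes: {4: number_of_ipv4_prefixes, 6: number_of_ipv6_prefixes}.
    Returns {(minute, ip_version): normalised_rate}.
    """
    counts = Counter()
    for minute, prefix in updates:
        version = ipaddress.ip_network(prefix).version
        counts[(minute, version)] += 1
    return {key: 1000.0 * n / table_sizes[key[1]] for key, n in counts.items()}


# Made-up sample using documentation prefixes and hypothetical table sizes.
sample = [(0, "192.0.2.0/24"), (0, "198.51.100.0/24"), (0, "2001:db8::/32")]
print(updates_per_1000(sample, {4: 2000, 6: 500}))
```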
  28. Unstable prefixes vs number of updates. Chart: updates per minute per 1000 prefixes (0 to 30) against time (UTC) from 8:00 to 9:50.
  29. View from a TTM probe in Prague, CZ (figure)
  30. Locality of effects - updates. Chart: BGP updates on all RIS locations (IPv4), average updates/sec per full table peer (0 to 1000), from 8:25 AM to 9:35 AM. Locations: LINX, London; AMS-IX/NL-IX/GN-IX, Amsterdam; CIXP, Geneva; VIX, Vienna; DIX-IE, Tokyo; Netnod, Stockholm; MIX, Milan; NYIIX, New York; DE-CIX, Frankfurt; MSK-IX, Moscow; PAIX, Palo Alto; PTT, Sao Paulo; NOTA, Miami.
  31. Locality of effects - withdrawals. Chart: BGP withdrawals on all RIS locations (IPv4), average updates/sec per full table peer (0 to 400), from 8:25 AM to 9:35 AM. Locations: same as slide 30.
  32. Locality of effects - vendors per IX. Pie charts of vendor share (Cisco, Juniper, Brocade, Intel, Other) for LINX, NYIIX, AMS-IX, DE-CIX and VIX.
  33. Lessons learned
      • Future experiments should be pre-announced with sufficient lead time
      • Detected vulnerabilities should be handled with more care
      • More comprehensive impact assessments are needed
      • Your input is welcome: <ris@ripe.net>
  34. Questions? Erik Romijn <eromijn@ripe.net>
  35. Updates per prefix range. Chart: updates per 1000 prefixes per minute (0 to 3000) against time (UTC) from 8:00 to 9:50. Series: updates for prefixes 0-90, updates for prefixes 100-255.
  36. AS path length. Chart: average AS path length in updates (0 to 5) against time (UTC) from 8:00 to 10:50.
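The average plotted on the "AS path length" slide could be computed per time bin as sketched below. The input shape, the 10-minute bin and the choice to count every AS in the path (including prepends) are assumptions; the AS numbers in the example are documentation values.

```python
from collections import defaultdict


def average_path_length(updates, bin_minutes=10):
    """Average AS path length of updates, per time bin.

    updates: iterable of (minute_of_day, as_path), with as_path a
    space-separated string such as "64496 64497 64498".
    Returns {bin_start_minute: average_length}.
    """
    totals = defaultdict(lambda: [0, 0])  # bin start -> [sum of lengths, count]
    for minute, as_path in updates:
        start = minute - minute % bin_minutes
        totals[start][0] += len(as_path.split())
        totals[start][1] += 1
    return {start: length_sum / count
            for start, (length_sum, count) in sorted(totals.items())}


# Made-up sample: 480 is 08:00 UTC expressed as minutes since midnight.
sample = [(480, "64496 64497 64498"), (485, "64496"), (490, "64496 64497")]
print(average_path_length(sample))  # {480: 2.0, 490: 2.0}
```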
