Route leaks occur when improper network prefixes are advertised and propagated throughout the network, leading to packet loss. An automated approach to detecting and correcting route leaks is therefore beneficial. In this research, an enterprise environment was emulated with Autonomous Systems (AS) running Border Gateway Protocol (BGP) and a server storing route telemetry data. A network design was implemented that detects route leaks efficiently and, with the engineer's approval, blocks the leaked traffic by using prefix lists on edge devices. The proposed detection mechanism considers factors such as BGP origin AS, timestamp, and network prefix. The Routing Information Base (RIB) data is stored and analyzed in a database, from which the developed algorithm fetches the data for route leak detection, and the alerting system notifies the hijacked AS when a route leak is detected. The developed algorithm enables real-time route leak detection, alerts the respective enterprise, and blocks the leaked network prefix within the local enterprise upon the user's approval.
1. Data Analytics For Service Provider Networks
Dr. Levi Perigo, Academic Advisor
Dr. Kevin Gifford, Course Instructor
Dewang Gedia, Academic Advisor
Kartik Bhandary, Manasa Suresh, Rodney Manuel, Sandeep Surendher, Siddharth Shah, Sowmya Sundaram
2. Agenda
Introduction
Levels of Success
Concept of Operation
Implementation
Performance Results
Why our solution?
Future Work
Conclusion
3. Introduction
Problem
Improper network prefixes are propagated and preferred (route leaks)
Route leaks result in packet drops, congestion, and snooping
Causes
Border Gateway Protocol (BGP) was not designed with security in mind
How do route leaks happen?
(1) Route misconfiguration
(2) Prefix hijacking
Purpose
Reduce the impact of a route leak through rapid detection and correction
4. Introduction
Current solution
The prefix is identified by the person experiencing the problem
Routing tables are analyzed manually to identify the threat
Time required: 2 to 3 days
Proposed solution
Automated method of prefix hijack detection
Algorithm to determine whether a prefix announcement is valid or invalid
If invalid, take action with user approval
Time required: less than 10 minutes
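The valid/invalid decision in the proposed solution can be illustrated with a minimal sketch. It assumes the check compares each announcement's origin AS against a table of expected origins. The slides do not show the actual algorithm, so the table `EXPECTED_ORIGINS`, the function `classify_announcement`, and the documentation prefixes and private-use AS numbers below are all hypothetical.

```python
import ipaddress

# Hypothetical table of expected origins: prefix -> set of authorized origin ASNs.
EXPECTED_ORIGINS = {
    ipaddress.ip_network("203.0.113.0/24"): {64500},
    ipaddress.ip_network("198.51.100.0/24"): {64501, 64502},
}

def classify_announcement(prefix, origin_as, expected=EXPECTED_ORIGINS):
    """Return 'valid', 'invalid', or 'unknown' for one BGP announcement."""
    net = ipaddress.ip_network(prefix)
    # Expected prefixes that cover the announced one (exact match or less specific).
    covering = [
        p for p in expected
        if p.version == net.version and net.subnet_of(p)
    ]
    if not covering:
        return "unknown"  # nothing to compare against
    # Judge against the most specific covering entry.
    best = max(covering, key=lambda p: p.prefixlen)
    return "valid" if origin_as in expected[best] else "invalid"

print(classify_announcement("203.0.113.0/24", 64500))    # expected origin
print(classify_announcement("203.0.113.128/25", 64999))  # more specific, wrong origin
print(classify_announcement("192.0.2.0/24", 64500))      # not in the table
```

An "invalid" result would only raise an alert in this sketch; blocking still waits for user approval, as the slide states.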
5. Levels of Success
Level 1
1. Design a test environment with BGP peering
2. Integrate data with the analytics platform
3. Establish internal container connectivity
Level 2
1. Framework to collect routing information
2. Develop an algorithm for route leak detection
3. Design real-time alerting upon route leak detection
Level 3
1. Develop an algorithm to perform self-healing
2. Ensure the algorithm is scalable and reliable
3. Check the compatibility across various platforms
6. Concept of Operations
[Diagram: data flow in six numbered steps]
Components: Route View Collector, Telemetry Data Collector, Collector Pipeline, distributed streaming platform, algorithm, database (storing data), time analysis, visualization, user interface
Labeled flows: data sent for analyzing; check whether self-healing is needed; reply; self-healing configs
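The database stage of this flow can be sketched as follows, assuming RIB entries are stored as (prefix, origin AS, timestamp) rows and the detection step looks for prefixes seen with more than one origin. The slides do not name the database or its schema, so the in-memory SQLite table `rib`, its columns, and the function `find_origin_conflicts` are assumptions made for illustration.

```python
import sqlite3

def find_origin_conflicts(conn):
    """Map each prefix seen with more than one origin AS to its origins,
    listed as (origin_as, first_seen) in order of first appearance."""
    rows = conn.execute(
        """
        SELECT prefix, origin_as, MIN(ts) AS first_seen
        FROM rib
        GROUP BY prefix, origin_as
        ORDER BY prefix, first_seen
        """
    ).fetchall()
    by_prefix = {}
    for prefix, origin_as, first_seen in rows:
        by_prefix.setdefault(prefix, []).append((origin_as, first_seen))
    return {p: origins for p, origins in by_prefix.items() if len(origins) > 1}

# Hypothetical telemetry rows: (prefix, origin AS, UNIX timestamp).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rib (prefix TEXT, origin_as INTEGER, ts INTEGER)")
conn.executemany(
    "INSERT INTO rib VALUES (?, ?, ?)",
    [
        ("203.0.113.0/24", 64500, 1000),
        ("203.0.113.0/24", 64500, 1060),
        ("203.0.113.0/24", 64666, 1120),  # a new origin appears later
        ("198.51.100.0/24", 64501, 1000),
    ],
)
conflicts = find_origin_conflicts(conn)
print(conflicts)
```

Ordering by first-seen time makes the later origin the likely candidate for review, which matches the abstract's use of origin AS, timestamp, and prefix as detection factors.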
7. Implementation
Leak detection code output
Automated e-mail for alerting the NOC team
Web page to specify the user's choice of action
Corrective firewall rule
Monitoring origin AS vs. prefixes
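The corrective step can be sketched as generating prefix-list entries for the leaked prefixes, since the abstract states that blocking is done with prefix lists on edge devices. The slides do not show the generated configuration or name the router vendor. The Cisco IOS-style `ip prefix-list` syntax, the list name `BLOCK-LEAKED`, and the function `build_prefix_list` are therefore assumptions. The sketch handles IPv4 only.

```python
import ipaddress

def build_prefix_list(leaked_prefixes, name="BLOCK-LEAKED", start_seq=5, step=5):
    """Return IOS-style prefix-list lines that deny each leaked IPv4 prefix
    and permit everything else."""
    lines = []
    seq = start_seq
    for prefix in leaked_prefixes:
        # Raises ValueError on malformed input, so bad data never reaches a router.
        net = ipaddress.IPv4Network(prefix)
        lines.append(f"ip prefix-list {name} seq {seq} deny {net}")
        seq += step
    # Final catch-all entry so the list does not implicitly deny all other routes.
    lines.append(f"ip prefix-list {name} seq {seq} permit 0.0.0.0/0 le 32")
    return lines

for line in build_prefix_list(["203.0.113.0/24", "198.51.100.0/24"]):
    print(line)
```

In a workflow like the one described, these lines would be pushed to the edge routers only after the NOC engineer approves the action on the web page.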
8. Performance Results
Time performance
Detect a route leak: 3 minutes (average)
Confirm with the NOC team that it is a leak (manual): 5 minutes
Run the correction code to block the prefix (n = 4 routers): 1 minute
Overall time from detection to correction: less than 10 minutes
9. Performance Results – Google Route leak
212 prefixes were added by MAINONE
ASes shown: MAINONE, China Tel, JSC Comp
Without our application (74 min): 21:13 UTC to 22:27 UTC
With our application (10 min): 21:13 UTC to 21:23 UTC
Detect: 3 min, Confirm: 5 min, Correct: 1 min
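The durations on this slide can be checked from the timestamps with a short calculation. The helper `minutes_between` is introduced here only for illustration and assumes both times fall on the same day.

```python
from datetime import datetime

def minutes_between(start, end, fmt="%H:%M"):
    """Whole minutes from start to end, both given as HH:MM on the same day."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)

without_app = minutes_between("21:13", "22:27")  # timeline without the application
with_app = minutes_between("21:13", "21:23")     # timeline with the application
stage_total = 3 + 5 + 1                          # detect + confirm + correct

print(without_app, with_app, stage_total)
print(f"reduction: {1 - with_app / without_app:.0%}")
```

The three stages sum to 9 minutes, consistent with the "less than 10 minutes" figure. Against the 74-minute timeline, the 10-minute window is a reduction of roughly 86%.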
11. Future Work
Extend to a large-scale network environment
Monitor traffic flows to check for any sudden changes
Monitor AS paths to check the exact route taken by traffic
Run the correction code globally to affect the remote AS
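The AS-path monitoring idea could look something like the sketch below. It assumes a known-good baseline path is available for each prefix and reports ASes that newly appear in an observed path. This is future work in the slides, not something they implement, so the function `new_transit_ases` and the example AS numbers are hypothetical.

```python
def new_transit_ases(baseline_path, observed_path):
    """Return ASNs present in observed_path but absent from baseline_path,
    in path order and without duplicates (AS-path prepending repeats ASNs)."""
    known = set(baseline_path)
    seen = set()
    added = []
    for asn in observed_path:
        if asn not in known and asn not in seen:
            seen.add(asn)
            added.append(asn)
    return added

baseline = [64500, 64510, 64520]
observed = [64500, 64666, 64666, 64777, 64520]
print(new_transit_ases(baseline, observed))
```

A non-empty result would only be a signal for review, since legitimate path changes also introduce new transit ASes.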
12. Conclusion
Focused on improving one performance metric, latency, which is the time taken from detection to correction
Learnt a great deal about BGP metrics and route leaks, and about the kind of impact a route leak can have on a global level
Tried to keep the solution as simple as possible