Presentation at LACNIC21 by Mat Ford on some Internet Society projects that are underway relating to the resilience and security of the Internet routing system.
2. The Challenge
Economic factors
– Externalities, information asymmetry, free riding
Technical factors
– Technology building blocks
– Common understanding of the problem
– Common understanding of solutions
Social factors
– Collective responsibility
– Collaborative spirit
3. Global Internet Routing Infrastructure
Our global commons
– We all depend on and benefit from it
Far reaching effects
– Configuration errors, malicious actors
– Example: Indosat event
Interconnectivity and interdependence
– “Inward” and “Outward” risks
– Example: 300Gbps attack on Spamhaus
5. How “risky” is the global routing system?
How often do incidents happen?
– Routing Resilience Measurements Workshop
http://www.internetsociety.org/doc/report-routing-resiliency-measurements-workshop
– Frequency very much depends on the threshold for false positives
What is the impact?
– Data are missing, sensitive or not collected at all
– Risk assessment is a guess at best
Is your network affected?
– Detect incidents
– Eliminate false positives
– Assess the impact
Are you adequately protected?
7. Data collection
Network Information
– Once, during the initial sign up.
– Network type, connectivity, and practices used in mitigating routing
security incidents. It should take approximately 10-15 minutes to fill out
the registration form.
Data related to routing security incidents via an automated
monitoring effort
– On first login a “historical” overview is presented, listing suspicious
events detected over the last 6-12 months
– After that, newly detected suspicious events are collected once a week
and displayed in the portal
– Participants are asked to validate and classify these events
Impact: severe, moderate, insignificant, not an incident
Detection: monitoring system, customer call, this alert
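The classification asked of participants can be pictured as a small record. This is a minimal sketch in Python, not the portal's actual data model, which is not described here. The names `Impact`, `Detection`, `SuspiciousEvent`, `classify` and the `summary` field are hypothetical; only the category labels come from the slide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Impact(Enum):
    # Impact categories as listed on the slide
    SEVERE = "severe"
    MODERATE = "moderate"
    INSIGNIFICANT = "insignificant"
    NOT_AN_INCIDENT = "not an incident"


class Detection(Enum):
    # Detection categories as listed on the slide
    MONITORING_SYSTEM = "monitoring system"
    CUSTOMER_CALL = "customer call"
    THIS_ALERT = "this alert"


@dataclass
class SuspiciousEvent:
    asn: int                               # AS number the event relates to
    summary: str                           # hypothetical free-text description
    impact: Optional[Impact] = None        # None until a participant classifies it
    detection: Optional[Detection] = None  # how the participant learned of it

    @property
    def classified(self) -> bool:
        return self.impact is not None


def classify(event: SuspiciousEvent, impact: Impact,
             detection: Optional[Detection] = None) -> SuspiciousEvent:
    """Record a participant's validation of one event (mutates and returns it)."""
    event.impact = impact
    event.detection = detection
    return event


# AS 64500 is from the range reserved for documentation.
event = SuspiciousEvent(asn=64500, summary="unexpected origin for a monitored prefix")
classify(event, Impact.MODERATE, Detection.CUSTOMER_CALL)
print(event.classified, event.impact.value, event.detection.value)
```

Keeping `impact` empty until a participant responds makes it easy to separate classified from unclassified events later.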
9. Evidence based risk analysis
[Figure: diagram with items labelled 64500]
Check and Classify
10. Confidentiality concerns
We understand the sensitivity of some of the data
involved in this effort. Therefore, the Internet Society is
committed to ensuring participant-specific information
remains confidential.
All data collected is stored on Internet Society servers.
Any information or analyses shared beyond a specific
network will be fully anonymized.
14. How did you learn about the event?
NMS Alert
Customer Call
RRS Alert
Not an incident
15. Interested in Participating?
If you decide to participate, please send a request for
the creation of your account to rrs-admin@isoc.org.
In the request please indicate
– your AS number and
– e-mail address for notifications.
You may also include AS numbers of your customers for
which you would like to monitor and classify related
security incidents.
17. Routing Resilience Manifesto
- Principles of addressing issues of routing
resilience
- Interdependence and reciprocity (including collaboration)
- Commitment to Best Practices
- Encouragement of customers and peers
- Guidelines indicating the most important
requirements
- BGP Filtering
- Anti-spoofing
- Coordination and collaboration
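To make the first two guidelines concrete, here is a minimal sketch in Python of the checks they imply. It is an illustration only, not text from the manifesto and not router configuration. The function names and the example prefixes (taken from the documentation ranges) are assumptions.

```python
import ipaddress
from typing import Iterable


def prefix_allowed(announced: str, allowed: Iterable[str]) -> bool:
    """BGP filtering idea: accept an announcement only if the prefix is equal
    to, or more specific than, a prefix registered for that customer."""
    net = ipaddress.ip_network(announced)
    for entry in allowed:
        allowed_net = ipaddress.ip_network(entry)
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so compare versions first
        if net.version == allowed_net.version and net.subnet_of(allowed_net):
            return True
    return False


def source_allowed(src: str, allowed: Iterable[str]) -> bool:
    """Anti-spoofing idea: accept a packet only if its source address lies
    inside a prefix the customer is known to use."""
    addr = ipaddress.ip_address(src)
    return any(addr in ipaddress.ip_network(entry) for entry in allowed)


customer_prefixes = ["192.0.2.0/24"]  # hypothetical customer allocation

print(prefix_allowed("192.0.2.128/25", customer_prefixes))   # more specific
print(prefix_allowed("198.51.100.0/24", customer_prefixes))  # unrelated prefix
print(source_allowed("192.0.2.7", customer_prefixes))        # inside the allocation
print(source_allowed("203.0.113.9", customer_prefixes))      # outside the allocation
```

A real filter would normally also cap the accepted prefix length and be built from registry data. Both are left out of this sketch.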
19. Objectives
• Raise awareness and encourage actions
by demonstrating commitment of the
growing group of supporters
• Demonstrate industry ability to address
complex issues
• Provide guidance
We’re trying to answer the question ‘how risky is the global routing system?’ The answer is important for understanding the motivation to take measures to protect the routing system. It is also a hard question to answer.
There are two components. The first: how often do incidents happen? We convened a measurements workshop in 2012 to share data. The data vary, and it is hard to differentiate fat-finger mistakes from malicious behaviour.
The second component: what is the impact? This is even trickier to answer, because the data are sensitive and impact cannot be observed from BGP tables. There is little correlation between the observable characteristics of events and their impact.
So, we designed the Routing Resilience Survey. We ask operators to participate and classify events related to their networks.
This is the portal where people can log in and classify events.
At sign-up we ask for some characteristics of the network. These are relatively easy to answer and take about 10 minutes.
Thereafter, participants get weekly reports of any detected suspicious events and are asked:
- what the impact was
- how it was detected: customer call, own monitoring system, or the alert from RRS
This is the kind of report participants get. We are partnering with BGPmon.net – reports are generated by BGPmon.
Participants are asked to check events and classify them.
We understand that this is sensitive data. We are partnering with BGPmon.net, a monitoring service that has several vantage points and detects changes to BGP tables.
All sensitive information, including classification, is stored by ISOC. This approach has allowed the project to proceed, since ISOC is perceived to be neutral.
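The kind of change detection mentioned above can be illustrated with a toy comparison of two routing-table snapshots. This is a minimal sketch in Python and not BGPmon's actual method, which is not described here. The function name, the snapshot format (prefix mapped to origin AS) and the example values are all assumptions.

```python
from typing import Dict, List, Tuple


def origin_changes(before: Dict[str, int],
                   after: Dict[str, int]) -> List[Tuple[str, int, int]]:
    """Return (prefix, old_origin, new_origin) for every prefix present in
    both snapshots whose origin AS differs between them."""
    changes = []
    for prefix, new_origin in sorted(after.items()):
        old_origin = before.get(prefix)
        if old_origin is not None and old_origin != new_origin:
            changes.append((prefix, old_origin, new_origin))
    return changes


# Hypothetical snapshots using documentation prefixes and AS numbers
before = {"192.0.2.0/24": 64500, "198.51.100.0/24": 64501}
after = {"192.0.2.0/24": 64502, "198.51.100.0/24": 64501, "203.0.113.0/24": 64500}

for prefix, old, new in origin_changes(before, after):
    print(f"{prefix}: origin changed from AS{old} to AS{new}")
```

An origin change like this is only suspicious, not proof of an incident. It could be a legitimate configuration change, which is why operators are asked to classify each event.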
This is work in progress – started in November 2013. Initially intended to run for 6 months, but will run for longer to get more statistically relevant data.
When participants join they are presented with historical data, partly as a teaser to encourage participation, but also to try to obtain some historical classifications. Hence charts extend back in time before RRS started.
Participants bring customer networks, so the number of networks exceeds the number of participants.
Not all events are classified (different levels of enthusiasm from participants) but more than half are classified. Very grateful to participants for this considerable effort.
These are some preliminary results with no analysis yet. We will analyse the data and publish a report after the conclusion of the project.
Here is some data on impact severity. The large green share is false positives, but there are also some red and orange events, and they aren’t all that infrequent.
42% of events are unclassified. If they were classified, more than half of all events would probably turn out not to be incidents (for example a configuration change, or adding a new customer).
4% of events have some noticeable impact, severe or moderate.
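As a small worked calculation, the two figures can be combined to express impactful events as a share of classified events only. This is a sketch in Python. It assumes both percentages refer to all detected events, which the notes do not state explicitly. The function name is hypothetical.

```python
def share_among_classified(impact_pct: float, unclassified_pct: float) -> float:
    """Fraction of classified events that had noticeable impact, assuming
    both inputs are percentages of all detected events."""
    classified_pct = 100.0 - unclassified_pct
    if classified_pct <= 0:
        raise ValueError("no classified events")
    return impact_pct / classified_pct


share = share_among_classified(impact_pct=4, unclassified_pct=42)
print(f"{share:.1%}")  # 4 / 58, roughly 6.9%
```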
Looking at how participants learned about these events.
Customer calls are the prevailing method of detection. So our attitude to routing security is very reactive, not really proactive.
RRS alert is visible as well, which is interesting (these are alerts generated by our system).
So, I have a request for you all. If you could participate in this project, please do! We’re still happy to receive more participants. If you know someone who could participate, please encourage them to do so.
We already have fairly global participation, but we’d like more participants to get a more statistically representative picture of what is going on.
Provide a framework for ISPs to better understand and help address issues related to resilience and security of the Internet global routing system
– In a practical sense: not overly idealistic but realistic, something that "good" netizens can subscribe to
– But include a picture of what "good" looks like as an aspirational goal
Encourage ISPs to take measures aimed at improving the resiliency and security of the routing system
Demonstrate industry potential in addressing issues of resilience and security of the Internet global routing system in the spirit of collective responsibility