Presentation given at the Géant3 NMS workshop (GN3/NA3/T4 Campus Best Practices) in Belgrade, 20th October 2009
An overview of where NAV came from, what it does and how it is developed.
1. The Campus NMS tool NAV
GN3 Network monitoring workshop
Belgrade, 20th October 2009
Morten Brekkevold
2. 2
What is NAV?
Network Administration Visualized
A network monitoring software system
Free software, licensed under GPLv2
Developed in Norway to meet the
requirements of campus network
operators in Norwegian higher education
Focused on network infrastructure
3. 3
1999: Inception
Need to monitor the campus network
HP OpenView tested and thrown out
Internal development of tailored tools
began
4. 4
Early features
Collect router port data (SNMP)
Auto-create MRTG/Cricket config
Status monitor (ping) with alerts
ARP collector
Web reports and topological traffic
map
SNMP trap daemon with alerts
5. 5
2001: Enter UNINETT
Sponsoring 50% of development costs
Contingent on results being made
available for all Norwegian universities
and university colleges
Resulted in version 2, released in 2002
8. 8
Today
An estimated 35 universities and
university colleges in Norway use NAV
(nearly all)
30 of these local installations operated
centrally by UNINETT
Reported usage from universities and
businesses in Italy, Romania, Russia,
Switzerland, UK, USA and Denmark
188 subscribers to the nav-users
mailing list
9. 9
What does it do?
Inventory & topology
Status monitoring & alerts
Client machine tracking and detention
Statistics and graphing
13. 13
Start monitoring
No device autodiscovery
“Seed the database”
[Diagram: getDeviceData, PostgreSQL]
14. 14
Inventory
For each device, get:
Modules
Serial numbers
Interfaces
IP addresses and prefixes
IPv4 and IPv6
15. 15
getDeviceData
Plugin based SNMP collector
Data is collected from devices in
parallel, using threads
Collects full inventory every 6 hours,
by default
Module monitor plugin is invoked every
hour, by default
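As a rough illustration of the plugin-and-threads idea on this slide, the sketch below runs a list of plugin callables against several devices in parallel. It is not NAV's actual code: the plugin functions, the device names and the `collect_all` helper are hypothetical, and the SNMP queries are replaced by stub return values.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical plugins: each takes a device name and returns a dict of findings.
# A real collector would issue SNMP queries here; these are stubs.
def interfaces_plugin(device):
    return {"interfaces": [f"{device}-if1", f"{device}-if2"]}

def modules_plugin(device):
    return {"modules": [f"{device}-module0"]}

PLUGINS = [interfaces_plugin, modules_plugin]

def collect_device(device, plugins=PLUGINS):
    """Run every plugin against one device and merge the results."""
    result = {}
    for plugin in plugins:
        result.update(plugin(device))
    return device, result

def collect_all(devices, max_workers=4):
    """Collect from all devices in parallel using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(collect_device, devices))

if __name__ == "__main__":
    for name, data in collect_all(["sw1", "sw2", "gw1"]).items():
        print(name, data)
```

Scheduling (for example a full inventory run every 6 hours and a module check every hour) would sit on top of a function like `collect_all`; it is left out here to keep the sketch short.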
23. 23
Traffic statistics
NAV autoconfigures
Cricket
A third party tool for
collecting and displaying
time-series data, using
RRDtool
We poll interface counters,
CPU load values, memory
usage, temperature, etc.
25. 25
Withholding alerts
Shadow
Uses topology to see that a device is
unreachable because of another device
being down
Scheduled maintenance
Purposely withhold alerts from devices
on scheduled maintenance
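One possible way to express the two withholding rules in code is sketched below. It assumes the topology is available as an adjacency dictionary and that the monitoring server itself is a node in it; the function names and the state labels ("down", "shadow", "maintenance") are illustrative choices, not NAV's real implementation.

```python
from collections import deque

def reachable_up_devices(topology, root, unresponsive):
    """Breadth-first search from the monitor, walking only through responsive devices."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in topology.get(node, ()):
            if neighbor not in seen and neighbor not in unresponsive:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

def classify(topology, root, unresponsive, on_maintenance=frozenset()):
    """Label each unresponsive device as 'maintenance', 'down' or 'shadow'."""
    reachable = reachable_up_devices(topology, root, unresponsive)
    result = {}
    for device in unresponsive:
        if device in on_maintenance:
            # Alerts are withheld on purpose for scheduled maintenance.
            result[device] = "maintenance"
        elif any(n in reachable for n in topology.get(device, ())):
            # A reachable neighbor exists, so this device itself has failed.
            result[device] = "down"
        else:
            # Only reachable through other failed devices: in their shadow.
            result[device] = "shadow"
    return result

if __name__ == "__main__":
    topology = {
        "monitor": ["gw"],
        "gw": ["monitor", "sw1", "sw3"],
        "sw1": ["gw", "sw2"],
        "sw2": ["sw1"],
        "sw3": ["gw"],
    }
    print(classify(topology, "monitor", {"sw1", "sw2"}))
```

In this example only `sw1` would raise an alert; `sw2` sits behind it and is classified as being in shadow.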
26. 26
Alert profiles
Each user can have multiple personal
alert profiles
A profile defines:
What alerts to subscribe to
When to receive said alerts
Where to receive said alerts
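The what/when/where structure of a profile could be modelled roughly as follows. This is a minimal sketch under assumptions: the `AlertProfile` class, the event type names and the address strings are invented for illustration, and time windows are assumed not to cross midnight.

```python
from dataclasses import dataclass
from datetime import time

@dataclass
class AlertProfile:
    event_types: set   # what: alert types the user subscribes to
    start: time        # when: start of the delivery window
    end: time          # when: end of the delivery window (exclusive)
    address: str       # where: destination for matching alerts

    def matches(self, event_type, at):
        """True if this profile wants the given alert type at the given time of day."""
        return event_type in self.event_types and self.start <= at < self.end

def destinations(profiles, event_type, at):
    """Return the addresses that should receive this alert."""
    return [p.address for p in profiles if p.matches(event_type, at)]

if __name__ == "__main__":
    profiles = [
        AlertProfile({"boxDown"}, time(8), time(16), "mailto:noc@example.org"),
        AlertProfile({"boxDown", "moduleDown"}, time(16), time(23, 59), "sms:+4700000000"),
    ]
    print(destinations(profiles, "boxDown", time(9, 30)))
```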
29. 29
Campus abuse handling
NAV offers two useful and popular
tools for campus abuse handling
Client machine tracking
Client machine detention
Why are these popular?
Student villages often connected
directly to university network
Students are naughty
30. 30
Client machine tracking
NAV logs CAM table entries for all
switch access ports
By combining ARP (and IPv6 neighbor
discovery) data with CAM data, any client
machine's access port can be found
from its IP address
Similarly, we can count or list the
active end users on a switch
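The lookup described on this slide amounts to joining two tables: IP to MAC (from ARP or neighbor discovery) and MAC to switch port (from CAM). A minimal sketch, using made-up in-memory dictionaries in place of the logged data and hypothetical function names:

```python
# Hypothetical snapshots of collected data; addresses and port names are invented.
ARP = {
    "10.0.0.5": "00:11:22:33:44:55",
    "10.0.0.9": "00:11:22:33:44:66",
    "2001:db8::9": "00:11:22:33:44:66",  # same machine seen via IPv6 neighbor discovery
}
CAM = {
    "00:11:22:33:44:55": ("sw1", "Gi0/3"),
    "00:11:22:33:44:66": ("sw2", "Gi0/7"),
}

def find_access_port(ip, arp=ARP, cam=CAM):
    """Map an IP address to (switch, port) via its MAC address, or None if unknown."""
    mac = arp.get(ip)
    if mac is None:
        return None
    return cam.get(mac)

def active_clients(switch, cam=CAM):
    """List the MAC addresses currently seen on access ports of one switch."""
    return sorted(mac for mac, (sw, _port) in cam.items() if sw == switch)

if __name__ == "__main__":
    print(find_access_port("10.0.0.5"))
    print(len(active_clients("sw2")))
```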
32. 32
Client machine detention
Given an IP address, we can:
Block the client machine's access port
(interface shutdown)
Switch access port to a quarantine
VLAN with limited access
33. 33
Detention case history
Complete case history
Detention reason
Target MAC address
Option to “pursue” detainee
Option to automatically repeal
detention after set time has passed
34. 34
Automated detention runs
Scan client IP ranges for vulnerabilities
known to disrupt network
Feed list of vulnerable IP addresses to
NAV, supplying a reference reason
Clients are blocked according to
preconfigured settings in reference
reason
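To make the idea of a reference reason carrying preconfigured settings concrete, here is a small sketch. The `Reason` and `Detention` classes, the action names and the example reason are assumptions made for illustration; the actual port shutdown or VLAN change is not shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass(frozen=True)
class Reason:
    name: str
    action: str                      # "block" or "quarantine"
    duration: Optional[timedelta]    # None means no automatic repeal

@dataclass
class Detention:
    ip: str
    reason: str
    action: str
    expires: Optional[datetime]

def plan_detentions(ips, reason, now):
    """Create one detention record per IP, using the settings stored in the reason."""
    expires = now + reason.duration if reason.duration is not None else None
    return [Detention(ip, reason.name, reason.action, expires) for ip in ips]

def expired(detentions, now):
    """Return detentions whose set time has passed and can be repealed."""
    return [d for d in detentions if d.expires is not None and d.expires <= now]

if __name__ == "__main__":
    now = datetime(2009, 10, 20, 12, 0)
    worm = Reason("worm-scan", "quarantine", timedelta(days=1))
    for d in plan_detentions(["10.0.0.5", "10.0.0.9"], worm, now):
        print(d)
```

A scanner would only need to supply the IP list and the reason name; what happens to each client then follows from the reason's stored settings.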
35. 35
Implementation
Began as a hodge-podge mix of scripts
First use Perl and PHP
Add some Java
Then some more Java
Then throw in Python for good
measure
What a mess!?
36. 36
Development model
Many summer interns (students)
More than 30 people involved over a
span of 10 years of development
Always a new “favorite” programming
language
Turnover is a problem for code
maintenance
37. 37
Integration and cleanup
Since 2003, new programming
languages were forbidden
Code cleanup, rewrites and
encouraging API building
Reducing number of languages
PHP is out
Perl is almost out (1 program left)
Java accounts for nearly 50%, but is
very slowly on its way out
38. 38
Active developers
UNINETT
1 full-time employee (me)
4 part-time students
NTNU
1 person, 25% of the time
University of Tromsø
2 people, ad-hoc
University of Oslo
Packaging for Debian GNU/Linux
39. 39
Development tools
Launchpad
Bug and specification tracking
Mercurial
Distributed version control
Emacs, Vim, Eclipse, etc.
Sympa
Mailing list software
Dokuwiki
Wiki-based web site
40. 40
Future plans
Currently working on next-generation
collection framework
Working on improved environment
and UPS monitoring
LLDP support for topology
Integrate Geomap
SNMPv3 and/or Netconf