In more than ten years of delivering VoIP solutions to customers, we at sipgate have gained a lot of experience in monitoring our VoIP setup. This talk gives an insight into how we monitor Asterisk, Kamailio, Yate and other vital parts of our setup with standard checks and our own scripts. We show not only how to monitor standard SIP, but also how to detect bottlenecks and malfunctions.
2. Who we are, what we do
• Düsseldorf-based VoIP provider (since 2004)
• active in Germany and UK
• Private and Business customers
• VoIP and Mobile products
• some 100k users
• almost 100 million minutes each month
3. VoIP systems monitored
• Asterisk (~100 servers)
• Kamailio (12 servers)
• Yate (12 servers)
• RTP Proxy (12 servers)
• ASR (http://en.wikipedia.org/wiki/Answer-seizure_ratio)
4. Our Monitoring systems
• 2 Icinga servers
• almost 1k hosts
• more than 5k services
• Cacti
• Observium for network monitoring
5. Monitoring SIP
• simple Perl script
• UDP capable (TCP and TLS coming soon)
• resolves SRV DNS records, checks all targets
• Watch the response code!
• different systems answer differently
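The check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Perl script from the talk: it builds a SIP OPTIONS request, sends it over UDP and evaluates the response code. The function names, the `monitor.invalid` sender and the mapping of response codes to states are assumptions; SRV resolution is left out. Because different systems answer differently, the set of accepted response codes is a parameter.

```python
import socket
import uuid

OK, WARNING, CRITICAL = 0, 1, 2  # usual plugin return codes


def build_options(host, port=5060, local="monitor.invalid"):
    """Build a minimal SIP OPTIONS request (hypothetical sender identity)."""
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    lines = [
        f"OPTIONS sip:{host}:{port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local};branch={branch};rport",
        "Max-Forwards: 70",
        f"From: <sip:check@{local}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{host}:{port}>",
        f"Call-ID: {uuid.uuid4().hex}@{local}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "", "",  # blank line terminates the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_code(data):
    """Return the numeric code from a status line like 'SIP/2.0 200 OK'."""
    first = data.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = first.split(" ", 2)
    if len(parts) >= 2 and parts[0].startswith("SIP/") and parts[1].isdigit():
        return int(parts[1])
    return None


def state_for_code(code, accepted=(200,)):
    """Map a response code to a state; 'accepted' is tuned per target system."""
    if code is None:
        return CRITICAL  # no (parseable) answer at all
    return OK if code in accepted else WARNING


def probe(host, port=5060, timeout=2.0):
    """Send one OPTIONS request over UDP and return the response code or None."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_options(host, port), (host, port))
            data, _ = sock.recvfrom(4096)
        except OSError:  # covers timeouts and unreachable targets
            return None
    return parse_status_code(data)
```

A target that legitimately answers OPTIONS with, say, 404 could then be checked with `state_for_code(probe(host), accepted=(200, 404))`.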
8. check_manager.pl
• first we called it separately for each service
• high load on the monitoring system
• now: one script fills all services
• only the MANAGER service is active, all others are passive (with fallback command)
• can be called as active check for each service
9. check_manager.pl
# Submit a passive check result through the Icinga external command file.
sub push_passive {
    my ($service, $state, $msg) = @_;
    my $timestamp = time;
    my $cmd;
    eval {
        open $cmd, ">>", $cmdfile or die $!;
    };
    if ($@) {
        print "Could not open Command file!\n" if (defined($opts_verbose));
        return;
    }
    # Format: [time] PROCESS_SERVICE_CHECK_RESULT;host;service;return code;output
    my $cmdmsg = sprintf("[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n",
        $timestamp, $opts_host, $service, $ERRORS{$state}, $msg);
    print $cmdmsg if (defined($opts_verbose));
    print {$cmd} $cmdmsg;
    close $cmd;
}
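To illustrate the "one script fills all services" idea, here is a small Python sketch (not the original Perl) that formats the results for several passive services in one run, so they can be appended to the command file with a single write. The host name `sipgw01`, the second service name and the function names are hypothetical; the line format matches the one used in `push_passive`.

```python
import time

STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def format_passive(host, service, state, msg, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time() if now is None else now)
    msg = msg.replace("\n", " ")  # a newline would end the command early
    return (f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{STATES[state]};{msg}\n")


def build_batch(host, results, now=None):
    """Turn {service: (state, message)} into one block of command lines."""
    return "".join(
        format_passive(host, service, state, msg, now)
        for service, (state, msg) in results.items()
    )


def push_batch(cmdfile, host, results):
    """Append all results to the command file with a single write."""
    with open(cmdfile, "a") as fh:
        fh.write(build_batch(host, results))


# Example data as it might come out of a single run of the active check.
example = {
    "AST_UPTIME": ("OK", "up 12 days"),
    "AST_CHANNELS": ("WARNING", "180 active channels"),  # hypothetical service
}
```

Calling `push_batch(cmdfile, "sipgw01", example)` would submit both results; the path of the command file depends on the Icinga installation.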
10. Integrating into Icinga
• Service Definition
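Active check (MANAGER runs the script and fills the passive services):
define service {
    service_description     MANAGER
    hostgroup_name          sipgw
    use                     local-service
    check_command           check_manager
}
Passive check (fed by MANAGER, with an active fallback command):
define service {
    service_description     AST_UPTIME
    hostgroup_name          sipgw
    use                     local-service
    check_command           check_manager_active!uptime   ; fallback, run only when the result is stale
    is_volatile             1
    active_checks_enabled   0
    passive_checks_enabled  1
    check_freshness         1
    max_check_attempts      1
    freshness_threshold     600   ; seconds without a passive result before the fallback runs
}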
12. Check Open Files
• consists of two scripts on the monitored system
• one script is run by cron every minute and writes its results to files
• the other script is triggered by SNMPd and reads those files
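The two-script split can be sketched like this in Python. It is only an illustration under assumptions: the talk does not say how the open files are counted or where the results are stored, so counting entries in a `/proc/<pid>/fd` directory, the file layout and all function names are hypothetical. The reader side is the part SNMPd would trigger, for example through an `extend` line in `snmpd.conf`.

```python
import os


def count_open_files(fd_dir):
    """Count entries in a /proc/<pid>/fd style directory (assumed method)."""
    return len(os.listdir(fd_dir))


def write_count(path, count):
    """Cron side: store the value, using a rename so readers never see a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(f"{count}\n")
    os.replace(tmp, path)


def read_count(path):
    """SNMPd side: cheap read of the value written by the cron script."""
    with open(path) as fh:
        return int(fh.read().strip())
```

The point of the split is that the potentially expensive counting runs once per minute from cron, while every SNMP query only reads a small file.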
13. Monitoring Kamailio
• SIP
• some variables through XMLRPC calls
• Memory
• TCP Connections
• Version
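As an illustration of reading Kamailio variables over XMLRPC, the following Python sketch queries shared memory usage and maps it to a check state. It assumes that Kamailio's XMLRPC interface is reachable at the given URL and that the `core.shmmem` RPC returns a struct with `total` and `real_used` fields; the URL, the thresholds and the function names are assumptions, not taken from the talk.

```python
import xmlrpc.client

OK, WARNING, CRITICAL = 0, 1, 2  # usual plugin return codes


def shm_usage_percent(shm):
    """Percentage of shared memory in use, from a core.shmmem style struct."""
    return 100.0 * shm["real_used"] / shm["total"]


def state_for_usage(percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a state (thresholds are examples)."""
    if percent >= crit:
        return CRITICAL
    if percent >= warn:
        return WARNING
    return OK


def check_kamailio_memory(url):
    """Query Kamailio over XMLRPC; the URL is installation specific."""
    proxy = xmlrpc.client.ServerProxy(url)
    shm = proxy.core.shmmem()  # assumed to return total/real_used
    percent = shm_usage_percent(shm)
    return state_for_usage(percent), f"shared memory {percent:.1f}% used"
```

Other values such as the version or TCP connection counts could be fetched the same way through their respective RPC commands.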
19. Monitoring ASR
• answered calls as a percentage of total calls
• additionally: length of answered calls
• per Gateway, Carrier, Destination, Product
• in Yate: configurable and readable via SNMP
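The two figures above can be computed as in this short Python sketch: ASR is answered calls divided by total calls, times 100, and the average length is taken over answered calls only. The record layout (`gateway`, `answered`, `duration` in seconds) and the function name are assumptions for illustration; the same grouping works for carrier, destination or product.

```python
from collections import defaultdict


def asr_by_key(calls, key):
    """Return {group: {"asr": percent, "acd": seconds}} grouped by 'key'."""
    stats = defaultdict(lambda: {"total": 0, "answered": 0, "seconds": 0})
    for call in calls:
        s = stats[call[key]]
        s["total"] += 1
        if call["answered"]:
            s["answered"] += 1
            s["seconds"] += call["duration"]
    result = {}
    for group, s in stats.items():
        asr = 100.0 * s["answered"] / s["total"]
        # average call duration over answered calls only
        acd = s["seconds"] / s["answered"] if s["answered"] else 0.0
        result[group] = {"asr": asr, "acd": acd}
    return result


# Hypothetical call records
calls = [
    {"gateway": "gw1", "answered": True, "duration": 60},
    {"gateway": "gw1", "answered": False, "duration": 0},
    {"gateway": "gw1", "answered": True, "duration": 120},
    {"gateway": "gw2", "answered": False, "duration": 0},
]
```

A sudden drop of the ASR for one gateway or carrier, while the others stay stable, points at a problem on that route rather than in the platform as a whole.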
20. Monitoring the rest
• SIP connectivity to partners
• Function tests (emergency calls, features)
• ENUM
• User Location
• RTP Proxies
• STUN
• iptables Connection Tracking
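For the last item, a connection tracking check might look like the following Python sketch. It compares the current number of tracked connections with the table limit, read from the `nf_conntrack_count` and `nf_conntrack_max` files under `/proc/sys/net/netfilter/`. The thresholds and function names are assumptions, and this is not the check used in the talk.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # usual plugin return codes

COUNT_PATH = "/proc/sys/net/netfilter/nf_conntrack_count"
MAX_PATH = "/proc/sys/net/netfilter/nf_conntrack_max"


def read_int(path):
    """Read a single integer from a /proc style file."""
    with open(path) as fh:
        return int(fh.read().strip())


def evaluate(count, maximum, warn=80.0, crit=90.0):
    """Return (state, message) for the conntrack table fill level."""
    percent = 100.0 * count / maximum
    if percent >= crit:
        state = CRITICAL
    elif percent >= warn:
        state = WARNING
    else:
        state = OK
    return state, f"conntrack {percent:.1f}% used ({count}/{maximum})"


def check_conntrack(count_path=COUNT_PATH, max_path=MAX_PATH):
    """Read both values from the running kernel and evaluate them."""
    return evaluate(read_int(count_path), read_int(max_path))
```

A full conntrack table makes the kernel drop new connections, which on a VoIP system shows up as failed registrations or calls, so warning well before the limit is useful.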
21. That’s it
Downloads available at: http://sipg.at/osmc2014
And: we are hiring, too!
http://www.sipgate.de/jobs/