Monitoring VoIP Systems 
Sebastian Damm 
damm@sipgate.de
Who we are, what we do 
• Düsseldorf-based VoIP provider (since 2004)
• active in Germany and UK 
• Private and Business customers 
• VoIP and Mobile products 
• some 100k users 
• almost 100 million minutes each month
VoIP systems monitored 
• Asterisk (~100 servers) 
• Kamailio (12 servers) 
• Yate (12 servers) 
• RTP Proxy (12 servers) 
• ASR (http://en.wikipedia.org/wiki/Answer-seizure_ratio)
Our Monitoring systems 
• 2 Icinga servers 
• almost 1k hosts 
• more than 5k services 
• Cacti 
• Observium for network monitoring
Monitoring SIP 
• simple Perl script 
• UDP capable (TCP and TLS coming soon) 
• resolves SRV DNS records, checks all targets 
• Watch the response code! 
• different systems answer differently
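The check described above can be sketched roughly as follows. This is a Python illustration, not the Perl script from the talk: it sends a single SIP OPTIONS request over UDP and compares the response code against a configurable set, since different systems answer differently. The function names, the `monitor` user part and the default `expected` codes are assumptions made for this sketch; SRV resolution is left out, so the caller would loop over the resolved targets.

```python
import socket
import uuid

# Nagios/Icinga plugin exit codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def build_options(target_host, target_port, local_host, local_port):
    """Build a minimal SIP OPTIONS request as bytes."""
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]  # RFC 3261 branch magic cookie
    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:monitor@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_host}:{target_port}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        f"Contact: <sip:monitor@{local_host}:{local_port}>",
        "Content-Length: 0",
        "", "",  # empty line terminates the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status(response):
    """Return the numeric status code of a SIP response, or None."""
    first_line = response.decode("ascii", "replace").split("\r\n", 1)[0]
    parts = first_line.split(" ", 2)
    if len(parts) >= 2 and parts[0] == "SIP/2.0" and parts[1].isdigit():
        return int(parts[1])
    return None


def check_sip(host, port=5060, expected=(200,), timeout=2.0):
    """Send one OPTIONS request over UDP and map the reply to a plugin state."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.connect((host, port))
            local_host, local_port = sock.getsockname()
            sock.send(build_options(host, port, local_host, local_port))
            code = parse_status(sock.recv(4096))
    except OSError:  # timeout, unreachable port, resolution failure
        return CRITICAL, "no SIP response"
    if code in expected:
        return OK, f"SIP {code}"
    return WARNING, f"unexpected SIP response {code}"
```

A target that legitimately answers OPTIONS with something other than 200 would be checked with a matching `expected` tuple, for example `check_sip(host, expected=(200, 404))`.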
Monitoring Asterisk 
• Base monitoring (disk, memory, load) 
• SIP 
• Remote Manager 
• Asterisk Version, Config version, G729 status, Channels, Uptime
• Open Files (SNMP extension) 
• IPtables status (SNMP extension)
check_manager.pl 
• first we called it separately for each service 
• high load on the monitoring system 
• now: one script fills all services 
• only the MANAGER service is active, all others are passive (with fallback command)
• can be called as active check for each service
check_manager.pl
    # Submit one passive check result through the Icinga external command file.
    sub push_passive {
        my ($service, $state, $msg) = @_;
        my $timestamp = time;
        my $cmd;
        eval {
            open $cmd, ">>", $cmdfile or die $!;
        };
        if ($@) {
            print "Could not open Command file!\n" if (defined($opts_verbose));
            return;
        }
        # Format: [time] PROCESS_SERVICE_CHECK_RESULT;host;service;return code;output
        my $cmdmsg = sprintf("[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n",
            $timestamp, $opts_host, $service, $ERRORS{$state}, $msg);
        print $cmdmsg if (defined($opts_verbose));
        print $cmd $cmdmsg;
        close $cmd;
    }
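The "one script fills all services" idea can be sketched as follows. This is a Python illustration rather than the Perl of check_manager.pl: the results of a single manager query are fanned out as one passive result per service. The service name AST_CHANNELS, the host name and the messages are hypothetical; the command-line format is the one used by push_passive above. The real command file is usually a named pipe, which this sketch simply treats as an appendable path.

```python
import time

# Plugin states as used by Nagios/Icinga
ERRORS = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def format_passive(host, service, state, msg, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    return (f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{ERRORS[state]};{msg}\n")


def push_all(cmdfile, host, results, now=None):
    """Append one passive result per service.

    results maps service description -> (state, message).
    Returns the lines that were written.
    """
    lines = [format_passive(host, service, state, msg, now)
             for service, (state, msg) in sorted(results.items())]
    with open(cmdfile, "a") as cmd:
        cmd.writelines(lines)
    return lines
```

With this shape, the active MANAGER check would collect all values once and call `push_all` with a dictionary such as `{"AST_UPTIME": ("OK", "up 12 days")}`, so the monitoring server runs one plugin per host instead of one per service.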
Integrating into Icinga
• Service Definition
Active Check:
    define service {
        service_description     MANAGER
        hostgroup_name          sipgw
        use                     local-service
        check_command           check_manager
    }
Passive Check:
    define service {
        service_description     AST_UPTIME
        hostgroup_name          sipgw
        use                     local-service
        check_command           check_manager_active!uptime
        is_volatile             1
        active_checks_enabled   0
        passive_checks_enabled  1
        check_freshness         1
        max_check_attempts      1
        freshness_threshold     600
    }
Integrating into Icinga
• Command Definition
Active Check:
    define command {
        command_name    check_manager
        command_line    $USER32$/check_manager \
                        -H $HOSTNAME$ -I $HOSTADDRESS$ \
                        -u $USER7$ -p $USER8$ \
                        -m 127.0.0.1:11211 \
                        -s SIPCHAN --g729
    }
Passive Check (Fallback):
    define command {
        command_name    check_manager_active
        command_line    $USER32$/check_manager \
                        -H $HOSTNAME$ -I $HOSTADDRESS$ \
                        -u $USER7$ -p $USER8$ \
                        -m 127.0.0.1:11211 \
                        -a $ARG1$
    }
Check Open Files 
• consists of two scripts on the monitored system 
• one script run by cron every minute 
• other script triggered by SNMPd to read those files
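The two-script split can be sketched as follows, in Python for illustration. It assumes the count comes from a `/proc/<pid>/fd` style directory and that the cron script hands the value over in a small state file; the paths and function names are hypothetical. The SNMPd side only reads the file, so an SNMP query never has to walk `/proc` itself.

```python
import os


def count_open_files(fd_dir):
    """Count entries in a /proc/<pid>/fd style directory."""
    return len(os.listdir(fd_dir))


def write_count(fd_dir, state_file):
    """Cron side: store the current count in state_file."""
    tmp = state_file + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(f"{count_open_files(fd_dir)}\n")
    # Atomic rename, so the reader never sees a half-written file
    os.replace(tmp, state_file)


def read_count(state_file):
    """SNMPd side: read the stored count back as an integer."""
    with open(state_file) as fh:
        return int(fh.read().strip())
```

The reader would typically be hooked into SNMPd through its script extension mechanism and print the value returned by `read_count`.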
Monitoring Kamailio 
• SIP 
• some variables through XMLRPC calls 
• Memory 
• TCP Connections 
• Version
XMLRPC in Kamailio
1. Load the module
    loadmodule "xmlrpc.so"
    modparam("xmlrpc", "route", "XMLRPC")
2. Handle XMLRPC calls
    route["XMLRPC"] {
        if(src_ip == 1.2.3.4) {        # only answer to Monitoring
            set_reply_no_connect();    # optional
            dispatch_rpc();
        } else {
            xmlrpc_reply("403", "Forbidden");
        }
    }
Querying Kamailio
    use XMLRPC::Lite;

    # Call one RPC method on Kamailio; exit UNKNOWN and print the fault on error.
    sub call_rpc {
        my ($method, @rpc_params) = @_;
        my (%r, $k);
        my ($rpc_call) = XMLRPC::Lite
            -> proxy("http://$opts_host:$opts_port")
            -> call($method, @rpc_params);
        my $res = $rpc_call->result;
        if (!defined $res) {
            print "Error querying Kamailio\n";
            $res = $rpc_call->fault;
            %r = %{$res};
            foreach $k (sort keys %r) {
                print("\t$k: $r{$k}\n");
            }
            exit $ERRORS{'UNKNOWN'};
        } else {
            return($res);
        }
    }
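As an illustration of how a value returned by call_rpc might be turned into a check result, here is a Python sketch of a memory check. It assumes the RPC method `core.shmmem` returns a struct with `total` and `real_used` fields; the method name, the field names and the thresholds are assumptions to verify against the Kamailio version in use, and Python's `xmlrpc.client` stands in for XMLRPC::Lite.

```python
import xmlrpc.client

# Nagios/Icinga plugin exit codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_shm(stats, warn_pct=80.0, crit_pct=90.0):
    """Map shared-memory statistics to a plugin state and message."""
    try:
        used_pct = 100.0 * stats["real_used"] / stats["total"]
    except (KeyError, TypeError, ZeroDivisionError):
        return UNKNOWN, "unexpected reply format"
    msg = f"shared memory {used_pct:.1f}% used"
    if used_pct >= crit_pct:
        return CRITICAL, msg
    if used_pct >= warn_pct:
        return WARNING, msg
    return OK, msg


def check_kamailio_memory(host, port, method="core.shmmem"):
    """Query Kamailio over XMLRPC and evaluate the reply."""
    try:
        proxy = xmlrpc.client.ServerProxy(f"http://{host}:{port}")
        # getattr is needed because the method name contains a dot
        stats = getattr(proxy, method)()
    except (xmlrpc.client.Error, OSError) as exc:
        return UNKNOWN, f"error querying Kamailio: {exc}"
    return evaluate_shm(stats)
```

Keeping the evaluation separate from the transport means the thresholds can be exercised without a running Kamailio.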
Monitoring Yate 
• SIP 
• everything else through SNMP 
• SIGTRAN links (beware: element order can change!) 
• Uptime 
• Version 
• Channels 
• …
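Because the order of elements in the SNMP table can change, a check is safer if it finds its row by name instead of by a fixed index. A minimal Python sketch of that lookup, assuming the name column and the status column have already been walked into dictionaries keyed by row index; the link names and the "up" value are hypothetical, and no actual Yate OIDs are implied.

```python
def values_by_name(names, values):
    """Join two walked SNMP table columns on their row index.

    names  maps row index -> element name (for example a link name)
    values maps row index -> value for that row
    """
    return {name: values[idx] for idx, name in names.items() if idx in values}


def check_link(names, values, link, up_value="up"):
    """Return (plugin state, message) for one named link, whatever its row."""
    table = values_by_name(names, values)
    if link not in table:
        return 3, f"{link}: not found in SNMP table"  # UNKNOWN
    if table[link] == up_value:
        return 0, f"{link}: {table[link]}"            # OK
    return 2, f"{link}: {table[link]}"                # CRITICAL
```

If the rows swap places between two polls, `check_link` still reports the state of the link it was asked about.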
Monitoring ASR 
• answered calls as a percentage of total calls
• additionally: length of answered calls 
• per Gateway, Carrier, Destination, Product 
• in Yate: configurable and readable via SNMP
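The calculation itself is simple; the following Python sketch computes the ASR and the average length of answered calls per group. The record layout (a dict with `answered`, `duration` and a grouping field such as `carrier`) is an assumption made for this example and is not how Yate exposes the values.

```python
from collections import defaultdict


def asr(answered, total):
    """Answer-seizure ratio in percent; None when there were no calls."""
    if total == 0:
        return None
    return 100.0 * answered / total


def asr_by_key(calls, key):
    """Group call records by one field; return ASR and mean answered duration."""
    counts = defaultdict(lambda: {"total": 0, "answered": 0, "seconds": 0})
    for call in calls:
        bucket = counts[call[key]]
        bucket["total"] += 1
        if call["answered"]:
            bucket["answered"] += 1
            bucket["seconds"] += call["duration"]
    return {
        name: {
            "asr": asr(c["answered"], c["total"]),
            "avg_duration": c["seconds"] / c["answered"] if c["answered"] else None,
        }
        for name, c in counts.items()
    }
```

The same function covers the per-gateway, per-carrier, per-destination and per-product views by passing a different `key`.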
Monitoring the rest 
• SIP connectivity to partners 
• Function tests (emergency calls, features) 
• ENUM 
• User Location 
• RTP Proxies 
• STUN 
• iptables Connection Tracking
That’s it 
Downloads available at: http://sipg.at/osmc2014 
And: We hire, too! 
http://www.sipgate.de/jobs/
Thank you!
