HELSINKI UNIVERSITY OF TECHNOLOGY
Department of Computer Science
Laboratory of Telecommunication Software and Multimedia
SIP over Client Initiated Connections
Master’s Thesis submitted in partial fulﬁllment of the requirements for the degree
of Master of Science in Technology.
Otaniemi, May 1, 2007
Supervisor: Professor Antti Ylä-Jääski
Instructor: Sasu Tarkoma, Ph.D.
HELSINKI UNIVERSITY OF TECHNOLOGY
ABSTRACT OF THE MASTER'S THESIS
Author: Yang Yang
Name of the Thesis: SIP over Client Initiated Connections
Date: May 1, 2007 Number of pages: 46 + 9
Department: Department of Computer Science
Professorship: T-110 Telecommunications Software and Multimedia
Supervisor: Prof. Antti Ylä-Jääski
Instructor: Sasu Tarkoma, Ph.D.
SIP outbound, an extension of SIP, enables client-initiated connections in the SIP
signaling system. This feature is desirable when a NAT or firewall sits between
the public and the private side. In such a situation, connections are only allowed from
the private side to the public side. SIP outbound proposes a mechanism that keeps
the client-initiated connections between a UA and its proxies open and later reuses the
same connections to push data from the proxy side to the UA. This mechanism ensures
successful traversal of the NAT or firewall.
In this thesis we implemented the SIP outbound protocol as an extension of SIP,
integrated it into the WeSAHMI experimental infrastructure, and then evaluated the
performance of the system as a whole.
Keywords: SIP, SIP outbound, STUN keepalive, backoﬀ mechanism, ﬂow token, NAT.
I want to thank my supervisor, Professor Antti Ylä-Jääski, and my instructor,
Sasu Tarkoma, Ph.D., for giving me the opportunity to participate in the WeSAHMI
project and for their guidance in completing my thesis.
Many thanks go to Jani Heikkinen and Sergio Lembo for their constructive ideas
and practical help.
My gratitude also goes to my parents, my husband and my friends for their mental
support.
Otaniemi, May 1, 2007
AOR Address of Record, a well-known address for a user. In SIP, it is a
ALG Application Layer Gateway
API Application Programming Interface
B2BUA Back to Back User Agent
DNS Domain Name System, a global de-centralized directory that trans-
lates domain names into IP addresses.
DNSSRV Domain Name System Service Record Working Group, an IETF
working group that speciﬁed a DNS extension enabling ﬁnding of
an IP address of a service based on a protocol and domain.
DHCP Dynamic Host Configuration Protocol, an Internet protocol for
automating the configuration of devices using TCP/IP.
DTLS Datagram Transport Layer Security
EP Edge Proxy, any proxy that is located topologically between the
registering User Agent and the Authoritative Proxy.
HTTP Hypertext Transfer Protocol, a web browsing protocol.
HMAC Hash-based Message Authentication Code, a type of message
authentication code calculated using a cryptographic hash function in
combination with a secret key.
ICE Interactive Connectivity Establishment
IETF Internet Engineering Task Force
IP Internet Protocol
NAT Network Address Translation, enables a local area network to use one
set of IP addresses for internal traffic and a second set of addresses
for external traffic.
NTP Network Time Protocol, a protocol for synchronizing the clocks of
computer systems over data networks.
SDP Session Description Protocol: A format for describing the types of
media to use in a session.
SHA-1 Secure Hash Algorithm Version 1.0, a standard for computing a
condensed representation of data.
SIPCOMP Signaling compression: A framework used to compress signaling
message using arbitrary compression algorithms.
SIP Session Initiation Protocol
SIP URI A uniform resource identiﬁer with the scheme ”sip:”. SIP systems
use the domain component along with DNS to determine where to
send SIP messages.
SMTP Simple Mail Transfer Protocol, a protocol for email.
SSL Secure Sockets Layer, a predecessor of TLS.
STUN Simple Traversal Underneath Network Address Translation
TCP Transmission Control Protocol, an Internet protocol that estab-
lishes reliable connections over IP.
TLS Transport Layer Security
UAC User Agent Client
UDP User Datagram Protocol, a connectionless Internet protocol run-
ning on top of IP.
UMTS Universal Mobile Telecommunications System
URL Uniform Resource Locators, names used to represent addresses or
locations in the Internet.
UUID Universally Unique Identiﬁer.
WeSAHMI Web Services in Ad-Hoc and Mobile Infrastructure.
List of Tables
4.1 Updated binding behaviour in SIP outbound
5.1 Registration behavior
5.2 STUN attributes supported by the implementation
6.1 REGISTER request proxied to the primary EP
6.2 REGISTER request proxied to the secondary EP
6.3 200 OK response received by the UA from the primary EP
6.4 200 OK response received by the UA from the secondary EP
6.5 SUBSCRIBE request sent by the UA to its Notifier
6.6 NOTIFY request sent from the Notifier to the UA
6.7 STUN Binding request
A.1 Modification to eXosip and osip libraries
The growth of Internet usage has led to the assimilation of telephony services
into Internet Protocol (IP) technology, which in turn has stimulated the development
of signaling protocols to set up and tear down multimedia sessions. Different
communities have proposed solutions in accordance with their own priorities and
interests. The Session Initiation Protocol (SIP), born in a computer science laboratory
less than a decade ago, satisfies the growing thirst for a new generation of IP-based
services.
SIP is a signaling protocol used for establishing sessions in an IP network. It
was developed by the IETF as part of the Internet Multimedia Conferencing
Architecture. It incorporates elements of two well-known protocols: the Web's
Hypertext Transfer Protocol (HTTP) and the Simple Mail Transfer Protocol (SMTP)
used for e-mail. Its first major use has been signaling in Internet telephony, but
SIP's utility does not end with telephony: it is already employed as a basic
technology for instant messaging and presence.
SIP resolves two significant issues in establishing these real-time communication
sessions. First, it helps the prospective participants locate each other on the
Internet (rendezvous). Second, it allows those participants to negotiate how
they are willing to communicate.
Nowadays, more and more carriers and providers offer SIP-based services such as
local and long-distance telephony, presence and instant messaging, voice messaging,
push-to-talk, rich media conferencing, and so on. All these media communications
rely on SIP as a signaling protocol, since SIP allows proxy servers to initiate
TCP connections and send asynchronous UDP datagrams to User Agents (UAs).
SIP will be used as the primary signaling technology in the next generation mobile
networks.
CHAPTER 1. INTRODUCTION 2
However, the presence of Network Address Translators (NATs) and firewalls
segments the network, so that SIP servers, such as registrars or proxies,
cannot initiate connections to UAs. A firewall between the UA and the proxy
servers will block connections to the UA. Similarly, NATs only allow connections
from the private address side to the public side.
The effect of NATs on SIP has been studied in recent years, and several
extensions to the original SIP specification have been proposed that allow a UA to
receive incoming signaling requests from the server side.
1.1 Research problem
SIP enables end systems and proxy servers to establish multimedia sessions
with each other. However, as discussed above, only connections initiated by the UA
towards the server can be established; connections in the reverse direction,
initiated by the server, are not possible. The reason is that a SIP endpoint
behind a NAT sends messages carrying its private address and unmapped port,
which are useless to endpoints that are not behind the same NAT. Moreover, most
NATs and firewalls block incoming TCP connections and UDP traffic from the public
side. This drawback of NATs impedes the end-to-end connectivity of SIP: without
additional SIP extensions, a SIP endpoint will not work in such a situation.
The above problem can be partially resolved by deploying an Application Layer
Gateway (ALG) inside the NAT. A SIP-aware ALG can inspect the messages
and map the internal addresses and ports to external addresses and ports. However,
this requires the ALG to understand every new use of SIP. Since SIP is a
framework protocol rather than a single application, this method cannot be a cure-all.
An improved version of the same idea is to put a pair of UAs back to back across
the NAT/firewall. This pair of UAs is known as a Back-to-Back User Agent (B2BUA)
or session border controller. The B2BUA acts as a UA server on one side and
as a UA client on the other side, terminating and re-originating signaling and media
on both sides. However, a B2BUA has to learn any new protocol feature before
allowing it to pass.
To make NAT traversal easier for endpoints, the Simple Traversal Underneath
NATs (STUN) protocol was proposed. Through STUN, a SIP UA can discover the
mapping of its IP address and port from the private side to the public side of a
NAT device. However, the addresses obtained may not be usable by all peers, so
STUN alone cannot solve the NAT traversal problem. An extension of STUN, known as
Traversal Using Relays around NAT (TURN), allows a SIP client behind a
NAT/firewall to receive incoming data over TCP or UDP connections. However, it
only supports the connection of a client behind a NAT to a single peer, and the
cost of providing a TURN relay server is so high that TURN is desirable only as a
last resort. The Interactive Connectivity Establishment (ICE) methodology can be
used to discover the optimal means of connectivity using various techniques, such
as STUN and TURN.
In the worst case, a SIP client may find itself behind a NAT/firewall that blocks
all incoming traffic except packets belonging to a TCP stream the client itself
opened. The SIP outbound extension was proposed for this case: once the UA has
successfully established a connection to the Edge Proxy (EP) by sending a REGISTER
request, that connection is reused. Since the server cannot reach the UA, it is the
UA's duty to keep the connection active. When a UA initiates a connection to the
proxy, the proxy can later reuse this flow to push SIP messages to the UA, so the UA
has to ensure that the flow is always active.
This thesis presents an implementation of the SIP outbound extension based on the
Internet draft, together with experiments and an evaluation of the new features
that examine the complexity and usability of SIP outbound. Some updates to the
specification are proposed to meet implementation needs.
1.2 Brief motivation
This thesis was carried out in the WeSAHMI project. The WeSAHMI project implements
an experimental infrastructure for interactive wireless applications that can
operate in an ad-hoc networking environment. In addition, a demo application
suite for an airport environment is to be implemented. The WeSAHMI security
architecture employs SIP as the session-level communication protocol in IP networks.
After the upper layer has identified all entities, the client system starts a secure
session with a gateway. This thesis implemented the SIP outbound protocol as an
extension of SIP. With the extended features specified in the draft, the client can
initiate a secure channel that opens ports for the client in the gateway (called the
EP in the following chapters). Once the secure channel has been established, it is
kept active by the client, so that the gateway can later push SIP messages to the
client.
1.3 Structure of the thesis
Chapter 1 introduces the general background and presents the research
problem. Chapter 2 addresses the effects of combining NATs and firewalls with SIP
signaling and gives background information on the WeSAHMI project. Chapter 3
introduces the system model of the WeSAHMI project and shows how SIP outbound fits
into the overall WeSAHMI architecture. Chapter 4 presents the SIP outbound protocol
in more detail and discusses its challenges from an implementation point of view.
Chapter 5 reviews our SIP outbound implementation and how we integrated the
STUN protocol into SIP. Chapter 6 tests the implementation in a simplified
system against the features required by SIP outbound. Chapter 7 discusses the
performance of the system extended with SIP outbound and other possibilities for
the flow token algorithms. Chapter 8 concludes the thesis and presents future work.
Originally, NAT devices were used to connect an isolated address realm to an
external realm with globally unique registered addresses, which effectively extends
the address space. However, SIP packets leave a NATed client with private IP
addresses embedded in the message headers (Via and Contact headers) and SDP bodies,
and a NAT device is not aware of them. So when the packets reach their destination,
they are processed and the responses are sent to completely useless source addresses.
The effect of NATs and firewalls on signaling systems has become an active research
topic. Several solutions have been proposed to allow SIP to traverse NATs and
firewalls effectively. These include using TCP for SIP instead of UDP,
employing a keepalive program to maintain NAT bindings, or using STUN/TURN.
The key to successful NAT/firewall traversal is that the remote host knows which
global port and IP address have been assigned by the NAT for a given flow. ICE,
an extension for SIP, relies on two protocols being developed in the
IETF, STUN and TURN. STUN allows a host to learn the global IP address and
UDP port assigned by its outermost NAT box. The address can subsequently be
conveyed by SIP to allow direct UDP connectivity between hosts. TURN allows a
host to select a globally addressable TCP relay, which can subsequently be used to
bridge a TCP connection between two NATed hosts. Unlike STUN, TURN does
not allow direct connectivity between NATed hosts.
Unlike the ICE extension, SIP outbound inserts an extra network entity, the
edge proxy (EP), to traverse NATs and firewalls with a client-initiated connection
mechanism. The SIP client initiates secured connections to at least two EPs by
sending REGISTER requests. These secured connections are maintained by the client
and the EPs so that the EPs can later push data to the client through them.
CHAPTER 2. BACKGROUND 6
This feature requires the EP to work not only as a SIP proxy but also as a keepalive
server. The EP also has to be able to distinguish the connections initiated
by different clients, which it does by assigning a different flow token to each
connection. Communication with untrusted external domains is delegated to the EPs,
since clients are invisible to outside domains. A fault tolerance mechanism is also
considered in the draft, which proposes multiple registrations and deployment on
multiple physical hosts.
As part of the security model of the WeSAHMI architecture, this thesis presents
the implementation of SIP outbound as an extension of SIP. The WeSAHMI project
implemented an application for an airport environment. In the airport scenario, a
crucial matter is the delivery of real-time information updates to the passengers and
employees of the airport. Such updates include flight delays or cancellations,
changes of departure gates, and the like. The delay caused by the delivery process
is also crucial: the airline information system should push information about the
updated situation to passengers in time.
In the WeSAHMI project, two principal communication services are required
between the Finnair application server and the passengers: pull and push services.
Both of these services are carried out through SIP.
SIP enables clients to register to certain services. Once registered, clients can
pull information from the content server, and the server can send asynchronous
notifications to the client. As shown on the left side of Figure 2.1, the client sends
a SUBSCRIBE message, which is acknowledged by the notifier with a NOTIFY
message. This is the push service.
The pull service is similar. The client has to know what content to pull from the
notifier. The notifier can send descriptions of available content by using the push
service. Once the client knows what services are available, it can decide what
content to pull from the notifier. As shown on the right side of Figure 2.1, the
notifier first sends a NOTIFY message that carries a description of the available
services. Later, the client sends a SUBSCRIBE request to query the service, which is
acknowledged by a NOTIFY with the real data of the service.
The security architecture of the WeSAHMI system is used to establish authentication
and authorization between clients and the WeSAHMI server. SIP outbound proposes
an additional networking element (the Edge Proxy) that provides transport and security
mechanisms. The EP is inserted topologically between the UA and the notifier,
so the pull service procedure above has to be adjusted as shown in Figure 2.2. The
push service is similar, so it is not illustrated in the figure. All incoming and
outgoing messages have to be forwarded through the EP.
Figure 2.1: Data push and pull service
Figure 2.2: Data pull service with an edge proxy
3.1 WeSAHMI architecture
In the WeSAHMI project, an experimental infrastructure is specified for interactive
wireless applications operating in a mobile ad-hoc networking environment. A
practical application is deployed for an airport environment. The system provides a
mobile check-in service for passengers in the airport. The user of the system can
perform the necessary actions with her or his mobile device, such as check-in,
registration for a flight, baggage drop and passing the security gate.
To support the above functions, the infrastructure must provide identification
of mobile users and tracking of their presence; delivery of content, notifications,
and status updates to mobile users in a server-initiated fashion; and management
and updating of the state of both clients and servers in real time.
The WeSAHMI architecture consists of the following components:
WeSAHMI server: a central role as relaying data from the external model to
client browser: an X-smile browser on a client node,
security architecture is used to establish secure channel between clients and
WWW server: an Apache WWW server to host user interface components
and relay client input to the WeSAHMI server.
CHAPTER 3. SYSTEM MODEL 9
3.2 WeSAHMI security architecture
Our implementation resides in the security architecture. The security architecture is
designed to push data from the trusted WeSAHMI environment to the untrusted wireless
network environment. An extra network entity (the edge proxy) is added to the
architecture to ensure secure push delivery of data. The edge proxy is equipped with
transport and security mechanisms. The edge proxy is a logical entity. Physically,
we can deploy multiple hosts to decrease the possibility of a lost notification caused
by the failure of a single element.
Other elements included in the architecture are the mobile hosts and the notification
service. The mobile host, working as a SIP UA, can initiate a connection to the
EP by sending a REGISTER request to the registrar. The registrar will then
challenge the mobile host for authentication. After a successful registration,
indicated by a 200 OK response, the mobile host sends STUN Binding requests over
the same flow that is used for SIP messages in order to keep the flow active. This
established and ongoing flow will later be used for secure push. The notification
service also works like a SIP UA. It fetches the contact address of the mobile host
by querying the registrar. The NOTIFY request is forwarded to the EP, and the EP
then forwards it to the mobile host through the existing connection initiated by the
mobile host.
3.3 SIP outbound
SIP is used to provide pull and push services to the WeSAHMI system. For example,
a client can register to certain services, and then pull data from the service provider
or receive asynchronous notifications from the service provider. However, because of
the presence of NATs and firewalls, connections from the server side to the client
side become impossible. That is, the service provider cannot deliver asynchronous
data to clients, which is an expected feature of the WeSAHMI system. To solve this
problem, we have to add new features to basic SIP according to one of the SIP
extensions, namely SIP outbound.
We insert an extra entity, the EP, into the security architecture. Any client
that wants to subscribe to a service must first establish a direct flow
to its EPs by sending REGISTER requests. A local daemon on the client takes
charge of the registration and also handles the SUBSCRIBE/NOTIFY messages.
After successful registration, the daemon may send a SUBSCRIBE message to a
content server, forwarded through one of its outbound EPs, which the content
server acknowledges with a NOTIFY. Conversely, when a message from the
content server arrives, the daemon delivers the message to the client
application, such as the browser. Figure 3.1 shows where we deploy the SIP outbound
component in the WeSAHMI architecture.
Figure 3.1: Deployment of SIP outbound in WeSAHMI security architecture
The client daemon uses a keepalive mechanism to keep the flows to its outbound
EPs active at all times. So when the content server wants to push messages to
clients, it can always reach the client from the public side through a secured channel.
SIP client-initiated outbound
This chapter briefly describes the SIP outbound extension. We rearranged the structure
of the SIP outbound draft to make it more convenient for implementation.
4.1 Overview of the mechanism
SIP outbound is specified for environments in which a registrar, or
more generally a proxy server, cannot initiate direct connections to a UA behind a
NAT box or firewall. The key idea of SIP outbound is that when a UA initiates a
connection to a proxy server, the proxy server can later reuse the same connection
to forward requests to the UA. The UA must keep the connection active
by using a keepalive mechanism.
To achieve high reliability of connections, the UA can form multiple flows to the
proxy server (known as the EP in SIP outbound) by registering multiple times over
different connections for the same SIP AOR. Each REGISTER request includes an
instance-id (used to identify the UA uniquely) and a reg-id label (to distinguish
different flows). Each flow is kept active by using the STUN keepalive mechanism
over UDP or by using TCP keepalive.
In the following sections, we describe in more detail the behavior of the four
networking entities (UA, EP, registrar and authoritative proxy) that support the
SIP outbound features.
CHAPTER 4. SIP CLIENT-INITIATED OUTBOUND 12
4.2 User agent behavior
4.2.1 Flow establishment
At configuration time, UAs obtain a set of SIP URIs representing the default
outbound proxy set. The configuration mechanism is outside the scope of the draft;
however, it is a key point for the implementation. For more implementation details,
please refer to chapters 5 and 7. The number of URIs in this set should be at least
two and no more than four. For each outbound proxy URI in the set, the UA must
send a REGISTER request to form a direct flow to the EP. The EP forwards the
request to the registrar, and from then on everything works as in normal SIP: the
registrar may challenge the UA for authentication; the UA sends its credentials and
waits for the 200 OK response from the registrar, which indicates a successful
registration.
The UAC is required to support the Path header mechanism and indicates this by
including the 'path' option-tag in a Supported header field value in its REGISTER
requests. A successful registration is indicated by the presence of the 'outbound'
option-tag in a Supported header field value in the response, which shows that the
registrar and all EPs traversed by the request support the SIP outbound extension.
A failed registration is indicated when the UA receives a 503 (Service Unavailable)
response with a Retry-After header field. The UA then needs to recover
the flow, employing a backoff mechanism to decide when to re-register. Details
about flow recovery can be found in the discussion of backoff in section 4.2.2.
Instance ID and Register ID
SIP outbound introduces two new parameters for the Contact header field: the
Instance Identifier (instance-id) and the Registration Identifier (reg-id). In a
signaling system supporting SIP outbound, each UA is identified uniquely by a
persistent instance-id URN. The instance-id must persist across reboots and power
cycles of the UA, and must not change as the device moves from one network to
another. The UA uses a UUID URN as its instance-id and attaches it to the
Contact header field as a "+sip.instance" media feature tag.
A UUID URN does not require a central registration process, so no centralized
authority is needed to administer it. In our mobile wireless environment, this is
a favorable feature because it minimizes additional entities. Furthermore, a UUID
has a fixed size of 128 bits, which is reasonably small compared to the alternatives.
Because a new UUID can be generated without a registration process, UUIDs are
among the URNs with the lowest minting cost.
The other new Contact header field parameter is reg-id, added by the UA. The
UA uses the reg-id to distinguish different flows, since it can register multiple
times over different connections for the same SIP AOR. The reg-id does not have to be
incremented sequentially, but it has to be unique for each flow. When the UA
power cycles or reboots, the reg-id has to remain the same as that of the previous
flow so that the registrar can replace the older registration.
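To illustrate how the two Contact parameters fit together, the following Python
sketch builds a Contact header field carrying a reg-id and a "+sip.instance"
feature tag. The function name build_contact, the contact URIs and the fixed UUID
are assumptions made for this example and are not taken from our implementation;
a real UA would generate its UUID once and store it persistently.

```python
import uuid


def build_contact(contact_uri: str, instance_id: uuid.UUID, reg_id: int) -> str:
    """Return a Contact header line with reg-id and +sip.instance parameters.

    Hypothetical helper, for illustration only.
    """
    # The instance-id is a UUID URN enclosed in angle brackets and double quotes.
    instance = f'"<urn:uuid:{instance_id}>"'
    return f"Contact: <{contact_uri}>;reg-id={reg_id};+sip.instance={instance}"


# Assumed example values: one UA registering two flows for the same AOR.
# The instance-id is identical on both flows; only the reg-id differs.
INSTANCE = uuid.UUID("00000000-0000-1000-8000-000a95a0e128")
flow1 = build_contact("sip:ua@192.0.2.10:5060", INSTANCE, reg_id=1)
flow2 = build_contact("sip:ua@192.0.2.10:5062", INSTANCE, reg_id=2)
print(flow1)
print(flow2)
```

Keeping the instance-id constant while varying the reg-id is what lets the registrar
recognise both registrations as flows of the same UA.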
4.2.2 Flow recovery
An ongoing flow may fail because of various network problems, so the UA should
be able to detect failures, for example by means of keepalive mechanisms. If
a flow fails, the UA uses the procedure described in section 4.2.1 to form a new flow
to replace the failed one. However, before recovering the flow, the UA should
wait for some time, as described in the following paragraph.
The UA employs a backoff mechanism to avoid an avalanche restart on the EPs. That
is, the UA needs to wait for a certain amount of time before trying to establish a
new flow to replace the failed one.
The following formula is used to calculate the waiting time in seconds:
$$TIME_{wait} = \min\left(TIME_{max},\; TIME_{base} \times 2^{failures}\right)$$
$TIME_{max}$: the default value is 1800 seconds.
$failures$: the number of consecutive registration failures.
$TIME_{base}$: set to 30 seconds if all of the flows to every URI in the outbound
proxy set have failed; otherwise, if at least one of the flows has not failed, it
is set to 90 seconds.
A flow is considered successful if the outbound registration succeeded and keepalives
have not expired for min-regtime seconds (default 120 seconds) after a registration.
The time to re-register, known as the delay time, is computed by selecting a uniform
random time between 50 and 100 percent of $TIME_{wait}$. The UA must wait
for the delay time before re-registering. The default flow registration
backoff time table can be found in Appendix A of the draft.
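The backoff calculation above can be sketched in a few lines of Python. This is a
minimal illustration using the default values quoted in the text; the function and
constant names are assumptions of this example, not identifiers from our
implementation.

```python
import random

TIME_MAX = 1800        # upper bound on the waiting time, in seconds
BASE_ALL_FAILED = 30   # base time when every flow in the proxy set has failed
BASE_SOME_ALIVE = 90   # base time when at least one flow has not failed


def wait_time(failures: int, all_flows_failed: bool) -> int:
    """Upper bound of the waiting time after `failures` consecutive failures."""
    base = BASE_ALL_FAILED if all_flows_failed else BASE_SOME_ALIVE
    return min(TIME_MAX, base * (2 ** failures))


def delay_time(failures: int, all_flows_failed: bool, rng=random) -> float:
    """Delay before re-registration: uniform in [50%, 100%] of the waiting time."""
    upper = wait_time(failures, all_flows_failed)
    return rng.uniform(0.5 * upper, upper)


# Print the waiting-time bounds for the first few consecutive failures.
for n in range(7):
    print(n, wait_time(n, True), wait_time(n, False))
```

The random factor spreads the re-registration attempts of many UAs over time, so
that an EP that has just restarted is not hit by all of its clients at once.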
CHAPTER 4. SIP CLIENT-INITIATED OUTBOUND 14
4.2.3 Keepalive mechanism
Two keepalive methods are proposed: STUN over UDP and TCP keepalive. For
SIP over UDP, a limited version of the STUN keepalive mechanism is employed.
The only STUN messages required by this usage are Binding Requests, Binding
Responses, and Error Responses.
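As a rough illustration of how small such a keepalive message is, the sketch below
builds a STUN Binding Request without attributes and checks whether a reply is the
matching success response. It assumes the 20-octet header layout of RFC 5389
(message type, length, magic cookie, 96-bit transaction ID); the STUN draft version
used in our implementation may differ in detail, and the function names are
hypothetical.

```python
import os
import struct
from typing import Optional

BINDING_REQUEST = 0x0001
BINDING_SUCCESS_RESPONSE = 0x0101
MAGIC_COOKIE = 0x2112A442  # fixed value from RFC 5389 (assumed header format)


def build_binding_request(transaction_id: Optional[bytes] = None) -> bytes:
    """Return a 20-octet Binding Request with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be 12 octets")
    # Type (16 bits), message length (16 bits, zero here), cookie (32 bits).
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id


def is_matching_success(response: bytes, transaction_id: bytes) -> bool:
    """True if `response` is a Binding success response for our transaction."""
    if len(response) < 20:
        return False
    msg_type, _length, cookie = struct.unpack("!HHI", response[:8])
    return (msg_type == BINDING_SUCCESS_RESPONSE
            and cookie == MAGIC_COOKIE
            and response[8:20] == transaction_id)


tid = os.urandom(12)
request = build_binding_request(tid)
print(len(request), request.hex())
```

In a UA, the request would be sent on the same UDP socket that carries the SIP
traffic, so that the keepalive refreshes exactly the NAT binding used by the flow.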
The UAC sends STUN messages over the same UDP flow that is used for sending SIP
messages. The server (EP or registrar) must also provide a limited version
of a STUN server listening on the same network interface and port as the SIP proxy.
The UA validates STUN keepalive support in two phases. In the first phase, the
UA inspects whether the URIs in its outbound proxy set contain the
'keep-stun' parameter. In most circumstances, this explicit indication should
be sufficient. However, misconfiguration may happen. If a node sends binary STUN
data to a proxy that does not support STUN, the node could be blacklisted for UDP
traffic. So a second phase of validation is needed, namely an explicit probe. The UA
can send an OPTIONS request to the next hop with the Max-Forwards header
field set to 0, and expects the next hop to respond with the 'sip-stun' option tag in
its Supported header field. If either of these two validation phases fails,
the UA must stop sending STUN messages.
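The two validation phases can be expressed as a simple decision function. The
sketch below is a simplification: it parses URI parameters and the Supported header
with plain string operations instead of a full SIP parser, and all function names
and example values are assumptions of this example.

```python
def uri_has_keep_stun(proxy_uri: str) -> bool:
    """Phase 1: does the outbound proxy URI carry the 'keep-stun' parameter?"""
    # Simplified parsing: URI parameters follow ';' and precede any '?' headers.
    params = proxy_uri.split("?", 1)[0].split(";")[1:]
    return any(p.split("=", 1)[0].strip().lower() == "keep-stun" for p in params)


def response_has_sip_stun(supported_header: str) -> bool:
    """Phase 2: does the OPTIONS response list 'sip-stun' as supported?"""
    tags = [tag.strip().lower() for tag in supported_header.split(",")]
    return "sip-stun" in tags


def may_send_stun(proxy_uri: str, supported_header: str) -> bool:
    """STUN keepalives are allowed only if both validation phases succeed."""
    return uri_has_keep_stun(proxy_uri) and response_has_sip_stun(supported_header)


# Assumed example inputs: a configured proxy URI and a Supported header value
# taken from the response to the OPTIONS probe.
print(may_send_stun("sip:ep1.example.com;keep-stun;lr", "path, outbound, sip-stun"))
print(may_send_stun("sip:ep2.example.com;lr", "path, outbound, sip-stun"))
```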
The UA can perform the explicit probe just after it establishes a direct flow to the
EP, as shown in Figure 4.1, or it can probe for STUN support after it sends a STUN
Binding Request and does not receive a STUN success response, as shown in Figure
4.2. The order of these two validation phases is implementation specific and is left
for the implementor to decide.
For SIP over TCP or SIP over TLS over TCP, TCP keepalive is sufficient to keep
the flow active. Some operating systems, such as Linux, support per-connection TCP
keepalive, which facilitates keepalive support.
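As an illustration of per-connection TCP keepalive, the following Python sketch
enables keepalive on a socket and, where the platform exposes them, sets the
Linux-style tuning options. The timing values and the helper name are assumptions
chosen for this example, not values from our implementation.

```python
import socket


def enable_tcp_keepalive(sock: socket.socket, idle: int = 30,
                         interval: int = 10, probes: int = 3) -> None:
    """Turn on TCP keepalive for one connection (hypothetical helper)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-connection tuning options; available on Linux, skipped where absent.
    tuning = (("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
              ("TCP_KEEPINTVL", interval),  # seconds between probes
              ("TCP_KEEPCNT", probes))      # failed probes before giving up
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    enable_tcp_keepalive(sock)
    keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print(keepalive_on)
```

The idle time should be chosen shorter than the binding timeout of the NAT or
firewall on the path, otherwise the binding may expire between probes.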
4.3 Edge proxy behavior
The Edge Proxy is located topologically between the UA and the authoritative proxy
(AP) and works as a stateless forwarding proxy. It receives SIP requests and then
forwards these requests to the next hop (a registrar, another EP, or a UA). If it
wishes to be revisited for any subsequent requests, it adds itself to the Path
vector. As expected, the EP should be able to use the ongoing flow for forwarding.
To achieve this, it inserts an identifier, containing information about the flow
from the previous hop, in its Path URI.
Figure 4.2: Explicit probe after no success STUN response received
4.3.1 Flow token
When the EP receives a REGISTER request from a UA, it needs to create an
identifier value that uniquely identifies this flow, and add this identifier to the
user part of its Path URI. The identifier allows the EP to map future requests back
to the correct flow. Moreover, the user's authentication is checked indirectly by
verifying the presence of the identifier returned in a successful registration
response.
SIP outbound proposes the flow token as a flow identifier, and also two algorithms
for stateless flow token mechanisms. For the sake of security, our implementation
uses algorithm 2 proposed in SIP outbound, but modifies its input S by
replacing the local IP and port with the file descriptor, and then encodes the
result with base64.
In SIP outbound, the first algorithm generates a 16-octet token.
Equation 4.1 is for a TCP connection; NTP is the time at which the connection was
created. Equation 4.2 is for a UDP-based transport, so no NTP time is needed, but
the remote IP and port are required.
$$Token = \mathrm{BASE64}_{encode}\left(fileDescriptor \,\|\, NTP\right) \tag{4.1}$$
$$Token = \mathrm{BASE64}_{encode}\left(fileDescriptor \,\|\, remoteIP \,\|\, remotePort\right) \tag{4.2}$$
This algorithm by itself provides no security assurance, so an attacker can easily
hijack another user's call. Unless SIP-level security protection is employed, this
algorithm must not be used. Since security mechanisms at the SIP level are
expensive, we preferred the second algorithm.
T oken = BASE64encode (HM ACSHA1−80 (K, S)||S) (4.3)
In equation 4.3, K is a 20-octet cryptographically random key that is distributed
(it can be obtained from a trusted third party) and shared among EPs. The input S is
formatted as shown in Figure 4.3. We used HMAC-SHA1-80 to compute the
keyed-hash value of S, and then encoded the concatenation of the HMAC of S and
S using base64 encoding. This results in a 32-octet identifier.
In our implementation, we used algorithm 2, but replaced the local IP and local
port fields of S with the file descriptor of the socket.
Figure 4.3: The format of S
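The token construction of equation 4.3 can be sketched in a few lines of Python. This is only an illustration and not the thesis code, which was written in C against OpenSSL. The layout of S used by `pack_s` (file descriptor, remote IPv4 address, remote port) is a hypothetical stand-in, because the exact format of Figure 4.3 is not reproduced here, so the resulting token length may differ from the 32 octets quoted above.

```python
import base64
import hashlib
import hmac
import socket
import struct

HMAC_LEN = 10  # HMAC-SHA1-80 keeps the first 80 bits (10 octets) of the digest


def pack_s(fd: int, remote_ip: str, remote_port: int) -> bytes:
    """Hypothetical layout of S: file descriptor, remote IPv4 address, remote port."""
    return struct.pack("!I4sH", fd, socket.inet_aton(remote_ip), remote_port)


def make_token(key: bytes, s: bytes) -> str:
    """Token = BASE64encode(HMAC-SHA1-80(K, S) || S)."""
    mac = hmac.new(key, s, hashlib.sha1).digest()[:HMAC_LEN]
    return base64.b64encode(mac + s).decode("ascii")


def check_token(key: bytes, token: str):
    """Return S if the token is authentic under key, otherwise None."""
    try:
        raw = base64.b64decode(token, validate=True)
    except ValueError:  # malformed base64
        return None
    if len(raw) <= HMAC_LEN:
        return None
    mac, s = raw[:HMAC_LEN], raw[HMAC_LEN:]
    expected = hmac.new(key, s, hashlib.sha1).digest()[:HMAC_LEN]
    # Constant-time comparison avoids leaking how many octets matched.
    return s if hmac.compare_digest(mac, expected) else None
```

Because S travels inside the token, the EP keeps no per-flow state for validation: it only needs K to recompute the HMAC.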
4.3.2 Forwarding Mechanism
Two kinds of requests traverse the EP. The first kind is an intermediate request,
which is generated by a UA in another domain that has no direct flow to the EP.
The second kind is a request that the EP receives from a UA or from another
EP, depending on the configuration. When an intermediate proxy receives a request
from another EP and it is the host in the topmost Route header field value, the
proxy compares the flow in the flow token with the source of the request. If these
refer to the same flow, the EP removes the Route header and continues processing
the request. If the flow token is invalid, the EP has to reject the request.
Figure 4.4 shows a concrete example. The solid bi-directional arrows indicate
direct flows between entities. The dashed lines denote flows that are established
when needed. UA1 in domain 1 wants to contact UA2 (with any kind of SIP request).
First, UA1 consults its registrar and gets the contact information of UA2 and also
the flow token from the Path header. Then it proxies its request to EP1, which has a
direct flow to it. EP1 finds that it is the topmost host in the Route header and that
the Route header contains a flow token, so EP1 checks whether the flow token is
valid. If so, it applies the normal routing procedure to decide the next hop. We
assume that EP1's next hop is EP2, so it routes the request to EP2. When EP2
receives the request, it checks whether the request contains a valid flow token and
whether the flow token was created by itself. In this example, EP2 notices that the
destination is UA2, which has a direct flow to it. So EP2 sends the request to UA2
through the direct flow.
Figure 4.4: Forwarding mechanism of EPs
EP1 and EP2 process the flow token according to the algorithm they used to
generate it.
If they use algorithm 1: They first decode the user part of the Route header using
base64. Then, for a TCP-based transport, if the connection specified by the file
descriptor matches its creation time, they forward the request over that connection.
For a UDP-based transport, they forward the request using the encoded file
descriptor.
If they use algorithm 2: They decode the flow token in the same way. Then they
verify the HMAC by recomputing it and checking that the two values match.
If the HMACs do not match, the EP should send a 403 (Forbidden) response.
Otherwise, it should forward the request on the flow specified by the
information in the flow identifier. To ensure that mid-dialog requests are routed
over the existing flow, it has been proposed that the EP add a Record-Route entry to
each dialog-initiating request. The Record-Route contains a SIP URI which is
composed of a flow token and a domain name. If this flow no longer exists, the EP
should send a 430 (Flow Failed) response to the requester.
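The decision an EP takes for algorithm 2 can be summarised in a small sketch. It assumes a hypothetical table `open_flows` that maps the decoded S to a live connection object; the function name and the returned status strings are illustrative and do not come from the implementation.

```python
import base64
import hashlib
import hmac


def route_on_flow(key: bytes, token: str, open_flows: dict):
    """Return (action, flow) for a request whose Route user part is token."""
    try:
        raw = base64.b64decode(token, validate=True)
    except ValueError:
        return "403 Forbidden", None
    mac, s = raw[:10], raw[10:]
    expected = hmac.new(key, s, hashlib.sha1).digest()[:10]
    if not s or not hmac.compare_digest(mac, expected):
        # The token was not produced with this key: reject the request.
        return "403 Forbidden", None
    flow = open_flows.get(s)
    if flow is None:
        # The token is authentic but the flow it names has gone away.
        return "430 Flow Failed", None
    return "forward", flow
```

Checking the HMAC before the flow lookup means that a forged token never reaches the connection table.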
4.3.3 Keepalive mechanisms
The EP must also support the keepalive mechanisms presented in section 4.2.3: it
functions as a STUN server for UDP flows and supports TCP keepalive for TCP
connections.
4.4 Registrar behavior
As described in the SIP specification, a SIP client periodically sends a REGISTER
request to a server (known as a SIP registrar) to associate the client's SIP or
SIPS URI with the machine into which the client is currently logged (conveyed as a
SIP or SIPS URI in the Contact header field). The registrar writes this association,
also called a binding, to a database called the location service. A REGISTER request
can add a new binding between an AOR and one or more contact addresses. A
client can also remove previous bindings or query which bindings are
currently in place for an AOR.
SIP outbound updates the definition of a binding given in the SIP specification. The
updated binding behavior, which depends on the presence of instance-id and reg-id,
is shown in Table 4.1.
instance-id   reg-id    Binding behaviour
present       present   Bind an AOR with the combination of instance-id and reg-id
present       absent    Valid; normal binding behaviour
absent        present   Invalid; reg-id is ignored
absent        absent    Normal binding behaviour
Table 4.1: Updated binding behaviour in SIP outbound
According to Table 4.1, a Contact header field value with an instance-id but
no reg-id is still valid. The reverse, a reg-id without an instance-id, is not:
the reg-id parameter is simply ignored when the instance-id is not present.
Moreover, the registrar must be prepared to receive, for the same AOR, some
registrations that use instance-id and reg-id and some that do not. This implies
that the registrar has to work both as a normal SIP registrar and as a registrar
supporting SIP outbound when needed.
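The binding rules above can be expressed as a key function for the registrar's binding table. This is a minimal sketch: the function name and the tuple layout are hypothetical, and treating a Contact that carries only an instance-id like a normal binding is an assumption drawn from the statement that such a Contact "is still valid".

```python
def binding_key(aor, contact_uri, instance_id=None, reg_id=None):
    """Return the key under which a registrar would store this binding."""
    if instance_id is not None and reg_id is not None:
        # Outbound binding: identified by AOR + instance-id + reg-id, so a
        # re-registration over a new flow replaces the old entry.
        return ("outbound", aor, instance_id, reg_id)
    # A reg-id without an instance-id is ignored; everything else falls back
    # to the normal binding, which is keyed by the Contact URI.
    return ("rfc3261", aor, contact_uri)
```

Storing bindings in a dictionary under this key lets both kinds of registration coexist for the same AOR.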
The registrar must store all the Contact header ﬁeld information, and store the
time at which the binding was last updated. If a Path header ﬁeld is present, the
registrar stores this information as well. If the registrar receives a re-registration, it
must update any information that uniquely identiﬁes the network ﬂow over which
the request arrived, and should update the time the binding was last updated.
The registrar must include the 'outbound' option-tag in a Supported header field
value in its responses to REGISTER requests for which it has performed outbound
processing. This explicitly informs EPs and UAs that this registrar supports SIP
outbound.
4.5 Authoritative proxy behavior
The AP entity is present when the location service is needed by the UA. The location
service contains information that allows a proxy to input a URI and receive a set of
zero or more URIs that tell the proxy where to send the request. This information
is created by registrations. As shown in Figure 4.4, UA1 looks up a registration
binding to get the contact information of UA2 by using the location service provided
by the AP, and then sends a request through EP1. An AP selects a contact in the
normal way, with a few additional rules:
• The proxy must not populate the target set with more than one contact with
  the same AOR and instance-id at a time. If a request for a particular AOR and
  instance-id fails with a 430 (Flow Failed) response, the proxy should replace
  the failed branch with another target (if one is available) with the same AOR
  and instance-id, but a different reg-id.
• If the proxy receives a final response from a branch other than a 408 (Request
  Timeout) or a 430 (Flow Failed) response, the proxy must not forward the
  same request to another target representing the same AOR and instance-id.
  The targeted instance has already provided its response.
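The two rules can be illustrated with a short sketch. Bindings are modelled as plain dictionaries with hypothetical keys `instance_id`, `reg_id` and `uri`; the function names are likewise illustrative. Retrying after a 408 as well as after a 430 follows from the second rule, which only forbids a retry for other final responses.

```python
def initial_targets(bindings):
    """Pick at most one contact per instance-id for the first attempt."""
    targets, seen = [], set()
    for b in bindings:
        inst = b.get("instance_id")
        if inst is not None:
            if inst in seen:
                continue  # another flow of the same instance is held in reserve
            seen.add(inst)
        targets.append(b)
    return targets


def replacement_target(bindings, failed, status, tried_reg_ids):
    """After a final response on failed, return another flow to try, or None."""
    if status not in (408, 430):
        return None  # the instance has answered; do not try it again
    for b in bindings:
        if (b.get("instance_id") == failed["instance_id"]
                and b["reg_id"] != failed["reg_id"]
                and b["reg_id"] not in tried_reg_ids):
            return b
    return None
```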
In this chapter, we describe how SIP outbound was implemented as an
extension of the existing SIP framework, and how our implementation was integrated
into the WeSAHMI architecture.
5.1 Open source libraries
We used the open source SIP libraries eXosip and oSIP to build the basic SIP
application routine. To minimize changes to the original libraries' interfaces, we
implemented most SIP outbound features at the application level. That is, all SIP
outbound features, except for keepalive mechanisms such as STUN and TCP keepalive,
were implemented by calling APIs provided by the eXosip library. eXosip2 is an
extension of the oSIP library, which is a low-level SIP library implementing SIP
transactions. The oSIP library provides SIP message parsing and wrappers.
eXosip sends and receives SIP messages in isolation, and creates a separate thread
for the SIP application built upon the eXosip2 and oSIP libraries. The transaction
state machine of the oSIP library calls callback functions to send SIP messages. A
listening socket needs to be initialized in another thread to receive incoming SIP
messages in the application program. eXosip2 provides the implementation of
the callback functions that send outgoing SIP requests over a network transport.
In order to reuse already established connections, eXosip2 looks
up a data structure which stores all the previously active UDP or TCP connections.
The STUN keepalive mechanism and the flow token algorithm were implemented in
separate files. Please refer to appendix A for important modifications and data
structures. Other open source libraries, including OpenSSL, uuid and base64, were
also used to facilitate our implementation. OpenSSL is a cryptographic
implementation of the SSL and TLS protocols and of the DTLS protocol. We used APIs
provided by OpenSSL to construct the HMAC for the flow token.
CHAPTER 5. IMPLEMENTATION 23
5.2 User agent routine
First the UA daemon, also called the client daemon, initializes the eXosip library,
which constructs some important data structures. Then it registers with the
registrar by sending two REGISTER requests through its primary and secondary EPs,
respectively. The registrar may challenge the UA, so the UA should provide its
identity as its credential. Successful registrations are indicated by the UA
receiving 200OK responses. This finally leads to the establishment of two direct
flows between the UA and its EPs. If either of these two flows fails, that is, if
any of the situations described in section 5.2.2 occurs, the UA should use the
backoff mechanism to re-register.
After the initial flows are established, a timer is started, and the UA can
start normal SIP traffic. We assume it sends a SUBSCRIBE to a remote service
provider. The UA first consults the registrar to get the contact information
of the service provider. Then it proxies the request to either of its two EPs using
an established flow. There is no preference as to which EP should be used first. In
our implementation, we always pick the primary EP to proxy requests. A more
intelligent mechanism is discussed in chapter 7. When the timer expires, keepalive
messages are sent. For SIP over UDP, STUN Binding requests are sent (refer to
section 5.4 for details); for TCP or TLS over TCP, the Linux kernel uses TCP
keepalive to keep the flow active.
5.2.1 Termination of a ﬂow
Our system should be able to terminate a flow gracefully. Once the user wants to
terminate SIP communication, he or she can send a REGISTER request with an
Expires header field value of 0. The registrar removes the binding so that no
further requests will be sent to the user's UA.
Depending on the presence of the Contact and Expires headers in the REGISTER
request, the registrar takes different actions, as shown in Table 5.1.
Request headers                            Registration behavior
Contact:*                                  Cancel all registrations
Contact:sip:email@example.com;expires=30   Add URL to current registrations;
                                           registration expires in 30 seconds
Table 5.1: Registration behavior
The REGISTER request may contain an expires parameter in the Contact header
or an Expires header field. According to the SIP specification, a REGISTER request
with a wildcard Contact header field must only be used with an Expires header whose
value is 0, to remove all registrations. The expires parameter in the Contact header
is optional and only indicates the desired expiration time of the registration. If
it is absent, the Contact header uses the Expires header as the default value.
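The precedence between the two expiry sources can be stated compactly. The sketch below is illustrative only; the function names are hypothetical, and the fallback of 3600 seconds used when neither value is present is an assumption standing in for the registrar's local policy.

```python
def effective_expiry(contact_expires, expires_header, default=3600):
    """Requested lifetime of one binding, in seconds."""
    if contact_expires is not None:
        return contact_expires  # per-contact expires parameter wins
    if expires_header is not None:
        return expires_header   # otherwise the Expires header applies
    return default              # assumed registrar default


def removes_all_bindings(contact, expires_header):
    """True for the wildcard form: Contact '*' together with Expires 0."""
    return contact.strip() == "*" and expires_header == 0
```

A UA that wants to terminate a single flow would therefore send its usual Contact with an effective expiry of 0, while the wildcard form clears every binding of the AOR.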
5.2.2 Failures of a ﬂow
Considering the STUN keepalive mechanism and implementation practice, we categorize
the situations that constitute a flow failure as follows:
• 503 (Service Unavailable) response;
• XOR-MAPPED-ADDRESS attribute changes in the STUN Binding Response;
• 408 (Request Timeout) response to a next-hop OPTIONS probe for STUN;
• 430 (Flow Failed) response;
• any transport layer failure, such as a fatal ICMP error;
• failure of a STUN request, such as exhausted STUN retransmissions.
If any of the above situations occurs, that is, if the UA receives any of the above
messages, the UA considers the flow to have failed. It cleans up this flow and
waits for the right time to re-register, using the backoff mechanism.
We implemented the backoff mechanism described in section 4.2.2, so the UA has to
wait for a certain amount of time before it registers again. The UA has to use
the same reg-id as for its previous flow, so that the registrar knows this is a new
flow that replaces the old one.
5.3 TCP keepalive
For SIP over TCP, or SIP over TLS over TCP, we use TCP keepalive. The Linux
kernel supports per-connection TCP keepalive, but it is disabled by default.
We enabled it by setting the SO_KEEPALIVE socket option at the SOL_SOCKET
level. This feature is integrated into the eXosip library. Namely,
when the UA routine calls eXosip_listen_addr with the TCP protocol,
eXosip creates a TCP socket with the keepalive mechanism enabled. In addition, we
still need to configure three TCP keepalive parameters:
• /proc/sys/net/ipv4/tcp_keepalive_time: the number of seconds the keepalive
  routines wait before sending the first keepalive probe;
• /proc/sys/net/ipv4/tcp_keepalive_intvl: the time interval between keepalive
  messages after the first probe;
• /proc/sys/net/ipv4/tcp_keepalive_probes: the number of consecutive probes
  before the connection is marked as broken.
Many alternative methods can also be used to modify these parameters. We
simply picked the one most convenient for us.
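One of the alternative methods mentioned above is to set the parameters per socket instead of system-wide through /proc. The following sketch shows this in Python rather than in the C code of the implementation; the function name and the numeric defaults are illustrative assumptions, and the per-socket TCP options are only applied where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle=30, interval=10, probes=3):
    """Turn on TCP keepalive for one socket and tune it where possible."""
    # Equivalent of setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, ...) in C.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket counterparts of tcp_keepalive_time, tcp_keepalive_intvl and
    # tcp_keepalive_probes; skipped on platforms that do not define them.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

Per-socket settings leave the system defaults untouched, which may matter when other applications on the same host rely on them.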
5.4 STUN keepalive over UDP
5.4.1 Overview of the mechanism
Before addressing more technical details, we must clarify one point that may appear
confusing later. STUN support is relatively independent of SIP outbound. SIP
outbound requires STUN support, but a UA or proxy that supports STUN does not
necessarily need to support SIP outbound. STUN, or more generally a keepalive
mechanism, can therefore be perceived as an extension of SIP. This is one reason why
STUN keepalive was integrated into eXosip as independent files.
As specified in SIP outbound, we implemented a limited version of a STUN client and
server on the SIP UA and the SIP EP, respectively. Only STUN Binding Requests,
Binding Responses, and Error Responses are needed.
The UA must generate STUN keepalive messages towards the EP to refresh the
binding on the NAT before it expires. Rather than using expensive application layer
messages such as SIP messages, the UA sends a STUN Binding request to the EP, to
exactly the same transport address as used for SIP, such as port 5060 or 5061. This
has the effect of keeping the bindings in the NAT alive. The STUN Binding responses
inform the UA that the EP is still responsive, and also inform the UA if its
transport address towards the EP has changed. In our case, a change of transport
address suggests a failure of the flow. The time interval between STUN Binding
requests is a random time between 24 and 29 seconds.
The binding refresh usage requires STUN traffic to be multiplexed on the same
transport address as SIP, so STUN messages must first be separated from SIP
messages. A distinguishing feature is that all STUN messages start with
a first byte of either 0 or 1, whereas the first byte of a SIP packet never has a
value of 0 or 1. This may not suffice if there are valid application layer data
packets which could be confused with STUN packets. STUN therefore defines a special
field called the magic cookie, which has a fixed 32-bit value, 0x2112A442. Even if a
non-STUN packet happens to start with such a byte, there is only a one in 2^32
chance that its second 32-bit word equals the magic cookie.
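The two checks described above translate directly into a classifier for incoming datagrams. This is a sketch with a hypothetical function name, not the parser used in the implementation; it assumes the 20-byte STUN header layout with the magic cookie in bytes 4 to 7.

```python
import struct

MAGIC_COOKIE = 0x2112A442
STUN_HEADER_LEN = 20


def looks_like_stun(datagram: bytes) -> bool:
    """Return True if a datagram received on the SIP port is a STUN message."""
    if len(datagram) < STUN_HEADER_LEN:
        return False
    if datagram[0] > 1:
        # SIP messages start with an ASCII method name or "SIP/2.0".
        return False
    (cookie,) = struct.unpack_from("!I", datagram, 4)
    return cookie == MAGIC_COOKIE
```

Datagrams for which the function returns False are handed to the SIP parser unchanged.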
For SIP over UDP, eXosip opens one UDP socket, which we access through
eXosip.net_interface.net_socket. The eXosip variable is globally visible once the
eXosip library is initialized. STUN messages are sent through this socket
periodically. To reduce processing load on the UA (which is a mobile phone in the
WeSAHMI scenario), all registrations share the same timer. That is, when the timer
expires, the UA traverses all of its registrations and sends a STUN Binding request
for each of them.
5.4.2 STUN server and client
On the STUN server side, the server daemon reads the buffer from a socket and
then checks whether it contains a SIP or a STUN packet. If it is a STUN message, the
daemon passes the message to the STUN message parser instead of the SIP parser.
Depending on the type of the STUN message, the SIP state machine may mark three
kinds of events. These events do not trigger any state transition in the SIP state
machine. They are only used to mark the type of non-SIP message received on the SIP
port. The new events added to the oSIP event types are as follows:
• RCV_BIND_REQUEST: an incoming STUN Binding request
• RCV_BIND_RESPONSE: an incoming STUN Binding response
• RCV_BIND_ERROR_RESPONSE: an incoming STUN Error response
The receiver (either STUN client or server) may generate the above events
after parsing the buffer. If the message is a STUN Binding request, the server
encodes a STUN Binding response including STUN attributes and sends it over the same
flow.
5.4.3 STUN attributes
The attributes that may be present in the attributes field of STUN response messages
are shown in table 5.2:
Value    Name                 Binding Response   Error Response
0x0001   MAPPED-ADDRESS       *
0x0004   SOURCE-ADDRESS       *
0x0005   CHANGED-ADDRESS      *
0x0009   ERROR-CODE                              *
0x000A   UNKNOWN-ATTRIBUTES                      *
0x0020   XOR-MAPPED-ADDRESS   *
Table 5.2: STUN attributes supported by the implementation
After receiving a STUN response with any of the above attributes, the STUN
client decides its next action by checking the attributes present in the response.
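As an example of such a check, the XOR-MAPPED-ADDRESS attribute lets the UA detect that its reflexive transport address has changed. The sketch below decodes the IPv4 form of the attribute value using the encoding defined in the later STUN specification (RFC 5389), where the port is XORed with the upper 16 bits of the magic cookie and the address with the whole cookie. Whether the implementation described here used exactly this encoding is an assumption, and the function names are hypothetical.

```python
import socket
import struct

MAGIC_COOKIE = 0x2112A442


def decode_xor_mapped_address(value: bytes):
    """Decode an IPv4 XOR-MAPPED-ADDRESS attribute value into (ip, port)."""
    _reserved, family, xport, xaddr = struct.unpack("!BBHI", value[:8])
    if family != 0x01:
        raise ValueError("only IPv4 is handled in this sketch")
    port = xport ^ (MAGIC_COOKIE >> 16)
    ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
    return ip, port


def address_changed(previous, value: bytes) -> bool:
    """True if the mapped address differs from the one seen previously."""
    return decode_xor_mapped_address(value) != previous
```

A return value of True from `address_changed` corresponds to the second failure case listed in section 5.2.2.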
5.4.4 STUN retransmission mechanism
Because UDP is a connectionless transport protocol, the reliability of STUN
messages is provided by the STUN client retransmission mechanism. Clients should
retransmit the request starting with an interval of RTO, doubling after each
retransmission.
The initial value for RTO should be configurable; 3 seconds is recommended. The
value of RTO must not be rounded up to the nearest second.
The value of RTO should be cached by an agent after the completion of the
transaction, and used as the starting value for RTO for the next transaction to the
same host. The value should be considered stale and discarded after 10 minutes.
Retransmissions continue until a response is received, or until a total of 7
requests have been sent. If no response is received within 1.6 seconds after the
last request has been sent, the client should consider the flow to have failed.
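The timing that results from these rules can be worked out with a small helper. The function name and its return shape are illustrative; the parameters mirror the values quoted above (seven requests, a final wait of 1.6 seconds), and RTO is left as an argument because it is meant to be configurable.

```python
def retransmission_schedule(rto, max_requests=7, final_wait=1.6):
    """Return (send_times, give_up_time) in seconds from the first request."""
    send_times = []
    t, interval = 0.0, float(rto)
    for _ in range(max_requests):
        send_times.append(t)
        t += interval
        interval *= 2  # the interval doubles after each retransmission
    give_up_time = send_times[-1] + final_wait
    return send_times, give_up_time
```

For an RTO of 1 second the requests are sent at 0, 1, 3, 7, 15, 31 and 63 seconds, so the flow is declared failed after 64.6 seconds without a response.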
5.5 Edge proxy
For the sake of security, our system uses the second algorithm
described in section 4.3.1, since the first algorithm can only be used if the
connection between the EP and the registrar is integrity protected. The second
algorithm uses a keyed HMAC to assure the integrity of the flow token. This is a
cheap and efficient way to protect against malicious modification.
When it decides to generate a flow token according to the mechanism described in
section 4.3.2, the EP first generates a 20-octet random key, and then computes the
keyed hash value of S, formatted according to figure 4.3, with the generated
random key. By calling APIs provided by the OpenSSL library, we obtain a 20-octet
message digest. The EP only uses the first 10 octets and concatenates them
with S. The final step is to apply base64 encoding to the resulting string.
The validation of the token is the reverse procedure. We base64-decode the
token, compute the HMAC of the S extracted from the token, and then check whether
the computed HMAC is identical to the one in the token. We implemented base64
encoding in independent files. The important interfaces can be found in appendix A.
6.1 Experimental infrastructure deployment
The experimental environment used for testing our implementation is shown in
figure 6.1. In the initial stage, the UA is manually configured with two outbound
proxy URIs (the minimal number of URIs required). We ignored DNS and the
location service and used IP addresses directly for the sake of simplicity. Another
open issue, left for future work, is that we did not test the reliability of our
system. Even though we established two direct flows to the UA's two EPs, we did
not test how our system would behave if the primary EP failed and the UA had
to use the secondary EP.
Figure 6.1: Experimental environment
CHAPTER 6. EXPERIMENTATION 30
Figure 6.2: Flow sequence of SIP messages
The solid bi-directional arrows indicate the direct flows between the UA
and the EPs, namely always-active UDP or TCP flows. The dashed bi-directional
arrows indicate indirect flows between the EPs and the registrar, which are
established when needed. We did not deploy APs, since we ignored the location
service.
Figure 6.2 illustrates the basic registration and SUBSCRIBE/NOTIFY procedure
that we tested on our system. In the following sections, we present these
messages in detail.
6.2 Experiment for SIP over UDP with SIP outbound
The UA registers twice with the same registrar, through its primary and secondary
EPs respectively. The REGISTER requests generated by the UA are listed in tables 6.1
and 6.2.
REGISTER sip:10.1.0.7 SIP/2.0
Via: SIP/2.0/UDP 10.1.0.10:5060;rport;branch=z9hG4bK1835142445
CSeq: 1 REGISTER
Table 6.1: REGISTER request proxied to the primary EP
These two REGISTER requests are almost the same, except for the Route headers
and the reg-id parameters in the Contact header fields. In the Route header, we
specified the IP addresses of the two EPs together with two
parameters. In this way, the REGISTER requests are proxied to these two
EPs, and the two parameters indicate that the EPs support loose routing and STUN
keepalive, that is, the EPs can work as STUN keepalive servers. In table 6.2, the
reg-id parameter is set to 2 in the Contact header of the SIP message sent to the
secondary EP. We later use this parameter to distinguish the different flows
established by the same UA. This information is recorded by the registrar together
with the Contact header. The Supported header shows that the UA supports the Path
header, so EPs can later use this function if needed. We used a very simple
authentication mechanism, adding an Auth header to the request. The registrar is
configured to recognize the value of this field, so that requests with different
values are denied. A more intelligent mechanism is expected in future work.
REGISTER sip:10.1.0.7 SIP/2.0
Via: SIP/2.0/UDP 10.1.0.10:5060;rport;branch=z9hG4bK1094232440
CSeq: 2 REGISTER
Table 6.2: REGISTER request proxied to the secondary EP
After receiving the REGISTER requests, the two EPs proxy them
to the registrar and deliver the responses from the registrar to the UA. The
responses received by the UA from the registrar through the two EPs are listed in
tables 6.3 and 6.4.
In tables 6.3 and 6.4, we notice that a new header, the Path header, with three
parameters appears in the responses. This is because the EPs generate a flow token,
insert it into the Path header, and add the Path header to the REGISTER requests
before proxying them to the registrar. The registrar records the flow token as
part of the binding information. Then the registrar forms its responses by copying
the Path header, which thus appears in the 200OK responses received by the UA.
The value of the Supported header is set to outbound, indicating that SIP outbound
is supported.
After the UA receives the two 200OK responses, it sends the SUBSCRIBE request
shown in table 6.5 to its content service provider, the Notifier, through its
primary EP. Whether the primary or the secondary EP is used is decided randomly. If
one EP fails, the UA can use the other one. In our experiment, the logical Notifier
is physically hosted on the registrar. Compared to the REGISTER request, a new field
is attached to the first parameter of the Route header. It is the flow token that
the UA extracted from the Path header of the 200OK response. We do not list the
response to the SUBSCRIBE request, since it is essentially a normal SIP response.
SIP/2.0 200 OK
Via: SIP/2.0/UDP 10.1.0.10:5060;rport=5060;branch=z9hG4bK1835142445
CSeq: 1 REGISTER
Table 6.3: 200OK response received by the UA from the primary EP
Table 6.6 lists the NOTIFY request sent by the Notifier. Similarly, we notice the
flow token in the Route header. This request is forwarded to the UA's primary EP,
which sends it to its final destination by parsing the flow token to find the exact
flow.
6.2.1 Experiment for STUN keepalive
After the first successful registration, we set the STUN keepalive interval to a
random time between 24 and 29 seconds. The UA then sends STUN Binding requests
to its two EPs. Table 6.7 lists the parsed binary data of such a request in a
human-readable form. As can be seen, we did not give any value for the attributes
field. This field may be used later when errors occur in STUN messages. For the rest
of the STUN message, we simply padded zeros to align it to 20 bytes. The data
structure used in the program is listed in appendix A.
SIP/2.0 200 OK
Via: SIP/2.0/UDP 10.1.0.10:5060;rport=5060;branch=z9hG4bK1094232440
CSeq: 2 REGISTER
Table 6.4: 200OK response received by the UA from the secondary EP
The STUN Binding response is similar to the Binding request, except that the
STUN message type field is 0x0101.
6.3 Experiment TCP keepalive
TCP keepalive is supported by the Linux kernel. We enabled the TCP keepalive feature
in our code as described in section 5.3. We captured the TCP keepalive replies,
which were segments with the ACK flag set and no data.
SUBSCRIBE sip:firstname.lastname@example.org SIP/2.0
Via: SIP/2.0/UDP 10.1.0.10:5060;rport;branch=z9hG4bK1001964849
CSeq: 20 SUBSCRIBE
Table 6.5: SUBSCRIBE request sent by the UA to its Notiﬁer.
NOTIFY sip:email@example.com:5060 SIP/2.0
Via: SIP/2.0/UDP 10.1.0.11:5060;rport;branch=z9hG4bK1883640899
CSeq: 21 NOTIFY
Table 6.6: NOTIFY request sent from the Notiﬁer to the UA.
Name                     Length    Value
Header, first two bits   2 bits    0
Message type             2 bytes   0x0001
Message length           2 bytes   0x0000
Magic cookie             4 bytes   0x2112A442
Table 6.7: STUN Binding request.
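The header listed in table 6.7 can be assembled as follows. This is a sketch and not the data structure from appendix A: the function names are hypothetical, and the default of a random 12-byte transaction ID is an assumption. Passing `bytes(12)` reproduces the zero padding described in the text.

```python
import os
import random
import struct

BINDING_REQUEST = 0x0001
MAGIC_COOKIE = 0x2112A442


def build_binding_request(transaction_id=None) -> bytes:
    """Build a 20-byte STUN Binding request without attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    # Message type, message length (0: no attributes), magic cookie.
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id


def next_keepalive_delay() -> float:
    """Random keepalive interval between 24 and 29 seconds."""
    return random.uniform(24, 29)
```

Since the message type is below 0x4000, the two most significant bits of the first byte are zero, which matches the first row of the table.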
SIP outbound does not specify the configuration mechanism for outbound proxy
registration URIs. The configuration procedure can be considered an
implementation practice issue. A trusted third party can be used to distribute the
outbound-proxy-set to UAs in the initial stage. In the WeSAHMI scenario, the WeSAHMI
server, which provides a backbone for the whole platform, can be used as the third
party.
Each URI in the outbound-proxy-set can be resolved to several different physical
hosts. This means that one URI represents one logical EP, but one logical EP can be
deployed on several physical hosts. This kind of deployment enhances scalability
and reliability, since the failure of a single server cannot bring down the whole
system. To deploy the system in this fashion, a DNS service is needed so that the
various URIs in the outbound-proxy-set do not resolve to the same host.
Every UA may have at least two and up to four logical EPs. Which one to choose
for proxying requests is not specified in the SIP outbound draft. In our
implementation, we simply pick the primary EP to proxy requests unless it fails. But
in a large system with many UAs, the primary EP may become overloaded while the
other EPs remain idle.
To optimize the system, we may design a way to distribute the workload evenly. We
might limit the number of direct flows from an EP to UAs. When this limit
is reached, the EP refuses a UA's connection and responds with a
message informing the UA to try another EP in its outbound-proxy-set. This
response message may be a 200OK SIP response with a special header that
distinguishes it from normal responses to requests. The value of the limit on direct
flows should be decided by practical measurement or by a mathematical model.
CHAPTER 7. DISCUSSION 38
We only implemented STUN over UDP, so client retransmission is desirable to
achieve reliability. STUN is transparent to transport protocols, so it is possible
to implement it over TCP. If we implement STUN over TCP, we do not need to add
client retransmission to STUN, since TCP is a reliable, connection-oriented
protocol.
In this thesis, we addressed the SIP outbound protocol and its applications, and
described our implementation of SIP outbound as a component of the WeSAHMI
system. SIP outbound, as an extension of SIP, updates several behaviors of general
SIP and makes NAT traversal possible. We then described how our
implementation was integrated into the WeSAHMI architecture and how it worked
with the whole system. Finally, we designed several experiments to evaluate
our implementation. The experiments mainly concern the client-initiated connection
features of SIP outbound and the keepalive mechanisms.
During the implementation, most of the difficulties we encountered were due to
the lack of documentation for the open source libraries, including eXosip and
oSIP. This may be a common problem for most open source developers. Our
implementation is built at the application level on top of these two libraries, so
in principle it is enough to know what application programming interfaces (APIs)
they provide. But the documentation is neither clear nor sufficient about how to use
these APIs, so we had to inspect the source code thoroughly. It was time-consuming
to go through such a large amount of source code. However, it helped us learn how
SIP transactions are implemented in the library. With this knowledge,
we may later be able to integrate all the SIP outbound features into the library, so
that other application developers can use the library directly to build SIP
applications which support the SIP outbound extension.
CHAPTER 8. CONCLUSIONS 40
8.2 Future work
Our implementation only realized STUN keepalive over UDP and enabled TCP
keepalive in the kernel. SIP outbound also proposes CRLF keepalive. To make our
system more intelligent, in the future we may let the UA select a keepalive approach
according to its transport protocol and preferences.
In our experimentation, we colocated the registrar and the notifier on one physical
host. For the logical registrar, we stored the binding information in main memory
instead of a database or any persistent storage. This was only a temporary solution
for the registrar, which should be improved in the future. To write the binding
information to files, we need to consider how to format the information so that it
is easy to look up.
Since STUN keepalive is transparent to the transport protocol, we may also extend it
to TCP connections. A performance evaluation may then be done in comparison
with the kernel-enabled TCP keepalive. We may also implement the client STUN
retransmission mechanism for STUN over UDP to achieve higher reliability.
Scalability is also expected of the SIP outbound system. To achieve high
scalability and fault tolerance, multiple physical hosts may be deployed for one
logical EP entity. This may need an extra mechanism such as DNS SRV. Moreover, an
individual timer for each registration should be set when the registrar creates its
binding.
SIP outbound also mentions SigComp compression. When SigComp
is applied, both communicating endpoints need to perform compression and
decompression. This feature will be desirable, since a SIP message may reach
two thousand bytes or more, which is too large for wireless transmission.
 Understanding SIP. Internet, 2007. www.sipcenter.com/sip.nsf/.
 WeSAHMI System Speciﬁcation, 2007.
 P. Vixie A. Gulbrandsen and L. Esibov. A DNS RR for specifying the location
of services (DNS SRV). Network Working Group, 2000.
 Shoma Chakravarty Abhijit Sur, Dean Skidmore. Web services based SOA for
next generation telecom networks. In IEEE international conference on services
computing, page 520, 2006.
 C. Jennings and R. Mahy. Managing Client Initiated Connections in the Session
Initiation Protocol. Internet Draft (work in progress), Internet Engineering
Task Force, 2007.
 Marina del Rey. Internet Protocol. Network Working Group, September, 1981.
 E. Rescorla and N. Modadugu. Datagram Transport Layer Security. Network
Working Group, April 2006.
 J. Rosenberg et al. SIP: Session Initiation Protocol RFC 3261. Internet Engi-
neering Task Force, 2002.
 M. Handley, V. Jacobson, and C. Perkins. SDP: Session Description Protocol.
Network Working Group, July 2006.
 Henry Sinnreich and Alan B. Johnston. Internet Communication Using SIP. 1st
edition, October 2001.
 P. Matthews D. Wing J. Rosenberg, R. Mahy. Traversal Using Relays around
NAT (TURN): Relay Extensions to Session Traversal Utilities for NAT (STUN).
Internet Engineering Task Force, 2007.
 R. Mahy J. Rosenberg, C. Huitema and D. Wing. Simple Traversal Under-
neath Network Address Translators (NAT) (STUN). Internet Draft (work in
progress), Internet Engineering Task Force, 2006.
 K. Johns. Routing of mid dialog requests using sip-outbound. Internet Draft
(work in progress), Internet Engineering Task Force, 2006.
 Alan B. Johnston. Understanding the Session Initiation Protocol. 1st edition,
 S. Josefsson. The Base16, Base32, and Base64 Data Encodings RFC 3548.
Internet Draft (work in progress), Internet Engineering Task Force, 2006.
 H. Krawczyk. HMAC: Keyed-Hashing for Message Authentication RFC 2104.
Internet Engineering Task Force, 1997.
 Kundan Singh Milind Buddhikot, Adiseshu Hari and Scott Miller. MobileNAT:
A New Technique for Mobility Across Heterogeneous Address Spaces. Mobile
Networks and Applications, 10(3), 2005.
 David L. Mills. Computer Network Time Synchronization: The Network Time
Protocol. 1st edition, March 2006.
 P. Leach, M. Mealling, and R. Salz. A Universally Unique IDentifier (UUID) URN
Namespace RFC 4122. Internet Engineering Task Force, 2005.
 P. Srisuresh and M. Holdrege. IP Network Address Translator (NAT) Terminology
and Considerations RFC 2663. Network Working Group, 1999.
 P. Srisuresh and M. Holdrege. IP Network Address Translator (NAT) Terminology
and Considerations. Internet Draft (work in progress), Internet Engineering
Task Force, August 1999.
 Jonathan B. Postel. Simple Mail Transfer Protocol. Network Working Group,
 R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and
T. Berners-Lee. Hypertext Transfer Protocol - HTTP/1.1. Network Working Group,
June 1999.
 R. Price, C. Bormann, J. Christoffersson, H. Hannu, and Z. Liu. Signaling
Compression (SigComp). Network Working Group, January 2003.
 Howard Rheingold. Smart Mobs: The Next Social Revolution. 1st edition,
 J. Rosenberg. Interactive Connectivity Establishment (ICE): A Methodology for
Network Address Translator (NAT) Traversal for Offer/Answer Protocols.
Internet Draft (work in progress), Internet Engineering Task Force, 2005.
 J. Rosenberg. Interactive Connectivity Establishment (ICE): A Protocol for
Network Address Translator (NAT) Traversal for Offer/Answer Protocols.
Internet Draft (work in progress), Internet Engineering Task Force, 2007.
 Saikat Guha, Yutaka Takeda, and Paul Francis. NUTSS: A SIP-based Approach
to UDP and TCP Network Connectivity. ACM SIGCOMM, 2004.
 What is SIP? Internet, 2007. http://www.sipcenter.com/sip.nsf/html/Background.
 Robert Sparks. SIP Basics and Beyond. ACM Press, 2007.
 W. Richard Stevens. UNIX Network Programming Volume 1 Networking APIs:
Sockets and XTI. 2nd edition, January 1998.
 T. Dierks and E. Rescorla. The Transport Layer Security (TLS) Protocol Version
1.1. Network Working Group, April 2006.
 V. Paxson and M. Allman. Computing TCP's Retransmission Timer RFC 2988.
Internet Engineering Task Force, 2000.
 Samir Chatterjee Victor Paulsamy. Network Convergence and the
NAT/Firewall Problems. In System Sciences, 2003. Proceedings of the 36th
Annual Hawaii International Conference, page 10, 2003.
 D. Willis and B. Hoeneisen. Session Initiation Protocol (SIP) Extension Header
Field for Registering Non-Adjacent Contacts RFC 3327. Internet Engineering
Task Force, 2002.
A.1 Important data structures
STUN message header data structure and STUN message data structure:
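For orientation, a STUN header in the classic RFC 3489 layout (16-bit message type, 16-bit message length, 128-bit transaction ID) can be sketched as below. This is an illustrative assumption, not the stun_msg_t definition used in the modified library: the type and function names are hypothetical. The later STUN drafts split the 16 transaction bytes into a magic cookie and a 96-bit transaction ID.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STUN_HEADER_LEN      20      /* 2 + 2 + 16 bytes on the wire */
#define STUN_BINDING_REQUEST 0x0001  /* message type defined in RFC 3489 */

/* Hypothetical layout; not the stun_msg_t of the modified library. */
typedef struct {
    uint16_t msg_type;            /* e.g. STUN_BINDING_REQUEST */
    uint16_t msg_length;          /* length of the attributes after the header */
    uint8_t  transaction_id[16];  /* chosen by the client, echoed by the server */
} stun_header_sketch_t;

/* Serialize the header in network byte order.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t stun_header_encode(const stun_header_sketch_t *h,
                          uint8_t *buf, size_t buflen)
{
    if (buflen < STUN_HEADER_LEN)
        return 0;
    buf[0] = (uint8_t)(h->msg_type >> 8);
    buf[1] = (uint8_t)(h->msg_type & 0xff);
    buf[2] = (uint8_t)(h->msg_length >> 8);
    buf[3] = (uint8_t)(h->msg_length & 0xff);
    memcpy(buf + 4, h->transaction_id, sizeof h->transaction_id);
    return STUN_HEADER_LEN;
}
```

A keepalive Binding Request without attributes consists of this header alone, with msg_length set to zero.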
APPENDIX A. APPENDIX 45
A.2 Important modifications to the eXosip and osip libraries

Library   File name              Function name
eXosip    eXconf.c               eXosip_keep_alive
          eXregister_api.c       eXosip_register_send_register
          eXtransport.c          eXosip_tcp_connect_socket
          udp.c                  eXosip_read_message
          stun.c; stun.h         new files
          base64.c; base64.h     new files
osip      osipevent.c            osip_message_parse
          osip_message_parse.c   pro_stunmsg; compare_addr

Table A.1: Modifications to the eXosip and osip libraries
A.3 APIs for base64 encoding
#include <stdbool.h>  /* bool */
#include <stddef.h>   /* size_t */

void base64_encode (const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t outlen);
bool base64_decode (const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t *outlen);
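To illustrate what an encoder with the signature above does, a minimal sketch follows. It is not the code in base64.c: the name sketch_base64_encode and the buffer policy (write whole 4-character groups only, NUL-terminate if room remains) are assumptions made for this example.

```c
#include <stddef.h>

static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical encoder: writes at most outlen bytes to out, in whole
 * 4-character groups, and NUL-terminates only if there is room left. */
void sketch_base64_encode(const unsigned char *in, size_t inlen,
                          unsigned char *out, size_t outlen)
{
    size_t i = 0, o = 0;

    while (i < inlen && o + 4 <= outlen) {
        size_t left = inlen - i;
        /* Pack up to three input bytes into a 24-bit value. */
        unsigned long v = (unsigned long)in[i] << 16;
        if (left > 1) v |= (unsigned long)in[i + 1] << 8;
        if (left > 2) v |= in[i + 2];

        /* Emit four 6-bit symbols, padding with '=' at the end. */
        out[o++] = (unsigned char)b64_alphabet[(v >> 18) & 0x3f];
        out[o++] = (unsigned char)b64_alphabet[(v >> 12) & 0x3f];
        out[o++] = left > 1 ? (unsigned char)b64_alphabet[(v >> 6) & 0x3f] : '=';
        out[o++] = left > 2 ? (unsigned char)b64_alphabet[v & 0x3f] : '=';
        i += left > 2 ? 3 : left;
    }
    if (o < outlen)
        out[o] = '\0';
}
```

For n input bytes the caller needs an output buffer of at least 4 * ((n + 2) / 3) + 1 bytes to receive the complete, terminated string.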
A.4 APIs for STUN keepalive
int stun_parse_message( char* buf, unsigned int bufLen,
                        stun_msg_t *pmsg, int verbose);
unsigned int stun_encode_message( const stun_msg_t msg, char* buf,
                                  unsigned int bufLen, int verbose);