VoIP Terminology and Concepts
A
• Anti-tromboning
B
• Back-to-back user agent
C
• Call origination
• Chatter bug
D
• Downstream QoS
I
• IP Multimedia Subsystem
• Internet telephony service
L
• Lawful interception
M
• Media Gateway Control
P
• Packet Loss Concealment
• PacketCable
• Purple minutes
R
• Real-time Transport Protocol
S
• Session Border
• Session Initiation
• Signaling gateway
• Soft phone
T
• Traversal Using Relay NAT
V
• Vamming
• Vishing
• VoIP VPN
• VoIP phone
• VoIP spam
• Voice chat
• Voice over IP
• Voice
Retrieved from "http://en.wikipedia.org/wiki/Category:VoIP_terminology_
DHN 9-18-07 1
Anti-tromboning (also referred to as anti-hairpinning or media release) is a feature
employed in Voice over IP networks that optimizes the use of the access network. A
Session Border Controller handling calls as they pass from the access network to the
core network can examine the IP addresses of both the calling and called parties. If they
reside in the same part of the network, the media path can be “released”, allowing media to
flow directly between the two parties without entering the core network. The benefits
of this action are twofold: 1) the caller does not pay for any bandwidth usage on the
carrier network and 2) the carrier's network is less congested.
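The address comparison described above can be sketched in a few lines. This is only an illustration, assuming the Session Border Controller knows the access network as a list of subnets; the function name `can_release_media` and the subnet values are hypothetical and not taken from any product.

```python
import ipaddress

def can_release_media(caller_ip, callee_ip, access_subnets):
    """Return True when both endpoints sit in the same access subnet."""
    caller = ipaddress.ip_address(caller_ip)
    callee = ipaddress.ip_address(callee_ip)
    for subnet in access_subnets:
        net = ipaddress.ip_network(subnet)
        if caller in net and callee in net:
            return True  # media can flow directly between the parties
    return False  # media has to be anchored by the border controller

# Hypothetical access-network subnets known to the border controller
SUBNETS = ["10.1.0.0/16", "10.2.0.0/16"]
print(can_release_media("10.1.4.7", "10.1.9.20", SUBNETS))
print(can_release_media("10.1.4.7", "10.2.0.5", SUBNETS))
```

A real deployment would also have to account for NAT and routing policy, which this sketch ignores.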
The Back-to-Back User Agent (B2BUA) acts as a user agent to both ends of a Session
Initiation Protocol (SIP) call. The B2BUA is responsible for handling all SIP signalling
between both ends of the call, from call establishment to termination. Each call is tracked
from beginning to end, allowing the operators of the B2BUA to offer value-added
features to the call.
To SIP clients, the B2BUA acts as a User Agent server on one side and as a User Agent
client on the other (back-to-back) side. The basic implementation of a B2BUA is defined
in RFC 3261. The B2BUA may provide the following functionalities:
• call management (billing, automatic call disconnection, call transfer, etc.)
• network interworking (perhaps with protocol adaptation)
• hiding of network internals (private addresses, network topology, etc.)
• codec translation between two call legs
Because it maintains call state for all SIP calls it handles, failure of a B2BUA affects all
these calls. Often, B2BUAs also terminate and bridge the media streams to have full
control over the whole session.
A signaling gateway (part of a Session Border Controller) and the Asterisk PBX are good
examples of a B2BUA.
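The two-sided behaviour can be illustrated with a toy model that only tracks call state. It is a sketch under simplifying assumptions: no SIP parsing, no media handling, and the class and method names are invented for this example rather than taken from RFC 3261 or Asterisk.

```python
import uuid

class B2BUA:
    """Toy back-to-back user agent: one inbound and one outbound leg per call."""

    def __init__(self):
        self.calls = {}  # inbound Call-ID -> outbound Call-ID

    def on_invite(self, inbound_call_id):
        # Answer the caller as a user agent server, then start a separate
        # dialog towards the callee as a user agent client.
        outbound_call_id = uuid.uuid4().hex
        self.calls[inbound_call_id] = outbound_call_id
        return outbound_call_id

    def on_bye(self, inbound_call_id):
        # Forget the call and return the outbound leg that must also be ended.
        return self.calls.pop(inbound_call_id, None)

    def active_calls(self):
        return len(self.calls)
```

Because the outbound leg gets its own identifier, the callee never sees the caller's dialog details, which is one way the hiding of network internals listed above can come about. The `calls` table is also the state that is lost if the B2BUA fails.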
Call origination, also known as voice origination, refers to collecting the calls
initiated by a calling party on a PSTN telephone exchange and handing them off to
a VoIP endpoint, or to another exchange or telephone company, for completion to a called
party.
In the VoIP world, the opposite of call origination is call termination, where a call
initiated as a VoIP call is terminated to the PSTN.
The term is often used in referring to a VoIP trunking service.
Chatter Bug is a service and hardware device used to route long distance voice calls
over an IP network.
The small hardware device plugs in between the telephone and the phone line. It detects
long distance calls and automatically routes them to the Chatter Bug VoIP service, without
the need for a separate or high-speed Internet provider.
The service routes calls over a VoIP network and terminates them on normal telephones.
Downstream QoS (see QoS) is a technique that enhances VoIP calls by
improving the clarity of incoming voice. When data floods the caller's downstream
Internet-access line, Downstream QoS can throttle incoming data to ensure time-sensitive
voice traffic gets through promptly to the listener. One notable version of the technology
was pioneered by Patton Electronics Co. for their SmartNode brand of VoIP equipment.
Other versions can be found in VoIP equipment from Cisco Systems and application
software from Packeteer.
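The throttling idea can be shown with simple arithmetic: reserve room for voice and cap bulk data at whatever is left. This is a sketch, not a description of how any vendor implements it. The function name and the default of 100 kbit/s per call are assumptions chosen for illustration.

```python
def data_rate_cap_kbps(link_kbps, active_calls, per_call_kbps=100):
    """Bandwidth left for bulk data after reserving room for voice.

    per_call_kbps is an assumed per-call reservation, not a measured value.
    """
    reserved = active_calls * per_call_kbps
    return max(link_kbps - reserved, 0)

# A 1000 kbit/s downstream link with two calls in progress
print(data_rate_cap_kbps(1000, 2))
```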
Standard upstream QoS
Upstream QoS improves voice quality for the remote side of a
VoIP call by transmitting upstream voice traffic at a higher priority than data. Applying
upstream QoS at both ends of the call improves voice quality for both users. Yet
upstream QoS still leaves the caller vulnerable to downstream data surges: during large
file downloads, voice quality may degrade.
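Sending voice ahead of data amounts to a priority queue on the upstream link. The sketch below uses strict priority with first-in, first-out order inside each class. The class name `UpstreamScheduler` and the two traffic classes are assumptions for illustration, and real equipment may use other schemes such as weighted queues.

```python
import heapq
import itertools

VOICE, DATA = 0, 1  # lower number is sent first

class UpstreamScheduler:
    """Strict-priority packet queue: voice always leaves before data."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # preserves arrival order within a class

    def enqueue(self, priority, packet):
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

Queuing two data packets and two voice packets in interleaved order and then draining the queue returns both voice packets first, each class in arrival order.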
Some Internet providers may offer services similar to Downstream QoS as a higher-priced,
premium service. VoIP systems that combine upstream and downstream QoS
mechanisms reduce the user's dependence on the service provider while providing local
control over total voice quality.
Retrieved from "http://en.wikipedia.org/wiki/Downstream_QoS"
The IP Multimedia Subsystem (IMS) is an architectural framework for delivering
internet protocol (IP) multimedia to mobile users. It was originally designed by the
wireless standards body 3rd Generation Partnership Project (3GPP), and is part of the
vision for evolving mobile networks beyond GSM. Its original formulation (3GPP R5)
represented an approach to delivering "Internet services" over GPRS. This vision was
later updated by 3GPP, 3GPP2 and TISPAN by requiring support of networks other than
GPRS, such as Wireless LAN, CDMA2000 and fixed line.
To ease the integration with the Internet, IMS as far as possible uses IETF (i.e. Internet)
protocols such as Session Initiation Protocol (SIP). According to the 3GPP, IMS is not
intended to standardize applications itself but to aid the access of multimedia and voice
applications across wireless and wireline terminals, i.e. aid a form of fixed mobile
convergence (FMC). This is done by having a horizontal control layer that isolates the
access network from the service layer. Services need not have their own control
functions, as the control layer is a common horizontal layer.
Alternative and overlapping technologies for access and provision of services across
wired and wireless networks depend on the actual requirements, and include
combinations of Generic Access Network, soft switches and "naked" SIP. This makes the
business use of IMS less appealing. It is easier to sell services than to sell the virtues of
"integrated services", yet services for IMS have not been prolific.
Since IMS was conceived, it has become increasingly easy to access content
and contacts using mechanisms outside the control of traditional wireless/fixed operators,
and so those operators are likely to reconsider their strategies. Although it is expected
that eventually IP will be available on all mobile phones and operators, it is not clear how
much of the 3GPP/3GPP2/TISPAN IMS as it exists today will be deployed. "Early IMS"
might be used in IMS implementations that do not yet support all "Full IMS"
requirements, although it's not clearly defined what differences there might be (IPv4
support instead of IPv6 is often mentioned).
• 1 History
• 2 Architecture
   o 2.1 Access network
   o 2.2 Core network
      2.2.1 Home subscriber server
         2.2.1.1 User identities
      2.2.2 Call/session control
      2.2.3 Application servers
      2.2.4 Media Servers
      2.2.5 Breakout Gateway
      2.2.6 PSTN Gateways
      2.2.7 Media Resources
   o 2.3 Charging
   o 2.4 Interfaces description
• 3 Security aspects of early IMS systems
• 4 Specifications
   o 4.1 3GPP Specs
   o 4.2 IETF Specs
• 5 See also
• 6 References
• 7 External links
• 8 Books
• IMS was originally defined by an industry forum called 3G.IP, formed in 1999.
3G.IP developed the initial IMS architecture, which was brought to the 3rd
Generation Partnership Project (3GPP), as part of their standardization work for
3G mobile phone systems in UMTS networks. It first appeared in release 5
(evolution from 2G to 3G networks), when SIP-based multimedia was added.
Support for the older GSM and GPRS networks was also provided.
• 3GPP2 (a different organization) based their CDMA2000 Multimedia Domain
(MMD) on 3GPP IMS, adding support for CDMA2000.
• 3GPP release 6 added interworking with WLAN.
• 3GPP release 7 added support for fixed networks, by working together with
TISPAN release R1.1.
3GPP / TISPAN IMS Architectural Overview
The IP Multimedia Core Network Subsystem is a collection of different functions, linked
by standardized interfaces, which together form one IMS administrative network. A
function is not a node (hardware box): an implementer is free to combine two functions in one
node, or to split a single function into two or more nodes. Each node can also be present
multiple times in a single network, for load balancing or organizational reasons.
The user can connect to an IMS network in various ways, all of which use the standard
Internet Protocol (IP). Direct IMS terminals (such as mobile phones, personal digital
assistants (PDAs) and computers) can register directly on an IMS network, even when
they are roaming in another network or country (the visited network). The only
requirement is that they can use IPv6 (also IPv4 in early IMS) and run Session Initiation
Protocol (SIP) user agents. Fixed access (e.g., Digital Subscriber Line (DSL), cable
modems, Ethernet), mobile access (e.g. W-CDMA, CDMA2000, GSM, GPRS) and
wireless access (e.g. WLAN, WiMAX) are all supported. Other phone systems, such as plain
old telephone service (POTS, the old analogue telephones), H.323 and non-IMS-compatible
VoIP systems, are supported through gateways.
Home subscriber server
The Home Subscriber Server (HSS), or User Profile Server Function (UPSF), is a master
user database that supports the IMS network entities that actually handle calls. It contains
the subscription-related information (user profiles), performs authentication and
authorization of the user, and can provide information about the user's physical location.
It is similar to the GSM Home Location Register (HLR) and Authentication Centre
(AuC).
An SLF (Subscriber Location Function) is needed to map user addresses when multiple
HSSs are used. Both the HSS and the SLF communicate through the DIAMETER
protocol.
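Conceptually, the SLF is a lookup from a user address to the HSS that holds that user's data. The sketch below models only that mapping. The class name, the host names and the in-memory dictionary are assumptions for illustration, and the DIAMETER exchange itself is not modelled.

```python
class SubscriberLocationFunction:
    """Maps a user identity to the HSS that holds its profile."""

    def __init__(self, assignments):
        self._assignments = dict(assignments)

    def locate_hss(self, identity):
        try:
            return self._assignments[identity]
        except KeyError:
            raise LookupError(f"no HSS known for {identity}") from None

# Hypothetical deployment with two HSS instances
slf = SubscriberLocationFunction({
    "sip:alice@example.org": "hss1.example.org",
    "sip:bob@example.org": "hss2.example.org",
})
print(slf.locate_hss("sip:bob@example.org"))
```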
Normal 3GPP networks use the following identities:
• International Mobile Subscriber Identity (IMSI)
• Temporary Mobile Subscriber Identity (TMSI)
• International Mobile Equipment Identity (IMEI)
• Mobile Subscriber ISDN Number (MSISDN)
IMSI is a unique subscriber identity that is stored in the SIM. To improve privacy, a TMSI is
generated per geographical location. While IMSI/TMSI are used for user identification,
the IMEI is a unique device identity and is phone specific. The MSISDN is the telephone
number of a user.
IMS also requires an IP Multimedia Private Identity (IMPI) and an IP Multimedia Public
Identity (IMPU). Neither is a phone number or other series of digits; both are Uniform
Resource Identifiers (URIs), which can be digits (a tel-uri, like tel:+1-555-123-4567) or
alphanumeric identifiers (a sip-uri, like sip:firstname.lastname@example.org). There can be
multiple IMPUs per IMPI (often a tel-uri and a sip-uri). The IMPU can also be shared with
another phone, so both can be reached with the same identity (for example, a single
phone-number for an entire family).
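The one-to-many relation between a private identity and its public identities can be sketched as a small data structure. The names `classify_public_identity` and `Subscription` and the sample identities are hypothetical. Only the URI scheme is checked, not the full tel or SIP URI syntax.

```python
def classify_public_identity(uri):
    """Return 'tel' or 'sip' for a public identity URI, else raise ValueError."""
    scheme, sep, rest = uri.partition(":")
    if not sep or not rest:
        raise ValueError(f"not a URI: {uri!r}")
    scheme = scheme.lower()
    if scheme not in ("tel", "sip"):
        raise ValueError(f"unsupported scheme: {scheme}")
    return scheme

class Subscription:
    """One private identity (IMPI) with any number of public identities (IMPU)."""

    def __init__(self, impi):
        self.impi = impi
        self.impus = []

    def add_impu(self, uri):
        classify_public_identity(uri)  # validate the scheme before storing
        self.impus.append(uri)

sub = Subscription("alice-private@example.org")
sub.add_impu("tel:+1-555-123-4567")
sub.add_impu("sip:alice@example.org")
```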
The HSS user database contains the IMPU, IMPI, IMSI, MSISDN and other
information.
Several roles of Session Initiation Protocol (SIP) servers or proxies, collectively called
Call Session Control Function (CSCF), are used to process SIP signalling packets in the
IMS.
• A Proxy-CSCF (P-CSCF) is a SIP proxy that is the first point of contact for the
IMS terminal. It can be located either in the visited network (in full IMS
networks) or in the home network (when the visited network isn't IMS compliant
yet). Some networks may use a Session Border Controller for this function. The
terminal discovers its P-CSCF either through DHCP or through the PDP
Context (in General Packet Radio Service (GPRS)).
o it is assigned to an IMS terminal during registration, and does not change
for the duration of the registration
o it sits on the path of all signalling messages, and can inspect every message
o it authenticates the user and establishes an IPsec security association with
the IMS terminal. This prevents spoofing attacks and replay attacks and
protects the privacy of the user. Other nodes trust the P-CSCF, and do not
have to authenticate the user again.
o it can also compress and decompress SIP messages using SigComp, which
reduces the round-trip time over slow radio links
o it may include a Policy Decision Function (PDF), which authorizes media
plane resources e.g. quality of service (QoS) over the media plane. It's
used for policy control, bandwidth management, etc. The PDF can also be
a separate function.
o it also generates charging records
• A Serving-CSCF (S-CSCF) is the central node of the signalling plane. It is a SIP
server, but performs session control too. It is always located in the home network.
It uses DIAMETER Cx and Dx interfaces to the HSS to download and upload
user profiles — it has no local storage of the user. All necessary information is
loaded from the HSS.
o it handles SIP registrations, which allows it to bind the user location (e.g.
the IP address of the terminal) and the SIP address
o it sits on the path of all signaling messages, and can inspect every message
o it decides to which application server(s) the SIP message will be
forwarded, in order to provide their services
o it provides routing services, typically using Electronic Numbering (ENUM) lookups
o it enforces the policy of the network operator
o there can be multiple S-CSCFs in the network for load distribution and
high availability reasons. It's the HSS that assigns the S-CSCF to a user,
when it's queried by the I-CSCF.
• An I-CSCF (Interrogating-CSCF) is another SIP function located at the edge of
an administrative domain. Its IP address is published in the Domain Name System
(DNS) of the domain (using NAPTR and SRV type of DNS records), so that
remote servers can find it, and use it as a forwarding point (e.g. registering) for
SIP packets to this domain. The I-CSCF queries the HSS using the DIAMETER
Cx interface to retrieve the user location (Dx interface is used from I-CSCF to
SLF to locate the needed HSS only), and then routes the SIP request to its
assigned S-CSCF. Up to Release 6 it can also be used to hide the internal network
from the outside world (encrypting part of the SIP message), in which case it's
called a THIG (Topology Hiding Inter-network Gateway). From Release 7
onwards this "entry point" function is removed from the I-CSCF and is now part
of the IBCF (Interconnection Border Control Function). The IBCF is used as a
gateway to external networks, and provides NAT and firewall functions.
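The division of labour between the I-CSCF, the HSS and the S-CSCF during registration can be sketched as follows. This is a simplified model: the P-CSCF, authentication, DNS lookups and the DIAMETER messages are left out, and all class names, method names and addresses are invented for illustration.

```python
class HSS:
    """Holds which S-CSCF is assigned to each public identity."""

    def __init__(self, assigned_scscf):
        self._assigned = dict(assigned_scscf)

    def scscf_for(self, identity):
        return self._assigned[identity]

class SCSCF:
    """Binds a public identity to the terminal's current contact address."""

    def __init__(self, name):
        self.name = name
        self.bindings = {}

    def register(self, identity, contact):
        self.bindings[identity] = contact

    def route(self, identity):
        return self.bindings.get(identity)

class ICSCF:
    """Entry point that asks the HSS where to send a request."""

    def __init__(self, hss, scscfs):
        self.hss = hss
        self.scscfs = {s.name: s for s in scscfs}

    def forward_register(self, identity, contact):
        # Ask the HSS which S-CSCF serves this user, then hand the request over.
        scscf = self.scscfs[self.hss.scscf_for(identity)]
        scscf.register(identity, contact)
        return scscf.name

s1, s2 = SCSCF("scscf1"), SCSCF("scscf2")
hss = HSS({"sip:alice@example.org": "scscf2"})
icscf = ICSCF(hss, [s1, s2])
print(icscf.forward_register("sip:alice@example.org", "192.0.2.10:5060"))
```

After registration only the assigned S-CSCF holds the binding, which is why later requests for that user have to be routed through it.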
Application servers (AS) host and execute services, and interface with the S-CSCF using
Session Initiation Protocol (SIP). An example of an application server that is being
developed in 3GPP is the Voice call continuity Function (VCC Server). Depending on the
actual service, the AS can operate in SIP proxy mode, SIP UA (user agent) mode or SIP
B2BUA (back-to-back user agent) mode. An AS can be located in the home network or
in an external third-party network. If located in the home network, it can query the HSS
with the DIAMETER Sh interface (for a SIP-AS) or the Mobile Application Part (MAP)
interface (for IM-SSF).
• SIP AS: native IMS application server
• IM-SSF: an IP Multimedia Service Switching Function interfaces with
Customized Applications for Mobile networks Enhanced Logic (CAMEL)
Application Servers using Camel Application Part (CAP)
The MRF (Media Resource Function) provides media related functions such as media
manipulation (e.g. voice stream mixing) and playing of tones and announcements.
Each MRF is further divided into a Media Resource Function Controller (MRFC) and a
Media Resource Function Processor (MRFP).
• The MRFC is a signalling plane node that acts as a SIP User Agent to the S-CSCF,
and which controls the MRFP with an H.248 interface.
• The MRFP is a media plane node that implements all media-related functions.
A BGCF (Breakout Gateway Control Function) is a SIP server that includes routing
functionality based on telephone numbers. It is only used when calling from the IMS to a
phone in a circuit switched network, such as the Public Switched Telephone Network
(PSTN) or the Public land mobile network (PLMN).
A PSTN/CS gateway interfaces with PSTN circuit switched (CS) networks. For
signalling, CS networks use ISDN User Part (ISUP) (or BICC) over Message Transfer
Part (MTP), while IMS uses Session Initiation Protocol (SIP) over IP. For media, CS
networks use Pulse-code modulation (PCM), while IMS uses Real-time Transport
Protocol (RTP).
• A Signalling Gateway (SGW) interfaces with the signalling plane of the CS network. It
transforms lower-layer protocols such as Stream Control Transmission Protocol (SCTP,
an Internet Protocol (IP) protocol) into Message Transfer Part (MTP, a
Signalling System 7 (SS7) protocol), to pass ISDN User Part (ISUP) from the
MGCF to the CS network.
• A Media Gateway Controller Function (MGCF) does call control protocol
conversion between SIP and ISUP and interfaces with the SGW over SCTP. It
also controls the resources in an MGW with an H.248 interface.
• A Media Gateway (MGW) interfaces with the media plane of the CS network, by
converting between RTP and PCM. It can also transcode when the codecs don't
match (e.g. IMS might use AMR, PSTN might use G.711).
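Whether the media gateway has to transcode comes down to whether the two sides share a codec. The sketch below shows that decision only. The function name and the preference-ordered lists are assumptions for illustration, and the real negotiation (for example through SDP offer/answer) is not modelled.

```python
def pick_codec(ims_codecs, cs_codecs):
    """Return (ims_codec, cs_codec, transcode) for a call across the gateway.

    Both lists are in preference order. A shared codec means no transcoding.
    """
    if not ims_codecs or not cs_codecs:
        raise ValueError("each side must offer at least one codec")
    for codec in ims_codecs:
        if codec in cs_codecs:
            return codec, codec, False
    # No codec in common: use each side's first choice and transcode between them.
    return ims_codecs[0], cs_codecs[0], True

print(pick_codec(["AMR"], ["G.711"]))
print(pick_codec(["AMR", "G.711"], ["G.711"]))
```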
Media Resources are those components that operate on the media plane and are under the
control of IMS Core functions, specifically the Media Server (MS) and the Media Gateway
(MGW).
Offline charging is applied to users who pay for their services periodically (e.g., at the
end of the month). Online charging, also known as credit-based charging, is used for
prepaid services, or real-time credit control of postpaid services. Both may be applied to
the same session.
• Offline charging: All the SIP network entities (P-CSCF, I-CSCF, S-CSCF,
BGCF, MRFC, MGCF, AS) involved in the session use the DIAMETER Rf
interface to send accounting information to a CCF (Charging Collector Function)
located in the same domain. The CCF will collect all this information, and build a
CDR (Call Detail Record), which is sent to the billing system (BS) of the domain.
Each session carries an ICID (IMS Charging Identifier) as a unique identifier. IOI
(Inter Operator Identifier) parameters define the originating and terminating
networks.
Each domain has its own charging network. Billing systems in different domains
will also exchange information, so that roaming charges can be applied.
• Online charging: The S-CSCF talks to an SCF (Session Charging Function)
which looks like a regular SIP application server. The SCF can signal the S-CSCF
to terminate the session when the user runs out of credits during a session. The AS
and MRFC use the DIAMETER Ro interface towards an ECF (Event Charging
Function).
o When IEC (Immediate Event Charging) is used, a number of credit units
are immediately deducted from the user's account by the ECF and the
MRFC or AS is then authorized to provide the service. The service is not
authorized when not enough credit units are available.
o When ECUR (Event Charging with Unit Reservation) is used, the ECF
first reserves a number of credit units in the user's account and then
authorizes the MRFC or the AS. After the service is over, the number of
spent credit units is reported and deducted from the account; the reserved
credit units are then cleared.
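The difference between the two online charging modes can be sketched with a simple credit account. This is an illustration of the bookkeeping only, with no DIAMETER Ro messages. The class `CreditAccount` and its method names are assumptions made for the example.

```python
class CreditAccount:
    """Prepaid balance with optional unit reservation."""

    def __init__(self, balance):
        self.balance = balance
        self.reserved = 0

    def available(self):
        return self.balance - self.reserved

    def immediate_charge(self, units):
        """IEC: deduct at once; refuse when credit is insufficient."""
        if units > self.available():
            return False
        self.balance -= units
        return True

    def reserve(self, units):
        """ECUR, first step: set units aside before the service starts."""
        if units > self.available():
            return False
        self.reserved += units
        return True

    def settle(self, reserved_units, used_units):
        """ECUR, second step: deduct what was used and clear the reservation."""
        if used_units > reserved_units or reserved_units > self.reserved:
            raise ValueError("settlement does not match the reservation")
        self.reserved -= reserved_units
        self.balance -= used_units
```

With a balance of 10 units, an immediate charge of 4 leaves 6. Reserving 5 and then reporting 3 as spent leaves a balance of 3 with nothing reserved.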
Interface | IMS entities | Description | Protocol
Cr | MRFC, AS | Used by MRFC to fetch documents (scripts and other resources) from an AS | dedicated
Cx | I-CSCF, S-CSCF, HSS | Used to communicate between I-CSCF/S-CSCF and HSS | DIAMETER
Dh | SIP AS, OSA, SCF, IM-SSF | Used by AS to find a correct HSS in a multi-HSS environment | DIAMETER
Dx | I-CSCF, S-CSCF, SLF | Used by I-CSCF/S-CSCF to find a correct HSS in a multi-HSS environment | DIAMETER
Gm | UE, P-CSCF | Used to exchange messages between UE and P-CSCF | SIP
Go | PDF, GGSN | Allows operators to control QoS in a user plane and exchange charging correlation information between IMS and GPRS network | DIAMETER
Gq | P-CSCF, PDF | Used to exchange policy decision-related information between P-CSCF and PDF | DIAMETER
ISC | S-CSCF, I-CSCF, AS | Used to exchange messages between CSCF and AS | SIP
Ma | I-CSCF -> AS | Used to directly forward SIP requests which are destined to a Public Service Identity hosted by the AS | SIP
Mg | MGCF -> I-CSCF | MGCF converts ISUP signalling to SIP signalling and forwards SIP signalling to I-CSCF | SIP
Mi | S-CSCF -> BGCF | Used to exchange messages between S-CSCF and BGCF | SIP
Mj | BGCF -> MGCF | Used to exchange messages between BGCF and MGCF in the same IMS network | SIP
Mk | BGCF -> BGCF | Used to exchange messages between BGCFs in different IMS networks | SIP
Mm | CSCF, external IP network | Used for exchanging messages between IMS and external IP networks | Not specified
Mn | MGCF, MGW | Allows control of user-plane resources | H.248
Mp | MRFC, MRFP | Used to exchange messages between MRFC and MRFP | H.248
Mr | S-CSCF, MRFC | Used to exchange messages between S-CSCF and MRFC | SIP
Mw | P-CSCF, I-CSCF, S-CSCF | Used to exchange messages between CSCFs | SIP
Sh | SIP AS, OSA SCS, HSS | Used to exchange information between SIP AS/OSA SCS and HSS | DIAMETER
Si | IM-SSF, HSS | Used to exchange information between IM-SSF and HSS | MAP
Sr | MRFC, AS | Used by MRFC to fetch documents (scripts and other resources) from an AS | HTTP
Ut | UE, AS (SIP AS, OSA SCS, IM-SSF) | Enables UE to manage information related to the user's services | HTTP(s)
Security aspects of early IMS systems
It is envisaged that security defined in TS 33.203 may not be available for a while
especially because of the lack of USIM/ISIM interfaces and prevalence of devices that
support IPv4. For this situation, to provide some protection against the most significant
threats, 3GPP defines some security mechanisms, which are informally known as "early
IMS security", in TR 33.978.
The 3GPP specifications can be downloaded from
http://www.3gpp.org/specs/numbering.htm (the list below is a small selection).
• TR 21.905 Vocabulary for 3GPP Specifications
• TS 22.066 Support of Mobile Number Portability (MNP); Stage 1
• TS 22.101 Service Aspects; Service Principles
• TS 22.141 Presence Service; Stage 1
• TS 22.228 Service requirements for the IP multimedia core network subsystem;
• TS 22.250 IMS Group Management; Stage 1
• TS 22.340 IMS Messaging; Stage 1
• TR 22.800 IMS Subscription and access scenarios
• TS 23.002 Network Architecture
• TS 23.003 Numbering, Addressing and Identification
• TS 23.008 Organization of Subscriber Data
• TS 23.107 Quality of Service (QoS) principles
• TS 23.125 Overall high level functionality and architecture impacts of flow based
charging; Stage 2
• TS 23.141 Presence Service; Architecture and functional description; Stage 2
• TS 23.167 IMS emergency sessions
• TS 23.207 End-to-end QoS concept and architecture
• TS 23.218 IMS session handling; IM call model; Stage 2
• TS 23.221 Architectural Requirements
• TS 23.228 IMS stage 2
• TS 23.234 WLAN interworking
• TS 23.271 Location Services (LCS); Functional description; Stage 2
• TS 23.278 Customized Applications for Mobile network Enhanced Logic
(CAMEL) - IMS interworking; Stage 2
• TR 23.864 Commonality and interoperability between IMS core networks
• TR 23.867 IMS emergency sessions
• TR 23.917 Dynamic policy control enhancements for end-to-end QoS, Feasibility
• TR 23.979 3GPP enablers for Push-to-Talk over Cellular (PoC) services; Stage 2
• TR 23.981 Interworking aspects and migration scenarios for IPv4-based IMS
implementations (early IMS)
• TS 24.141 Presence Service using the IMS Core Network subsystem; Stage 3
• TS 24.147 Conferencing using the IMS Core Network subsystem
• TS 24.228 Signalling flows for the IMS call control based on SIP and SDP; Stage
• TS 24.229 IMS call control protocol based on SIP and SDP; Stage 3
• TS 24.247 Messaging using the IMS Core Network subsystem; Stage 3
• TS 26.235 Packet switched conversational multimedia applications; Default
• TS 26.236 Packet switched conversational multimedia applications; Transport
• TS 29.162 Interworking between the IMS and IP networks
• TS 29.163 Interworking between the IMS and Circuit Switched (CS) networks
• TS 29.198 Open Service Architecture (OSA)
• TS 29.207 Policy control over Go interface
• TS 29.208 End-to-end QoS signalling flows
• TS 29.209 Policy control over Gq interface
• TS 29.228 IMS Cx and Dx interfaces : signalling flows and message contents
• TS 29.229 IMS Cx and Dx interfaces based on the Diameter protocol; Protocol
• TS 29.278 CAMEL Application Part (CAP) specification for IMS
• TS 29.328 IMS Sh interface : signalling flows and message content
• TS 29.329 IMS Sh interface based on the Diameter protocol; Protocol details
• TR 29.962 Signalling interworking between the 3GPP SIP profile and non-3GPP
• TS 31.103 Characteristics of the IMS Identity Module (ISIM) application
• TS 32.240 Telecommunication management; Charging management; Charging
architecture and Principles
• TS 32.260 Telecommunication management; Charging management; IMS
• TS 32.299 Telecommunication management; Charging management; Diameter
• TS 32.421 Telecommunication management; Subscriber and equipment trace:
Trace concepts and requirements
• TS 33.102 3G security; Security architecture
• TS 33.108 3G security; Handover interface for Lawful Interception (LI)
• TS 33.141 Presence service; security
• TS 33.203 3G security; Access security for IP-based services
• TS 33.210 3G security; Network Domain Security (NDS); IP network layer
• TR 33.978 Security aspects of early IP Multimedia Subsystem (IMS)
• RFC 2327 Session Description Protocol (SDP)
• RFC 2748 Common Open Policy Server protocol (COPS)
• RFC 2782 a DNS RR for specifying the location of services (SRV)
• RFC 2806 URLs for telephone calls (TEL)
• RFC 2915 the naming authority pointer DNS resource record (NAPTR)
• RFC 2916 E.164 number and DNS
• RFC 3087 Control of Service Context using SIP Request-URI
• RFC 3261 Session Initiation Protocol (SIP)
• RFC 3262 reliability of provisional responses (PRACK)
• RFC 3263 locating SIP servers
• RFC 3264 an offer/answer model with the Session Description Protocol
• RFC 3265 SIP-Specific Event Notification
• RFC 3310 HTTP Digest Authentication using Authentication and Key Agreement
• RFC 3311 update method
• RFC 3312 integration of resource management and SIP
• RFC 3319 DHCPv6 options for SIP servers
• RFC 3320 signalling compression (SigComp)
• RFC 3323 a privacy mechanism for SIP
• RFC 3324 short term requirements for network asserted identity
• RFC 3325 private extensions to SIP for asserted identity within trusted networks
• RFC 3326 the reason header field
• RFC 3327 extension header field for registering non-adjacent contacts (path
• RFC 3329 security mechanism agreement
• RFC 3420 Internet Media Type message/sipfrag
• RFC 3428 SIP Extension for Instant Messaging
• RFC 3455 private header extensions to SIP for 3GPP
• RFC 3485 SIP and SDP static dictionary for signaling compression
• RFC 3515 the SIP REFER method
• RFC 3550 Real-time Transport Protocol (RTP)
• RFC 3574 Transition Scenarios for 3GPP Networks
• RFC 3588 DIAMETER base protocol
• RFC 3589 DIAMETER command codes for 3GPP release 5 (informational)
• RFC 3608 extension header field for service route discovery during registration
• RFC 3665 SIP Basic Call Flow Examples
• RFC 3680 SIP event package for registrations
• RFC 3725 best current practices for Third Party Call Control (3pcc) in SIP
• RFC 3824 using E164 numbers with SIP
• RFC 3840 indicating user Agent Capabilities in SIP
• RFC 3841 caller preferences for SIP
• RFC 3842 SIP event package for message waiting indication and summary
• RFC 3856 SIP event package for presence
• RFC 3857 SIP event template-package for watcher info
• RFC 3858 XML based format for watcher information
• RFC 3891 the SIP Replaces Header
• RFC 3903 SIP Extension for Event State Publication
• RFC 3911 the SIP Join Header
• RFC 4028 session timers in SIP
• RFC 4235 an INVITE-Initiated dialog event package for SIP
• RFC 4475 Session Initiation Protocol (SIP) Torture Test Messages
An ITSP (Internet Telephony Service Provider) offers an Internet data service for making
telephone calls using VoIP (Voice over IP) technology. Most ITSPs use SIP, H.323, or
IAX (although H.323 use is declining) for transmitting telephone calls as IP data packets.
Customers may use traditional telephones with an analog telephony adapter (ATA)
that provides an RJ11-to-Ethernet connection.
In the United States, net2Phone began offering consumer VoIP service in 1995.
Before 2003, many VoIP services required customers to make and receive phone calls
through a personal computer on a LAN.
ITSPs are also known as VSPs (Voice Service Providers) or simply VoIP providers.
Lawful interception (also known as wiretapping) is the interception of telecommunications by law
enforcement authorities (LEAs) and intelligence services, in accordance with local law
and after following due process and receiving proper authorization from competent
authorities.
With the existing Public Switched Telephone Network (PSTN), wireless, and cable
systems, Lawful Interception (LI) is generally performed by accessing the digital
switches supporting the target's calls in response to a warrant from a Law Enforcement
Agency (LEA). However, mobile phone and Voice over IP (VoIP) technologies have
enabled the mobility of the end-user, which has introduced new challenges.
Whilst the detailed requirements for LI differ from one jurisdiction to another, the general
requirements are the same. The LI system must provide transparent interception of
specified traffic only, and the subject must not be aware of the interception. The service
provided to other users must not be affected during interception.
• 1 Technical description
• 2 Laws
o 2.1 United States of America
o 2.2 Europe
o 2.3 Elsewhere
• 3 Illegal Use
• 4 References
• 5 See also
• 6 External links
Almost all countries have LI requirements and have adopted global LI requirements and
standards developed by the European Telecommunications Standards Institute (ETSI).
In the USA, the requirements are governed by the Communications
Assistance for Law Enforcement Act (CALEA). For an overview of laws and standards,
see the Global LI Industry Forum site.
In order to prevent investigations being compromised, LI systems may be designed in a
manner that hides the interception from the telecommunications operator concerned. This
is a requirement in some jurisdictions.
To ensure systematic procedures for carrying out interception, while also lowering the
costs of interception solutions, industry groups and government agencies worldwide have
attempted to standardize the technical processes behind lawful interception. One
organization, ETSI, has been a major driver in lawful interception standards not only for
Europe, but worldwide. The following figure provides a generalized view of the lawful
interception architecture as proposed by ETSI:
This architecture attempts to define a systematic and extensible means by which network
operators and law enforcement agencies (LEAs) can interact, especially as networks grow
in sophistication and scope of services. Note that this architecture applies not only to
“traditional” wireline and wireless voice calls, but also to IP-based services such as Voice
over IP, email, instant messaging, etc. The architecture is now applied worldwide (in
some cases with slight variations in terminology), including in the United States in the
context of CALEA conformance. Three stages are called for in the architecture: 1)
collection, where target-related “call” data and content are extracted from the network; 2)
mediation, where the data is formatted to conform to specific standards; and 3) delivery of
the data and content to the law enforcement agency (LEA).
The call data (known as Intercept Related Information or IRI in Europe and Call Data or
CD in the US) consists of information about the targeted communications, including
destination of a voice call (e.g., called party’s telephone number), source of a call
(caller’s phone number), time of the call, duration, etc. Call content is the stream
of data carrying the call. Included in the architecture is the lawful interception
management function, which covers interception session set-up and tear down,
scheduling, target identification, etc. Communications between the network operator and
LEA are via the Handover Interfaces (designated HI). Communications data and content
are typically delivered from the network operator to the LEA in an encrypted format over
an IP-based VPN. The interception of traditional voice calls still often relies on the
establishment of an ISDN channel that is set up at the time of the interception.
As stated above, the ETSI architecture is equally applicable to IP-based services where
IRI (or CD) is dependent on parameters associated with the traffic from a given
application to be intercepted. For example, in the case of email IRI would be similar to
the header information on an email message (e.g., destination email address, source email
address, time email was transmitted) as well as pertinent header information within the IP
packets conveying the message (e.g., source IP address of email server originating the
email message). Of course, more in-depth information would be obtained by the
interception system so as to avoid the usual email address spoofing that often takes place
(e.g., spoofing of source address). Voice-over-IP likewise has its own IRI, including data
derived from Session Initiation Protocol (SIP) messages that are used to set up and tear
down a VoIP call.
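As a rough illustration of the kind of IRI that can be derived from SIP signaling, the
sketch below pulls a few call-identifying header fields out of a SIP INVITE. The sample
message, the `extract_iri` helper and the choice of fields are assumptions made for
illustration only; a real mediation system formats IRI according to the applicable
handover standard.

```python
def extract_iri(sip_message: str) -> dict:
    """Collect a few call-identifying fields from a raw SIP request (illustrative only)."""
    lines = sip_message.strip().splitlines()
    # Request line: METHOD SP Request-URI SP SIP-Version
    method, request_uri, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "request_uri": request_uri,
        "from": headers.get("from"),
        "to": headers.get("to"),
        "call_id": headers.get("call-id"),
    }


# Hypothetical INVITE used only to exercise the helper.
SAMPLE_INVITE = (
    "INVITE sip:bob@example.net SIP/2.0\r\n"
    "From: <sip:alice@example.com>;tag=1928\r\n"
    "To: <sip:bob@example.net>\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "CSeq: 1 INVITE\r\n"
    "\r\n"
)

iri = extract_iri(SAMPLE_INVITE)
print(iri["from"], "->", iri["to"], "call", iri["call_id"])
```

The time and duration of the call are not carried in these headers; an interception
system would have to record them separately as the signaling messages are observed.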
USA interception standards that help network operators and service providers conform to
CALEA are mainly those specified by CableLabs, the Alliance for
Telecommunications Industry Solutions (ATIS), and the TIA. TIA's standards include
J-STD-025B, which updates the earlier J-STD-025A to include packetized voice and
CDMA wireless interception, although it has recently been challenged as "deficient" by
the U.S. Dept of Justice. Generic global standards have also been developed by the
Internet Engineering Task Force (IETF) that provide a front-end
means of supporting most LI handover standards. Although the terms are different, the
concepts behind the interception architecture resemble those formulated under ETSI.
More recent standards address packetized voice and data (e.g., ETSI TS102232 et seq,
ATIS T1.678, T1.IAS) and interception for PacketCable. Interception standardization
efforts for wireless networks are primarily overseen by the Third Generation Partnership
Project (3GPP).
Various countries have different rules with regard to lawful interception. In the United
Kingdom the law is known as RIPA (Regulation of Investigatory Powers Act), in the United
States there is an array of federal and state criminal law, and in Commonwealth of
Independent States countries it is known as SORM. A subset of LI law deals with the ability of
communication providers to support interception handovers.
United States of America
In the United States, two laws cover most of the governance of lawful interception. The
1968 Omnibus Crime Control and Safe Streets Act, Title III pertains mainly to lawful
interception criminal investigations. The second law, the 1978 Foreign Intelligence
Surveillance Act, or FISA, governs wiretapping for intelligence purposes where the
subject of the investigation must be a foreign (non-US) national or a person working as
an agent on behalf of a foreign country. Most of the congressionally mandated wiretap
records indicate that the cases are related to illegal drug distribution, with cell phones as
the dominant form of intercepted communication.
During the 1990s, to help law enforcement and the FBI more effectively carry out
wiretap operations, especially in view of the emerging digital voice and wireless
networks at the time, the US Congress passed CALEA in 1994. This act provides broad
guidelines to network operators on how to assist the LEAs in setting up interceptions and
the types of data to be delivered. CALEA does not, as many believe, provide specific
implementation directives on interception. More recently, the US Federal
Communications Commission (FCC) mandated that CALEA be extended to include
interception of publicly-available broadband networks and Voice over IP services that are
interconnected to the Public Switched Telephone Network (PSTN).
As a response to the terrorist events of 9/11, the US Congress incorporated various
provisions related to enhanced electronic surveillance in the “Uniting and Strengthening
America by Providing Appropriate Tools Required to Intercept and Obstruct Terrorism”
Act (USA Patriot Act). These wiretap provisions are mainly updates to those expressed
under the FISA law.
In the European Union, the European Council Resolution of 17 January 1995 on the
Lawful Interception of Telecommunications (Official Journal C 329) mandated similar
measures to CALEA on a pan-European basis. Although some EU member countries
reluctantly accepted this resolution out of privacy concerns (which are more pronounced
in Europe than the US), there appears now to be general agreement with the resolution.
Interestingly enough, interception mandates in Europe are generally more rigorous than
those of the US; for example, both voice and ISP public network operators in the
Netherlands have been required to support interception capabilities for years.
Most countries worldwide maintain LI requirements similar to those in the US and
Europe, and have moved to the ETSI handover standards. The, for example, collaboration
through the numerous ISS World forums.
As with many law enforcement tools, LI systems may be subverted for illicit purposes.
This occurred in Greece during the 2004 Olympics; the telephone operator concerned was
fined US$1,000,000 in 2006 for failing to secure its systems against hacking.
1. http://www.askcalea.com
2. http://europa.eu.int/eur-lex/lex/LexUriServ/LexUriServ.do?
3. http://news.bbc.co.uk/1/hi/business/6182647.stm
• Handover Interface for the Lawful Interception of Telecommunications Traffic,
ETSI ES-201-671, under Lawful Interception, Telecommunications Security,
version 3.1.1, May 2007.
• Handover Specification for IP delivery, ETSI TS-102-232-1, under Lawful
Interception, Telecommunications Security, version 2.1.1, December 2006.
• Lawfully Authorized Electronic Surveillance, T1P1/T1S1 joint standard,
document number J-STD-025B, December 2003.
• 3rd Generation Partnership Project, Technical Specification 3GPP TS 33.106
V5.1.0 (2002-09), “Lawful Interception Requirements (Release 5),” September
• 3rd Generation Partnership Project, Technical Specification 3GPP TS 33.107
V6.0.0 (2003-09), “Lawful interception architecture and functions (Release 6),”
• 3rd Generation Partnership Project, Technical Specification 3GPP TS 33.108
V6.3.0 (2003-09), “Handover interface for Lawful Interception (Release 6),”
• PacketCable Electronic Surveillance Specification, PKT-SP-ESP-I03-040113,
Cable Television Laboratories Inc., 13 January 2004.
• T1.678, Lawfully Authorized Electronic Surveillance (LAES) for Voice over
Packet Technologies in Wireline Telecommunications Networks.
In computing, Media Gateway Control Protocol (MGCP) is a protocol used within a
distributed Voice over IP system.
MGCP is defined in RFC 3435, which obsoletes an earlier definition in RFC 2705. It
superseded the Simple Gateway Control Protocol (SGCP).
Another protocol for the same purpose is Megaco, a co-production of IETF (RFC 3525)
and ITU (Recommendation H.248-1). Both protocols follow the guidelines of the API
Media Gateway Control Protocol Architecture and Requirements at RFC 2805.
The distributed system is composed of a Call Agent (or Media Gateway Controller), at
least one Media Gateway (MG) that performs the conversion of media signals between
circuits and packets, and at least one Signaling Gateway (SG) when connected to the
PSTN.
The Call Agent uses MGCP to tell the Media Gateway:
• what events should be reported to the Call Agent
• how endpoints should be connected together
• what signals should be played on endpoints.
MGCP also allows the Call Agent to audit the current state of endpoints on a Media
Gateway.
The Media Gateway uses MGCP to report events (such as off-hook, or dialed digits) to
the Call Agent.
(While any Signaling Gateway is usually on the same physical switch as a Media
Gateway, this needn't be so. The Call Agent does not use MGCP to control the Signaling
Gateway; rather, SIGTRAN protocols are used to backhaul signaling between the
Signaling Gateway and Call Agent).
In MGCP, every command has a transaction ID and receives a response.
Typically, a Media Gateway is configured with a list of Call Agents from which it may
accept programming (where that list normally comprises only one or two Call Agents). In
principle, event notifications may be sent to different Call Agents for each endpoint on
the gateway (as programmed by the Call Agents, by setting the NotifiedEntity
parameter). In practice however, it is usually desirable that at any given moment all
endpoints on a gateway should be controlled by the same Call Agent; other Call Agents
are available only to provide redundancy in the event that the primary Call Agent fails, or
loses contact with the Media Gateway. In the event of such a failure it is the backup Call
Agent's responsibility to reprogram the MG so that the gateway comes under the control
of the backup Call Agent. Care is needed in such cases; two Call Agents may know that
they have lost contact with one another, but this does not guarantee that they are not both
attempting to control the same gateway. The ability to audit the gateway to determine
which Call Agent is currently controlling can be used to resolve such conflicts.
MGCP assumes that the multiple Call Agents will maintain knowledge of device state
among themselves (presumably with an unspecified protocol) or rebuild it if necessary (in
the face of catastrophic failure). Its failover features take into account both planned and
unplanned outages.
MGCP packets are unlike what you find in many other protocols. Usually carried in UDP
datagrams on port 2427, MGCP messages are formatted as whitespace-delimited text, much like you
would expect to find in TCP protocols. An MGCP packet is either a command or a
response.
Commands begin with a four-letter verb. Responses begin with a three-digit response
code.
There are eight (8) command verbs:
AUEP, AUCX, CRCX, DLCX, MDCX, NTFY, RQNT, RSIP
Two verbs are used by a Call Agent to query (the state of) a Media Gateway:
AUEP - Audit Endpoint
AUCX - Audit Connection
Three verbs are used by a Call Agent to manage an RTP connection on a Media Gateway
(a Media Gateway can also send a DLCX when it needs to delete a connection for its
own purposes):
CRCX - Create Connection
DLCX - Delete Connection
MDCX - Modify Connection
One verb is used by a Call Agent to request notification of events on the Media Gateway,
and to request a Media Gateway to apply signals:
RQNT - Request for Notification
One verb is used by a Media Gateway to indicate to the Call Agent that it has detected an
event for which the Call Agent had previously requested notification (via the RQNT
command):
NTFY - Notify
One verb is used by a Media Gateway to indicate to the Call Agent that it is in the
process of restarting:
RSIP - Restart In Progress
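Because commands start with a verb and responses start with a numeric code, the first
line of an MGCP message can be classified mechanically. The sketch below is a minimal
illustration, assuming the RFC 3435 layout of a command line (verb, transaction ID,
endpoint name, protocol version) and of a response line (code, transaction ID, optional
commentary). The function name and the sample endpoint `aaln/1@gw.example.net` are
hypothetical.

```python
MGCP_VERBS = {"AUEP", "AUCX", "CRCX", "DLCX", "MDCX", "NTFY", "RQNT", "RSIP"}


def parse_mgcp_first_line(line: str) -> dict:
    """Classify the first line of an MGCP message as a command or a response."""
    parts = line.strip().split()
    if not parts:
        raise ValueError("empty MGCP line")
    head = parts[0].upper()
    if head in MGCP_VERBS:
        # Command line: verb, transaction ID, endpoint name, protocol version
        if len(parts) < 4:
            raise ValueError("truncated MGCP command line")
        return {
            "kind": "command",
            "verb": head,
            "transaction_id": parts[1],
            "endpoint": parts[2],
            "version": " ".join(parts[3:]),
        }
    if len(head) == 3 and head.isdigit():
        # Response line: three-digit code, transaction ID, optional commentary
        if len(parts) < 2:
            raise ValueError("truncated MGCP response line")
        return {
            "kind": "response",
            "code": int(head),
            "transaction_id": parts[1],
            "comment": " ".join(parts[2:]),
        }
    raise ValueError("not an MGCP command or response: %r" % line)


command = parse_mgcp_first_line("CRCX 1204 aaln/1@gw.example.net MGCP 1.0")
response = parse_mgcp_first_line("200 1204 OK")
print(command["verb"], command["endpoint"], "->", response["code"])
```

Matching a response to its command is then a matter of comparing the two transaction IDs.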
• Vovida MGCP
• RFC 3435 - Media Gateway Control Protocol (MGCP) Version 1.0 (this
supersedes RFC 2705)
• RFC 3660 - Basic Media Gateway Control Protocol (MGCP) Packages
• RFC 3661 - Media Gateway Control Protocol (MGCP) Return Code Usage
• RFC 3064 - MGCP CAS Packages
• RFC 3149 - MGCP Business Phone Packages
• RFC 3991 - Media Gateway Control Protocol (MGCP) Redirect and Reset
Package
• RFC 3992 - Media Gateway Control Protocol (MGCP) Lockstep State Reporting
Mechanism
• RFC 2805 - Media Gateway Control Protocol Architecture and Requirements
MGCP Information Site
MGCP was originally developed to address the need of scaling ingress and egress
gateways in order to meet the demands of service providers. MGCP utilizes SDP for
negotiating the media streams transmitted and received on the packet network, which
significantly reduces the interworking complexity between SIP-based media gateway
controllers (or "call agents") and media gateways. MGCP was published by the IETF as
Informational RFCs (shown on this page) and also standardized by the ITU-T and
adopted for use within cable networks. MGCP is widely deployed around the world.
Core Documents (IETF)
RFC 2705 Media Gateway Control Protocol 1.0 (obsolete)
RFC 3435 Media Gateway Control Protocol 1.0
RFC 3660 Basic Media Gateway Control Protocol Packages
RFC 3661 MGCP Return Code Usage
RFC 2897 Proposal for an MGCP Advanced Audio Package
RFC 3064 MGCP CAS Packages
RFC 3149 MGCP Business Phone Packages
RFC 3441 Asynchronous Transfer Mode (ATM) Package
RFC 3624 Bulk Audit Package
RFC 3991 Redirect and Reset Package
RFC 3992 Lockstep State Reporting Mechanism
Please also refer to these IANA pages:
MGCP Package Registry
MGCP LocalConnectionOptions Sub-registry
1.0 Specs Network-Based Call Signaling Protocol Specification (NCS)
PSTN Gateway Call Signaling Protocol Specification (TGCP)
NCS Signaling MIB Specification
NCS Basic Packages
... Complete List of PacketCable 1.0 Specs ...
1.5 Specs Network-Based Call Signaling Protocol Specification (NCS)
PSTN Gateway Call Signaling Protocol Specification (TGCP)
Audio Server Package
NCS Signaling MIB Specification
... Complete List of PacketCable 1.5 Specs ...
PacketCable Specifications Home Page
Network call signalling protocol for the delivery of time-critical services
over cable television networks using cable modems
J.169 IPCablecom network call signalling (NCS) MIB requirements
J.171 IPcablecom trunking gateway control protocol (TGCP)
J.175 Audio server protocol
SCTE 24-3 Network Call Signaling Protocol for the Delivery of Time-Critical
Services over Cable Television Using Data Modems
SCTE 24-8 IPCablecom Part 8: Network Call Signaling Management Information
Base (MIB) Requirements
IPCablecom Part 12: Trunking Gateway Control Protocol (TGCP)
Packet Loss Concealment (PLC) is a technique to mask the effects of packet loss in
VoIP communications. Because the voice signal is sent as packets on a VoIP network,
the packets may travel different routes to reach their destination. At the receiver a packet might arrive
very late, corrupted or simply might not arrive. One of the situations in which the latter
could happen is where a packet is rejected by a server which has a full buffer and cannot
accept any more data. In a VoIP connection, error control techniques such as ARQ are
not feasible, because a retransmitted packet would arrive too late to be played out, so
the receiver should be able to cope with packet loss. Some PLC
techniques include:
• zero insertion: the lost speech frames are replaced with zeros
• waveform substitution: the missing gap is reconstructed by repeating a portion of
already received speech. The simplest form of this would be to repeat the last
received frame. Other techniques account for fundamental frequency, gap
duration, etc. Waveform substitution methods are popular because they are
simple to understand and implement. An example of such an algorithm is
proposed in ITU recommendation G.711 Appendix I.
• model-based methods: an increasing number of algorithms that take advantage of
speech models for interpolating and extrapolating speech gaps are being
introduced and developed.
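The two simplest techniques in the list above can be sketched in a few lines. This is a
minimal illustration under stated assumptions: frames are lists of integer samples, a
lost frame is marked as `None`, and the `conceal` function is hypothetical. It is plain
last-frame repetition, not the more elaborate algorithm of G.711 Appendix I.

```python
def conceal(frames, frame_len, method="repeat"):
    """Fill lost frames (None) by zero insertion or by repeating the last good frame."""
    silence = [0] * frame_len
    last_good = silence  # nothing received yet, so fall back to silence
    out = []
    for frame in frames:
        if frame is not None:
            last_good = frame
            out.append(list(frame))
        elif method == "zero":
            out.append(list(silence))
        elif method == "repeat":
            out.append(list(last_good))
        else:
            raise ValueError("unknown method: %r" % method)
    return out


# The second frame was lost in transit.
received = [[3, 5, 2, -1], None, [4, 4, 0, -2]]
zero_filled = conceal(received, 4, "zero")
repeated = conceal(received, 4, "repeat")
print(zero_filled)
print(repeated)
```

Repetition tends to sound less abrupt than silence for a single lost frame, which is one
reason waveform substitution is commonly preferred over zero insertion.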
PacketCable is a project started by CableLabs. The purpose of the organization is to
define standards for the Cable TV industry.
CableLabs leads this initiative for interoperable interface specifications in order to deliver
real-time multimedia services over two-way cable networks. Built on top of the
industry’s DOCSIS 1.1 (Data Over Cable Service Interface Specifications) cable modem
infrastructure, PacketCable networks use Internet Protocol (IP) to enable a wide range of
multimedia services, such as IP telephony, multimedia conferencing, interactive gaming,
and general multimedia applications. A DOCSIS 1.1 network with PacketCable
extensions enables cable operators to deliver data and voice traffic efficiently using a
single high-speed, quality-of-service (QoS)-enabled broadband (cable) architecture.
The PacketCable effort dates back to 1997 when cable operators identified the need for a
real-time multimedia architecture to support the delivery of advanced multimedia
services over the DOCSIS 1.1 architecture.
PacketCable interconnects 3 networks
• Hybrid Fibre Coaxial (HFC) Access Network
• Public Switched Telephone Network (PSTN)
• TCP/IP Managed IP Networks
PacketCable Protocols
• DOCSIS (Data Over Cable Service Interface Specification) - the standard for data
over cable, which mostly details the RF band
• Real-time Transport Protocol (RTP) & Real Time Control Protocol (RTCP)
required for media transfer
• PSTN Gateway Call Signaling Protocol Specification (TGCP) which is an MGCP
extension for Media Gateways
• Network-Based Call Signaling Protocol Specification (NCS) which is an MGCP
extension for analog residential Media Gateways - the NCS specification, which
is derived from the IETF MGCP RFC 2705, details VoIP signalling.
o Basically the IETF version is a subset of the NCS version. The PacketCable
group has defined more messages and features than the IETF.
• Common Open Policy Service (COPS) for Quality of Service
PacketCable Voice Codecs per PacketCable Codec Specifications
o ITU G.711 (both µ-law and A-law versions) - for V1.0 & 1.5
o iLBC - for V1.5
o BV16 - for V1.5
o ITU G.728
o ITU G.729 Annex E
• PacketCable 1.0 comprises eleven specifications and six technical reports which
define the call signaling, Quality of Service (QoS), Codec, client provisioning,
billing event message collection, PSTN (Public Switched Telephone Network)
interconnection, and security interfaces necessary to implement a single-zone
PacketCable solution for residential Internet Protocol (IP) voice services.
• PacketCable 1.5 contains additional capabilities that do not exist in PacketCable
1.0, and superseded previous versions (1.1, 1.2, and 1.3).
• PacketCable 1.5 comprises 21 specifications and one technical report which
together define the call signaling, Quality of Service (QoS), Codec, client
provisioning, billing event message collection, PSTN (Public Switched Telephone
Network) interconnection, and security interfaces necessary to implement a
single-zone or multi-zone PacketCable solution for residential Internet Protocol
(IP) voice services.
• Version 2.0 of PacketCable will replace MGCP with SIP.
VoIP services based on PacketCable architecture are being widely deployed by operators:
• Videotron - "VoIP services" (Canada: Quebec)
• Time Warner - Digital Phone (System wide)
• Cablevision – Optimum Voice (System wide)
• Comcast - Comcast Digital Voice (System-wide)
• Cox – Cox Digital Telephone (System-wide)
• Charter (St. Louis, Wisconsin)
• Bright House Networks (Florida)
• Liberty Cablevision (Puerto Rico)
• GCI (Alaska)
• Shaw - "Shaw Digital Phone" (Canada: Calgary, Edmonton, Winnipeg and
• "BRAGATEL" / Bragatel (Braga, Portugal)
• "TVCABO" / PT Multimedia (Portugal)
• Rogers - Rogers Home Phone (Canada-wide; major cities and towns
serviceable with Rogers high-speed internet are eligible, still expanding, St John's,
NL to Vancouver, BC serviceable as of July 2007)
• Bresnan Communications - Bresnan Digital Phone (System wide)
• CableOne - "CableONE.net" (System wide)
• Casema - "Casema Telefonie" (The Netherlands)
• PacketCable™ 1.5 Specifications Audio/Video Codecs - PKT-SP-CODEC1.5-
• PacketCable™ 1.5 Specifications Network-Based Call Signaling Protocol - PKT-
SP-NCS1.5-I01-050128 (see external link for MGCP information)
• PSTN Gateway Call Signaling Protocol Specification - PKT-SP-TGCP1.5-
I01-050128 (see external link for MGCP information)
Purple minutes in internet communications refers to IP network traffic that has a value-
added component, e.g. voice, video etc.
The Real-time Transport Protocol (or RTP) defines a standardized packet format for
delivering audio and video over the Internet. It was developed by the Audio-Video
Transport Working Group of the IETF and first published in 1996 as RFC 1889 which
was made obsolete in 2003 by RFC 3550. RTP can also be used
in conjunction with the RSVP protocol, which can reserve network resources for multimedia applications.
RTP does not have a standard TCP or UDP port on which it communicates. The only
convention it follows is that RTP uses an even UDP port and the next
higher odd port is used for RTP Control Protocol (RTCP) communications. Although
no ports are assigned by the standard, RTP is generally configured to use ports 16384-32767.
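The even/odd pairing can be expressed directly. The helpers below are a small sketch;
the function names are hypothetical and the default range is simply the commonly
configured 16384-32767 range mentioned above, not a value mandated by any standard.

```python
def rtcp_port_for(rtp_port: int) -> int:
    """Return the RTCP port paired with an RTP port (the next higher odd port)."""
    if not 0 < rtp_port < 65535:
        raise ValueError("port out of range")
    if rtp_port % 2:
        raise ValueError("RTP port must be even")
    return rtp_port + 1


def port_pairs(low=16384, high=32767):
    """Yield (rtp, rtcp) port pairs that fit inside the inclusive range [low, high]."""
    start = low if low % 2 == 0 else low + 1
    for rtp in range(start, high, 2):
        yield rtp, rtp + 1


first_pair = next(port_pairs())
print(first_pair, rtcp_port_for(20000))
```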
RTP can carry any data with real-time characteristics, such as interactive audio and
video. Call setup and tear-down is usually performed by the SIP protocol. The fact that
RTP uses a dynamic port range makes it difficult for it to traverse firewalls. In order to
get around this problem, it is often necessary to set up a STUN server.
It was originally designed as a multicast protocol, but has since been applied in many
unicast applications. It is frequently used in streaming media systems (in conjunction
with RTSP) as well as videoconferencing and push to talk systems (in conjunction with
H.323 or SIP), making it the technical foundation of the Voice over IP industry. It goes
along with the RTCP and it's built on top of the User Datagram Protocol (UDP).
Applications using RTP are less sensitive to packet loss, but typically very sensitive to
delays, so UDP is a better choice than TCP for such applications.
According to RFC 1889, the services provided by RTP include:
• Payload-type identification - Indication of what kind of content is being carried
• Sequence numbering - PDU sequence number
• Time stamping - allow synchronization and jitter calculations
• Delivery monitoring
The protocols themselves do not provide mechanisms to ensure timely delivery. They
also do not give any Quality of Service (QoS) guarantees. These things have to be
provided by some other mechanism.
Also, out of order delivery is still possible, and flow and congestion control are not
supported directly. However, the protocols do deliver the necessary data to the
application to make sure it can put the received packets in the correct order. Also, RTCP
provides information about reception quality which the application can use to make local
adjustments. For example, if congestion is forming, the application could decide to
lower the data rate.
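One reception-quality figure that RTCP receiver reports carry is interarrival jitter. The
sketch below follows the running estimate described in RFC 3550, where each new
difference in transit time moves the estimate by one sixteenth of the error. It assumes
arrival times are already expressed in the same units as the RTP timestamps and ignores
timestamp wraparound; the function name and sample values are hypothetical.

```python
def interarrival_jitter(packets):
    """Estimate jitter from (rtp_timestamp, arrival_time) pairs in timestamp units."""
    jitter = 0.0
    prev_transit = None
    for rtp_ts, arrival in packets:
        transit = arrival - rtp_ts  # relative transit time of this packet
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0  # smoothed running estimate
        prev_transit = transit
    return jitter


steady = interarrival_jitter([(0, 100), (160, 260), (320, 420)])
one_late = interarrival_jitter([(0, 100), (160, 276)])
print(steady, one_late)
```

An application could compare successive jitter values and, for example, lower its data
rate or enlarge its playout buffer when the estimate keeps rising.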
RTP was also published by the ITU-T as H.225.0, but later removed once the IETF had a
stable standards-track RFC published. It exists as an Internet Standard (STD 64) defined
in RFC 3550 (which obsoletes RFC 1889). RFC 3551 (STD 65) (which obsoletes RFC
1890) defines a specific profile for Audio and Video Conferences with Minimal Control.
RFC 3711 defines the Secure Real-time Transport Protocol (SRTP) profile (actually an
extension to RTP Profile for Audio and Video Conferences) which can be used
(optionally) to provide confidentiality, message authentication, and replay protection for
audio and video streams being delivered.
The position of RTP in the protocol stack is somewhat strange. It was decided to put RTP
in user space and have it (normally) run over UDP. It operates as follows. The
multimedia application consists of multiple audio, video, text, and possibly other streams.
These are fed into the RTP library, which is in user space along with the application. This
library then multiplexes the streams and encodes them in RTP packets, which it then
stuffs into a socket. At the other end of the socket (in the operating system kernel), UDP
packets are generated and embedded in IP packets. If the computer is on an Ethernet, the
IP packets are then put in Ethernet frames for transmission. As a consequence of this
design, it is a little hard to say which layer RTP is in. Since it runs in user space and is
linked to the application program, it certainly looks like an application protocol. On the
other hand, it is a generic, application-independent protocol that just provides transport
facilities, so it also looks like a transport protocol. Probably the best description is that it
is a transport protocol that is implemented in the application layer.
Bit offset   Bits 0-1  2  3  4-7  8  9-15  16-31
0            Ver.      P  X  CC   M  PT    Sequence Number
32           Timestamp
64           SSRC identifier
96           ... CSRC identifiers ...
96+(CC×32)   Extension header (optional)
Ver. (2 bits) indicates the version of the protocol. Current version is 2. P (one bit) is used to indicate if there
are extra padding bytes at the end of the RTP packet. X (one bit) indicates if the extensions to the protocol
are being used in the packet. CC (four bits) contains the number of CSRC identifiers that follow the fixed
header. M (one bit) is used at the application level and is defined by a profile. If it's set, it means that the
current data has some special relevance for the application. PT (7 bits) indicates the format of the payload
and determines its interpretation by the application. SSRC indicates the synchronization source. The
optional extension header (see X) indicates the length of the extension (EHL = extension header length) in
32-bit units, excluding the 32 bits of the extension header itself.
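The field layout described above maps onto a short parser. This is a minimal sketch of
the fixed 12-byte header defined in RFC 3550 (including the 32-bit timestamp that
follows the sequence number) plus the CSRC list; it does not decode the optional
extension header or padding. The function name and the sample packet are hypothetical.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed RTP header and CSRC list from the start of a packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F  # number of CSRC identifiers that follow
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrc = list(struct.unpack("!%dI" % cc, packet[12:end]))
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": cc,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrc": csrc,
    }


# Version 2, no padding/extension/CSRC, payload type 0, sequence 1, timestamp 160.
sample = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x12345678) + b"payload"
header = parse_rtp_header(sample)
print(header["version"], header["sequence_number"], header["timestamp"], hex(header["ssrc"]))
```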
Potential further development of RTP & RTCP
The Real-time Transport Protocol (RTP) and the Real-time Transport Control Protocol
(RTCP) are commonly used together. RTP is used to transmit data (e.g. audio and video)
and RTCP is used to monitor QoS. The monitoring of quality of service is very important
for modern applications. In large scale applications (e.g. IPTV), there is an unacceptable
delay between RTCP reports, which can cause quality of service related problems.
For more information read about problems and potential further development of RTCP
The equations for RTCP protocol are explained in section I. and II.A in the Optimization
of Large-Scale RTCP Feedback Reporting in Fixed and Mobile Networks paper.
Structure of RTP/RTCP applications
RTP/RTCP protocols are commonly used to transport audio or audio/video data. Separate
sessions are used for each media content (e.g. audio and video). The main advantage of
this separation is to make it possible to receive only one part of the transmission,
commonly audio data, which lowers the total bandwidth.
• Real time control protocol
• Real Time Streaming Protocol (RTSP)
• Secure Real-time Transport Protocol
• Stream Control Transmission Protocol
• Henning Schulzrinne and Stephen Casner. RTP: A Transport Protocol for Real-
Time Applications. (1993) Internet Engineering Task Force, Internet Draft,
October 20, 1993. The memo originating RTP; only an early draft, does not
describe the current standard.
• Perkins, Colin (2003). RTP: Audio and Video for the Internet (1st ed.) Addison-
Wesley. ISBN 0-672-32249-8
STUN (Simple Traversal of UDP (User Datagram Protocol) through NATs
(Network Address Translators)) is a network protocol allowing a client behind a NAT
(or multiple NATs) to find out its public address, the type of NAT it is behind and the
internet-side port associated by the NAT with a particular local port. This information is
used to set up UDP communication between two hosts that are both behind NAT routers.
The protocol is defined in RFC 3489.
STUN is a client-server protocol. A VoIP phone or software package may include a
STUN client, which will send a request to a STUN server. The server then reports back to
the STUN client what the public IP address of the NAT router is, and what port was
opened by the NAT to allow incoming traffic back in to the network.
The response also allows the STUN client to determine what type of NAT is in use, as
different types of NATs handle incoming UDP packets differently. It will work with three
of four main types: Full Cone, Restricted Cone, and Port Restricted Cone. (In the case of
Restricted Cone or Port Restricted Cone NATs, the client must send out a packet to the
endpoint before the NAT will allow packets from the endpoint through to the client.)
STUN will not work with Symmetric NAT (also known as bi-directional NAT) which is
often found in the networks of large companies. With Symmetric NAT, the IP address of
the STUN server is different than that of the endpoint, and therefore the NAT mapping
the STUN server sees is different than the mapping that the endpoint would use to send
packets through to the client. For details on the different types of NAT, see network
address translation.
Once a client has discovered its external address, it can relay it to its peers. If the
NATs are full cone then either side can initiate communication. If they are restricted cone
or port restricted cone, both sides must start transmitting together.
Note that using the techniques described in the STUN RFC does not necessarily require
using the STUN protocol; they can be used in the design of any UDP protocol.
Protocols like SIP use UDP packets for the transfer of sound/video/text signaling traffic
over the Internet. Unfortunately as both endpoints are often behind NAT, a connection
cannot be set up in the traditional way. This is where STUN is useful.
The STUN server is contacted on UDP port 3478; however, the server will hint clients to
perform tests on an alternate IP address and port number too (STUN servers have two IP addresses).
The RFC states that this port and IP are arbitrary.
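The messages exchanged with the server are small binary packets. The sketch below
builds a Binding Request and decodes a MAPPED-ADDRESS attribute value, assuming the
RFC 3489 wire format (a 20-byte header made of message type, message length and a
128-bit transaction ID; later revisions of STUN changed this header). It performs no
network I/O, and the function names and sample address are hypothetical.

```python
import os
import socket
import struct

BINDING_REQUEST = 0x0001  # message type for a Binding Request
IPV4_FAMILY = 0x01        # address family value used in MAPPED-ADDRESS


def build_binding_request(transaction_id=None) -> bytes:
    """Build a Binding Request with no attributes (message length 0)."""
    tid = transaction_id if transaction_id is not None else os.urandom(16)
    if len(tid) != 16:
        raise ValueError("transaction ID must be 16 bytes")
    return struct.pack("!HH", BINDING_REQUEST, 0) + tid


def parse_mapped_address(attr_value: bytes):
    """Decode a MAPPED-ADDRESS value: 1 unused byte, family, port, IPv4 address."""
    if len(attr_value) < 8:
        raise ValueError("MAPPED-ADDRESS value too short")
    _unused, family, port = struct.unpack("!BBH", attr_value[:4])
    if family != IPV4_FAMILY:
        raise ValueError("unexpected address family")
    return socket.inet_ntoa(attr_value[4:8]), port


request = build_binding_request(b"\x00" * 16)
# Hand-built attribute value: the NAT mapped the client to 203.0.113.7:54321.
mapped = parse_mapped_address(b"\x00\x01" + struct.pack("!H", 54321) + bytes([203, 0, 113, 7]))
print(len(request), mapped)
```

A client would send the request to the server's UDP port 3478 and compare the returned
address and port with its own local address to learn whether a NAT is in the path.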
STUN uses the following algorithm (adapted from RFC 3489) to discover the presence of
NAT gateways and firewalls:
Where the path through the diagram ends in a red box, UDP communication is not
possible. Where the path ends in a yellow or green box, communication is possible.
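The decision flow can also be written as a small function. This is a sketch of the
discovery procedure as commonly summarized from RFC 3489: Test I is a plain Binding
Request, Test II asks the server to reply from a different IP address and port, and
Test III asks it to reply from a different port only. The function name, argument names
and result strings are assumptions for illustration, and the test results are passed in
rather than obtained from the network.

```python
def classify_nat(test1, local_addr, test2_ok, test1_alt=None, test3_ok=False):
    """Classify the NAT situation from the outcomes of the discovery tests.

    test1      -- (ip, port) mapped address returned by Test I, or None if no reply
    local_addr -- (ip, port) the client actually sent from
    test2_ok   -- True if a reply to Test II was received
    test1_alt  -- mapped address from repeating Test I against the alternate server address
    test3_ok   -- True if a reply to Test III was received
    """
    if test1 is None:
        return "UDP blocked"
    if test1 == local_addr:
        # No address translation between client and server
        return "Open Internet" if test2_ok else "Symmetric UDP firewall"
    if test2_ok:
        return "Full cone NAT"
    if test1_alt is None:
        raise ValueError("inconclusive: no reply from the alternate server address")
    if test1_alt != test1:
        return "Symmetric NAT"
    return "Restricted cone NAT" if test3_ok else "Port restricted cone NAT"


local = ("192.168.1.10", 5060)
public = ("203.0.113.7", 40000)
result = classify_nat(public, local, test2_ok=False, test1_alt=public, test3_ok=False)
print(result)
```

Of these outcomes, "UDP blocked" is the case in which UDP communication is not
possible, and, as noted earlier, STUN-based traversal does not work through a symmetric
NAT.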
A Session Border Controller is a device used in some VoIP networks to exert control
over the signaling and usually also the media streams involved in setting up, conducting,
and tearing down calls.
Within the context of VoIP, the word "Session" in Session Border Controller refers to a
call. Each call consists of one or more call signaling streams that control the call, and one
or more call media streams which carry the call's audio, video, or other data along with
information concerning how that data is flowing across the network. Together, these
streams make up a session, and it is the job of a Session Border Controller to exert
influence over the data streams that make up one or more sessions.
The word "Border" in Session Border Controller refers to a point of demarcation between
one part of a network and another. As a simple example, at the edge of a corporate
network, a firewall demarcates the local network (inside the corporation) from the rest of the
Internet (outside the corporation). A more complex example is that of a large corporation
where different departments have security needs for each location and perhaps for each
kind of data. In this case, filtering routers or other network elements are used to control
the flow of data streams. It is the job of a Session Border Controller to assist policy
administrators in managing the flow of session data across these borders.
The word "Controller" in Session Border Controller refers to the influence that Session
Border Controllers have on the data streams that comprise Sessions, as they traverse
borders between one part of a network and another. Additionally, Session Border
Controllers often provide measurement, access control, and data conversion facilities for
the calls they control.
Theory of operation
SBCs are inserted into the signaling and/or media paths between calling and called
parties in a VoIP call, predominantly those using the SIP, H.323, and MGCP call
signaling protocols.
In some cases, the SBC acts as if it were the called VoIP phone and places a second call
to the called party. In technical terms, when used within the SIP protocol, this is defined
as being a Back-to-Back User-Agent, or B2BUA. The effect of this behavior is that not
only the signaling traffic, but also the media traffic (voice, video etc) can be controlled by
the SBC. SBCs also make it possible to redirect media traffic to a completely different
element elsewhere in the network, perhaps for recording, generation of music-on-hold, or
other media-related purposes. Without an SBC, the media traffic travels directly between
the VoIP phones, without the in-network call signaling elements having control over their
path.
However, in other cases, the SBC simply modifies the stream of call control (signaling)
data involved in each call, perhaps limiting the kinds of call that can be conducted,
changing the codec choices, and so on. Ultimately, SBCs allow their owners to control
the kinds of calls that can be placed through the networks on which they reside, fix or
change protocols and protocol syntax to achieve interoperability, and also overcome
some of the problems that firewalls and NAT cause for VoIP calls.
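As an illustration of "changing the codec choices", the sketch below removes disallowed payload types from the audio line of an SDP body carried in SIP signaling. It is a simplified sketch, not how any particular SBC works: the function name is hypothetical, it assumes an audio-only SDP, and it does not handle the case where no payload type survives.

```python
def filter_codecs(sdp: str, allowed: set) -> str:
    """Return the SDP with disallowed RTP payload types removed.

    `allowed` holds payload type numbers as strings, e.g. {"0", "8"}.
    Assumes an audio-only session description.
    """
    out = []
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            # m=<media> <port> <proto> <payload type> ...
            media, port, proto, *payloads = line[2:].split()
            kept = [pt for pt in payloads if pt in allowed]
            line = "m=" + " ".join([media, port, proto] + kept)
        elif line.startswith(("a=rtpmap:", "a=fmtp:")):
            # Drop attribute lines that describe a removed payload type.
            payload_type = line.split(":", 1)[1].split()[0]
            if payload_type not in allowed:
                continue
        out.append(line)
    return "\r\n".join(out) + "\r\n"


offer = (
    "v=0\r\n"
    "m=audio 49170 RTP/AVP 0 3 97\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
    "a=rtpmap:3 GSM/8000\r\n"
    "a=rtpmap:97 iLBC/8000\r\n"
)
restricted = filter_codecs(offer, {"0"})  # keep only G.711 mu-law
```

Payload types 0 and 3 are the static RTP assignments for PCMU and GSM; 97 is a dynamic number chosen here for illustration.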
SBCs are often used by corporations along with firewalls to enable VoIP calls to and
from a protected enterprise network. VoIP service providers use SBCs to allow the use of
VoIP protocols from private networks with internet connections using NAT, and also to
implement strong security measures that are necessary to maintain a high quality of
service. SBCs also perform the function of application-level gateways.
Additionally, some SBCs can also allow VoIP calls to be set up between two phones
using different VoIP signaling protocols (SIP, H.323, Megaco/MGCP, etc.) as well as
performing transcoding of the media stream when different codecs are in use. Many
SBCs also provide firewall features for VoIP traffic (denial of service protection, call
filtering, bandwidth management, etc.).
In contrast to conventional phone systems, the OSI layers of a VoIP-based network need
not be operated by a single company. A VoIP user may purchase their internet access
from one internet service provider and their VoIP service from a second company.
From an IMS architecture perspective, the SBC is the integration of the P-CSCF and
C-BGF functions on the access side, and the I-BCF, IWF, and I-BGF functions on the
peering side. Some SBCs can be "decomposed", meaning the signaling functions can be
on a separate hardware platform from the media relay functions. In other words, the
P-CSCF can be physically separated from the C-BGF, or the I-BCF/IWF from the
I-BGF. A proprietary or standards-based protocol, such as the H.248
Ia profile, can be used by the signaling platform to control the media platform.
Controversy
The concept of SBC is controversial to proponents of end-to-end systems and peer-to-
peer networking in consideration of the following:
• SBCs can extend the length of the media path (the route media packets take through
the network) significantly. A long media path is undesirable, as it increases the
delay of voice packets (especially if the SBC implements transcoding) and the
probability of packet loss. Both effects deteriorate the voice/video quality.
However, sometimes there are obstacles to communication such as firewalls
between the call parties, and in these cases SBCs can be used to guide media
streams towards an acceptable path between caller and callee, whereas without the
SBC the call media would be blocked. Some SBCs can detect if the ends of the
call are in the same subnetwork and release control of the media, enabling it to
flow directly between the clients; this is anti-tromboning. Also, some SBCs can
create a media path where none would otherwise be allowed to exist (by virtue of
various firewalls and other security apparatus between the two endpoints). Lastly,
for specific VoIP network models where the service provider owns the network,
SBCs can actually decrease the media path by shortcut routing approaches.
• SBCs often restrict the flow of information between call endpoints, restricting
end-to-end transparency. VoIP phones may not be able to use new protocol
features unless they are understood by the SBC. However, some SBCs are more
able than others to cope with previously unseen and unanticipated protocol
features. End-to-End encryption can't be used if the SBC does not have the key,
although some portions of the information stream in an encrypted call are not
encrypted, and those portions can be used and influenced by the SBC. Some
SBCs are able to offload this encryption function from other elements in the
network by terminating SIP-TLS, IPSec, and/or SRTP. Furthermore, some SBCs
can actually make calls and other SIP scenarios work when they couldn't have
before, by performing specific protocol "normalization" or "fix-up".
• In some cases, far-end or hosted NAT traversal can be done without SBCs if the
VoIP phones support protocols like STUN, TURN, ICE, or Universal Plug and
Play (UPnP). To date STUN, TURN, ICE and others have not seen wide
deployment, and their complexity leaves much to be desired.
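The anti-tromboning check mentioned in the first point above (detecting that both ends of a call sit in the same subnetwork) can be sketched as follows. This is only an illustration under the assumption that the SBC is configured with a list of access subnets; the function name and the example addresses are hypothetical.

```python
import ipaddress


def can_release_media(caller_ip: str, callee_ip: str, access_subnets) -> bool:
    """Return True if both parties are inside the same configured subnet,
    in which case media could flow directly between them."""
    caller = ipaddress.ip_address(caller_ip)
    callee = ipaddress.ip_address(callee_ip)
    for net in map(ipaddress.ip_network, access_subnets):
        if caller in net and callee in net:
            return True
    return False


subnets = ["10.1.2.0/24", "10.9.0.0/24"]
same_site = can_release_media("10.1.2.3", "10.1.2.77", subnets)
different_sites = can_release_media("10.1.2.3", "10.9.0.1", subnets)
```

A real deployment would also have to account for NAT between the SBC and the endpoints, since the addresses seen in signaling may not be the addresses the endpoints can reach directly.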
Most of the controversy surrounding SBCs pertains to whether call control should remain
solely with the two endpoints in a call (in service to their owners), or should rather be
shared with other network elements owned by the organizations managing various
networks involved in connecting the two call endpoints. For example, should call control
remain with Alice and Bob (two callers), or should call control be shared with the
operators of all the IP networks involved in connecting Alice and Bob's VoIP phones
together? The debate on this point is vigorous, almost religious, in nature. Those who want
control in the endpoints only are greatly frustrated by the various realities of today's
networks, such as firewalls, filtering/throttling, and the lack of adoption of a universal
VoIP equivalent to the phone number. Those who want control in the middle, between the call
endpoints, are typically trying to replicate the old-style phone system, where virtually all
control rested with the service provider. So far, these views have not proven to be
reconcilable. Note that a third call control element such as an SBC may have
to be inserted between the two endpoints in order to satisfy local lawful interception
regulations.
Lawful Intercept and CALEA
An SBC may provide session media (normally RTP) and signalling (normally SIP)
wiretap services, which can be used by providers to enforce requests for the lawful
interception of network sessions. Standards for the interception of such services are
provided by CALEA and ETSI, among others.
History and market
The history of SBCs shows that several corporations were involved in creating and
popularizing the SBC market segment for carriers and enterprises. The "big six" of
carrier-oriented SBC companies are (or were, since several have been acquired or are
defunct): Acme Packet (NASDAQ: APKT), Kagoor Networks (acquired in 2005 by
Juniper Networks and later end-of-lifed), Jasomi Networks (acquired in 2005 by Ditech
Communications which is now known as Ditech Networks), Netrake (acquired in 2006
by Audiocodes), NexTone, and Aravox (acquired in 2003 by Alcatel and terminated).
According to Jonathan Rosenberg, the author of RFC 3261 (SIP) and numerous other
related RFCs, Dynamicsoft actually developed the first working SBC in conjunction with
Aravox, but the product never truly gained market share. Four companies also played a
major role in delivering enterprise-oriented SBCs: Jasomi Networks with its PeerPoint
product line, Edgewater, Borderware, and Ingate.
During the evolution of SBCs, many other companies undertook software development
programs to create SBCs. However, doing so turned out to be a far greater technical
challenge than most had anticipated, and there were few successes. An even larger group
of companies began to remarket their existing products as SBCs when it became clear
that the SBC market was "hot" with respect to acquisitions and IPOs.
Of these companies, Acme Packet is the market segment leader, and is the only company
of the group to have had a successful IPO. With the field narrowed by acquisition,
NexTone is generally considered to be in second place, although they traditionally target
a different market segment, having started life as a softswitch vendor.
The Session Initiation Protocol (SIP) is an application-layer control (signaling)
protocol for creating, modifying, and terminating sessions with one or more participants.
It can be used to create two-party, multiparty, or multicast sessions that include Internet
telephone calls, multimedia distribution, and multimedia conferences. (cit. RFC 3261).
SIP is designed to be independent of the underlying transport layer; it can run on TCP,
UDP, or SCTP. It was originally designed by Henning Schulzrinne (Columbia
University) and Mark Handley (UCL) starting in 1996. The latest version of the
specification is RFC 3261 from the IETF SIP Working Group. In November 2000, SIP
was accepted as a 3GPP signaling protocol and permanent element of the IMS
architecture. It is widely used as a signaling protocol for Voice over IP, along with H.323
and others.
SIP has the following characteristics:
• Transport-independent, because SIP can be used with UDP, TCP, ATM, and so on.
• Text-based, allowing for humans to read SIP messages.
Protocol design
SIP clients use TCP or UDP (typically on port 5060) to connect to SIP servers and other
SIP endpoints. SIP is primarily used in setting up and tearing down voice or video calls.
However, it can be used in any application where session initiation is a requirement.
These include Event Subscription and Notification, Terminal mobility and so on. There
are a large number of SIP-related RFCs that define behavior for such applications. All
voice/video communications are done over separate session protocols, typically RTP.
A motivating goal for SIP was to provide a signalling and call setup protocol for IP-based
communications that can support a superset of the call processing functions and features
present in the public switched telephone network (PSTN). SIP by itself does not define
these features; rather, its focus is call-setup and signalling. However, it has been designed
to enable the building of such features in network elements known as Proxy Servers and
User Agents. These are features that permit familiar telephone-like operations: dialing a
number, causing a phone to ring, hearing ringback tones or a busy signal. Implementation
and terminology are different in the SIP world but to the end-user, the behavior is similar.
SIP-enabled telephony networks can also implement many of the more advanced call
processing features present in Signalling System 7 (SS7), though the two protocols
themselves are very different. SS7 is a highly centralized protocol, characterized by a
highly complex central network architecture and dumb endpoints (traditional telephone
handsets). SIP is a peer-to-peer protocol. As such it requires only a very simple (and thus
highly scalable) core network with intelligence distributed to the network edge,
embedded in endpoints (terminating devices built in either hardware or software). SIP
features are implemented in the communicating endpoints (i.e. at the edge of the
network) as opposed to traditional SS7 features, which are implemented in the network.
Although many other VoIP signalling protocols exist, SIP is characterized by its
proponents as having roots in the IP community rather than the telecom industry. SIP has
been standardized and governed primarily by the IETF while the H.323 VoIP protocol
has been traditionally more associated with the ITU. However, the two organizations
have endorsed both protocols in some fashion.
SIP works in concert with several other protocols and is only involved in the signalling
portion of a communication session. SIP acts as a carrier for the Session Description
Protocol (SDP), which describes the media content of the session, e.g. what IP ports to
use, the codec being used etc. In typical use, SIP "sessions" are simply packet streams of
the Real-time Transport Protocol (RTP). RTP is the carrier for the actual voice or video
content.
The first proposed standard version (SIP 2.0) was defined in RFC 2543. The protocol was
further clarified in RFC 3261, although many implementations are still using interim
draft versions. Note that the version number remains 2.0.
SIP is similar to HTTP and shares some of its design principles: It is human readable and
request-response structured. SIP shares many HTTP status codes, such as the familiar
'404 not found'. SIP proponents also claim it to be simpler than H.323. However, some
would counter that while SIP originally had a goal of simplicity, in its current state it has
become as complex as H.323. Others would argue that SIP is a stateless protocol, hence
making it possible to easily implement failover and other features that are difficult in
stateful protocols such as H.323. SIP and H.323 are not limited to voice communication
but can mediate any kind of communication session from voice to video or future,
unrealized applications.
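To make the HTTP-like, human-readable structure concrete, here is a small sketch that parses the first line of a SIP message. It follows the request-line and status-line layout of RFC 3261 but is deliberately minimal; the function name and the returned dictionary keys are illustrative, and no header or body parsing is attempted.

```python
def parse_start_line(line: str) -> dict:
    """Parse the first line of a SIP message into its parts."""
    first, rest = line.strip().split(" ", 1)
    if first.startswith("SIP/"):
        # Status line: SIP-Version SP Status-Code SP Reason-Phrase
        code, reason = rest.split(" ", 1)
        return {"kind": "response", "version": first,
                "status": int(code), "reason": reason}
    # Request line: Method SP Request-URI SP SIP-Version
    uri, version = rest.rsplit(" ", 1)
    return {"kind": "request", "method": first, "uri": uri, "version": version}


request = parse_start_line("INVITE sip:bob@example.com SIP/2.0")
response = parse_start_line("SIP/2.0 404 Not Found")
```

The address `sip:bob@example.com` is a placeholder. Note that both lines carry the version string `SIP/2.0`, matching the version number discussed above.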
SIP network elements
Hardware endpoints — devices with the look, feel, and shape of a traditional telephone,
but that use SIP and RTP for communication — are commercially available from several
vendors. Some of these can use Electronic Numbering (ENUM) or DUNDi to translate
existing phone numbers to SIP addresses, so calls to other SIP users can bypass the
telephone network, even though your service provider might normally act as a gateway to
the PSTN network for traditional phone numbers (and charge you for it). Today, software
SIP endpoints are common.
SIP also requires proxy and registrar network elements to work as a practical service.
Although two SIP endpoints can communicate without any intervening SIP
infrastructure, which is why the protocol is described as peer-to-peer, this approach is
impractical for a public service. There are various implementations that can act as proxy
and registrar.
From the RFCs:
"SIP makes use of elements called proxy servers to help route requests to the
user's current location, authenticate and authorize users for services, implement
provider call-routing policies, and provide features to users."
"SIP also provides a registration function that allows users to upload their current
locations for use by proxy servers."
"Since registrations play an important role in SIP, a User Agent Server that
handles a REGISTER is given the special name registrar."
"It is an important concept that the distinction between types of SIP servers is
logical, not physical."
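The relationship between registrar and proxy described in these quotations can be sketched as a shared table of bindings. This is a toy model with hypothetical names: a real registrar stores several contacts per user with expiry times and authenticates each REGISTER, none of which is shown here.

```python
class LocationService:
    """In-memory stand-in for the bindings a registrar stores."""

    def __init__(self):
        # address-of-record -> current contact URI
        self._bindings = {}

    def register(self, address_of_record: str, contact: str) -> None:
        """What a registrar does when it accepts a REGISTER request."""
        self._bindings[address_of_record] = contact

    def lookup(self, address_of_record: str):
        """What a proxy does to find where to forward a request."""
        return self._bindings.get(address_of_record)


locations = LocationService()
locations.register("sip:alice@example.com", "sip:alice@192.0.2.10:5060")
target = locations.lookup("sip:alice@example.com")
unknown = locations.lookup("sip:carol@example.com")
```

Because both roles only share this table, they can run in one process or on separate machines, which is the sense in which the distinction is logical rather than physical.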
Instant messaging (IM) and presence
A standard instant messaging protocol based on SIP, called SIMPLE, has been proposed
and is under development. SIMPLE can also carry presence information, conveying a
person's willingness and ability to engage in communications. Presence information is
most recognizable today as buddy status in IM clients.
Some efforts have been made to integrate SIP-based VoIP with the XMPP specification
used by Jabber. Most notably Google Talk, which extends XMPP to support voice, plans
to integrate SIP. Google's XMPP extension is called Jingle and, like SIP, it acts as a
Session Description Protocol carrier.
SIP itself defines a method of passing instant messages between endpoints, similar to
SMS messages. This is not generally supported by commercial operators.
Firewalls typically block media packet types such as UDP, though one way around this is
to use TCP tunnelling and relays for media in order to provide NAT and firewall
traversal. One solution involves tunnelling the media packets within TCP or HTTP
packets to a relay. This solution uses additional functionality in conjunction with SIP, and
packages the media packets into a TCP stream which is then sent to the relay. The relay
then extracts the packets and sends them on to the other endpoint. If the other endpoint is
behind a symmetrical NAT, or corporate firewall that does not allow VOIP traffic, the
relay would transfer the packets to another tunnel. One disadvantage of this approach is
that TCP was not designed for real time traffic such as voice, so an optimized form of the
protocol is sometimes used.
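Packaging media packets into a TCP stream requires some way to mark packet boundaries, because TCP delivers a byte stream rather than discrete datagrams. The sketch below assumes a simple two-byte length prefix per packet; that framing and the function names are assumptions for illustration, not a description of any specific product's tunnel format.

```python
import struct


def frame(packet: bytes) -> bytes:
    """Prefix a media packet with its length as a 16-bit big-endian integer."""
    return struct.pack("!H", len(packet)) + packet


def unframe(stream: bytes):
    """Split received stream bytes into complete packets.

    Returns (packets, leftover), where leftover holds the bytes of a
    packet that has not fully arrived yet.
    """
    packets, offset = [], 0
    while offset + 2 <= len(stream):
        (length,) = struct.unpack_from("!H", stream, offset)
        if offset + 2 + length > len(stream):
            break  # wait for the rest of this packet
        packets.append(stream[offset + 2:offset + 2 + length])
        offset += 2 + length
    return packets, stream[offset:]


received = frame(b"abc") + frame(b"") + frame(b"hello")[:4]
packets, leftover = unframe(received)
```

The relay would run `unframe` on whatever bytes it has read so far and prepend the leftover to the next read.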
As envisioned by its originators, SIP's peer-to-peer nature does not enable
network-provided services. For example, the network cannot easily support legal interception of
calls (referred to in the United States by the law governing wiretaps, CALEA).
Emergency calls (calls to E911 in the USA) are difficult to route: it is difficult to identify
the proper Public Safety Answering Point (PSAP) because of the inherent mobility of IP
end points and the lack of any network location capability. However, as commercial SIP
services begin to take off, practical solutions to these problems are being proven.
Standards being developed by such organizations as 3GPP and 3GPP2 define
applications of the basic SIP model which facilitate commercialization and enable
support for network-centric capabilities such as CALEA.
Many VoIP phone companies allow customers to bring their own SIP devices, such as
SIP-capable telephone sets or softphones. The new market for consumer SIP devices
continues to expand.
The free software community started to provide more and more of the SIP technology
required to build both end points as well as proxy and registrar servers leading to a
commoditization of the technology, which accelerates global adoption. SIPfoundry has
made available and actively develops a variety of SIP stacks, client applications and
SDKs, in addition to entire IP PBX solutions that compete in the market against mostly
proprietary IP PBX implementations from established vendors.
The National Institute of Standards and Technology (NIST), Advanced Networking
Technologies Division provides a public domain implementation of the Java standard
for SIP, JAIN-SIP, which serves as a reference implementation for the standard. The stack
can work in proxy server or user agent scenarios and has been used in numerous
commercial and research projects. It supports RFC 3261 in full and a number of
extension RFCs, including RFC 3265 (Subscribe / Notify) and RFC 3262 (Reliability of
Provisional Responses).
A Signaling Gateway is a network component solely responsible for translating
signaling messages (i.e. information about call establishment and teardown) between one
medium (usually IP) and another (PSTN). For example, a signaling gateway might
translate between ISUP and SIP. A signaling gateway is often part of a softswitch in
modern VoIP deployments.
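As a rough illustration of ISUP-to-SIP translation, the table below pairs a few ISUP messages with the SIP message a gateway would typically send in the basic call flow. It is a heavily simplified sketch: real interworking depends on call state and message parameters (for example, a release before answer maps to an error response rather than BYE), and the names used here are hypothetical.

```python
# Simplified mapping for the basic call flow only.
ISUP_TO_SIP = {
    "IAM": "INVITE",       # Initial Address Message: call setup request
    "ACM": "180 Ringing",  # Address Complete Message: far end is alerting
    "ANM": "200 OK",       # Answer Message: called party answered
    "REL": "BYE",          # Release: tear down an established call
}


def translate(isup_message: str):
    """Return the SIP counterpart of an ISUP message, or None if unmapped."""
    return ISUP_TO_SIP.get(isup_message.upper())


setup = translate("IAM")
unmapped = translate("SUS")
```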
In computing, a softphone is software for making telephone calls over the Internet
using a general purpose computer, rather than using dedicated hardware. Often a
softphone is designed to behave like a traditional telephone, sometimes appearing as an
image of a phone, with a display panel and buttons with which the user can interact. A
softphone is usually used with a headset connected to the sound card of the PC, or with a
A typical application of a softphone is to make calls via an Internet telephony service
provider to other softphones or to fixed or cell phones. Service providers may offer PC-to-PC
calls for free, but PC-to-phone and phone-to-PC calls usually are not free.
Another type of softphone connects to a private branch exchange (PBX) through a local
area network (LAN) and is used to control and dial through an existing hardware phone.
This is often used in a call center environment to make calls from a central customer
directory, and to "pop-up" information on the screen about which customer is calling,
instantly providing the operator with details of the relationship between the caller and the
company using the call center.
It's important to differentiate softphones from services based on softphones. Skype,
Google Talk, and Vonage are Internet telephony service providers having their own
softphones that you install on your computer. Unfortunately these three major providers
are not interoperable, and you can't place a direct call between them.
To communicate, both end-points must have the same communication protocol and at
least one common audio codec. Most service providers use a communication protocol
called SIP (Session Initiation Protocol), standardized by the IETF, except Skype, which is a totally
proprietary system, and Google Talk, which is based on Jabber. Examples of SIP
softphones can be found in the category "VoIP_software" (see below) and at Comparison
of VoIP software.
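The requirement of "at least one common audio codec" amounts to intersecting the two endpoints' codec lists. A minimal sketch, assuming the caller's list is in preference order and using a hypothetical function name:

```python
def common_codecs(offered, supported):
    """Return the offered codecs the other side also supports,
    keeping the offerer's preference order."""
    return [codec for codec in offered if codec in supported]


usable = common_codecs(["iLBC", "GSM", "G.711"], {"G.711", "GSM"})
none_shared = common_codecs(["iLBC"], {"G.711", "GSM"})
```

An empty result means the two endpoints cannot exchange audio unless something in the network transcodes between their codecs.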
A typical softphone has all standard telephony features (DND, Mute, DTMF, Flash,
Hold, Transfer, etc.) and many new ones such as Presence, Video, and Wideband Audio.
The minimum codec set is G.711, GSM and iLBC. Softphone vendors may offer
more codecs and a different feature set.
To make voice calls over the Internet, you need:
• Any modern PC with a microphone and a speaker, or with a headset, or with
• Reliable Internet connectivity such as DSL, WiFi, cable or LAN. A 28.8 kbit/s dial-up modem
may be enough if you use a codec that compresses speech to fit this bandwidth.
• An account with an Internet telephony service provider.
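Whether a codec fits a given link, as in the dial-up remark above, depends on packet header overhead as well as the codec bit rate. The following back-of-the-envelope sketch assumes uncompressed IPv4, UDP and RTP headers (40 bytes per packet) and ignores link-layer framing; the payload sizes are the standard RTP frame sizes for each codec, and the function name is illustrative.

```python
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP headers, no link-layer overhead


def ip_bandwidth_kbps(payload_bytes: int, packet_ms: int) -> float:
    """One-way IP bandwidth in kbit/s for a stream that sends
    `payload_bytes` of encoded audio every `packet_ms` milliseconds."""
    # bits per millisecond is numerically equal to kbit/s
    return (payload_bytes + HEADER_BYTES) * 8 / packet_ms


g711 = ip_bandwidth_kbps(160, 20)      # G.711: 64 kbit/s codec rate
gsm = ip_bandwidth_kbps(33, 20)        # GSM full rate: about 13 kbit/s
ilbc_30ms = ip_bandwidth_kbps(50, 30)  # iLBC 30 ms mode: about 13.3 kbit/s
```

Under these assumptions G.711 needs 80 kbit/s, GSM about 29.2 kbit/s, and iLBC in 30 ms mode 24 kbit/s, so of the three only iLBC fits within 28.8 kbit/s without header compression or longer packets.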
Traversal Using Relay NAT (TURN) is a protocol that allows for an element behind a
NAT or firewall to receive incoming data over TCP or UDP connections. It is most useful
for elements behind symmetric NATs or firewalls that wish to be on the receiving end of
a connection to a single peer. TURN does not allow for users to run servers on well
known ports if they are behind a NAT; it supports the connection of a user behind a NAT
to only a single peer. In that regard, its role is to provide the same security functions
provided by symmetric NATs and firewalls, but to turn the tables so that the element on
the inside can be on the receiving end, rather than the sending end, of a connection that is
requested by the client.
TURN is currently an Internet draft.
Network Address Translators (NATs), while providing many benefits, also come with
many drawbacks. The most troublesome of those drawbacks is the fact that they break
many existing IP applications, and make it difficult to deploy new ones. Guidelines have
been developed that describe how to build "NAT friendly" protocols, but many protocols
simply cannot be constructed according to those guidelines. Examples of such protocols
include multimedia applications and file sharing.
Simple Traversal of UDP Through NAT (STUN) provides one means for an application
to traverse a NAT. STUN allows a client to obtain a transport address (an IP address and
port) which may be useful for receiving packets from a peer. However, addresses
obtained by STUN may not be usable by all peers. Those addresses work depending on
the topological conditions of the network. Therefore, STUN by itself cannot provide a
complete solution for NAT traversal.
A complete solution requires a means by which a client can obtain a transport address
from which it can receive media from any peer which can send packets to the public
Internet. This can only be accomplished by relaying data through a server that resides on
the public Internet. This specification describes Traversal Using Relay NAT (TURN), a
protocol that allows a client to obtain IP addresses and ports from such a relay.
Although TURN will almost always provide connectivity to a client, it comes at high cost
to the provider of the TURN server. It is therefore desirable to use TURN as a last resort
only, preferring other mechanisms (such as STUN or direct connectivity) when possible.
To accomplish that, the Interactive Connectivity Establishment (ICE) methodology can
be used to discover the optimal means of connectivity.
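The "last resort" ordering can be sketched as a preference ranking over candidate addresses: direct (host) addresses first, STUN-derived addresses next, TURN relays last. The numeric preferences below follow the values recommended in the later ICE specification (RFC 5245) and should be read as illustrative; the candidate dictionaries and the `reachable` callback are hypothetical stand-ins for real connectivity checks.

```python
# Higher value means more preferred.
TYPE_PREFERENCE = {
    "host": 126,   # address on a local interface (direct connectivity)
    "srflx": 100,  # server-reflexive address learned through STUN
    "relay": 0,    # address allocated on a TURN relay
}


def order_candidates(candidates):
    """Sort candidates from most to least preferred type."""
    return sorted(candidates, key=lambda c: TYPE_PREFERENCE[c["type"]],
                  reverse=True)


def pick_candidate(candidates, reachable):
    """Return the most preferred candidate that passes the connectivity
    check, or None if none does."""
    for candidate in order_candidates(candidates):
        if reachable(candidate):
            return candidate
    return None


candidates = [
    {"type": "relay", "addr": ("198.51.100.9", 50000)},
    {"type": "host", "addr": ("192.168.1.10", 5004)},
    {"type": "srflx", "addr": ("203.0.113.7", 40000)},
]
# Pretend the peer cannot reach the private host address.
chosen = pick_candidate(candidates, lambda c: c["type"] != "host")
```

With this ordering the relay is only chosen when both cheaper options fail their checks.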
Vamming is the equivalent of spamming for voice, and specifically Voice over IP. It is
unwanted or unsolicited voice calling over the Internet.
Vishing is the criminal practice of using social engineering and Voice over IP (VoIP) to
gain access to private personal and financial information from the public for the purpose
of financial reward. The term is a combination of "voice" and phishing. Vishing exploits
the public's trust in landline telephone services, which have traditionally terminated in
physical locations which are known to the telephone company, and associated with a bill-
payer. The victim is often unaware that VoIP allows for caller ID spoofing, inexpensive,
complex automated systems and anonymity for the bill-payer. Vishing is typically used to
steal credit card numbers or other information used in identity theft schemes from
individuals.
Vishing is very hard for legal authorities to monitor or trace. To protect themselves,
consumers are advised to be highly suspicious when receiving messages directing them to
call and provide credit card or bank numbers. Rather than provide any information, the
consumer is advised to contact their bank or credit card company directly to verify the
validity of the message. 
1. The criminal either configures a war dialer to call phone numbers in a given
region or accesses a legitimate voice messaging company with a list of phone
numbers stolen from a financial institution.
2. When the victim answers the call, an automated recording, often generated with a
text to speech synthesizer, is played to alert the consumer that their credit card has
had fraudulent activity or that their bank account has had unusual activity. The
message instructs the consumer to call the following phone number immediately.
The same phone number is often shown in the spoofed caller ID and given the
same name as the financial company they are pretending to represent.
3. When the victim calls the number, it is answered by automated instructions to
enter their credit card number or bank account number on the key pad.
4. Once the consumer enters their credit card number or bank account number, the
visher has the information necessary to make fraudulent use of the card or to
access the account.
5. The call is often used to harvest additional details such as security PIN, expiration
date, date of birth, etc.
A VoIP VPN combines Voice over IP and Virtual Private Network technologies to offer
a method for delivering secure voice. Because VoIP transmits digitized voice as a stream
of data, the VoIP VPN solution accomplishes voice encryption quite simply, applying
standard data-encryption mechanisms inherently available in the collection of protocols
used to implement a VPN.
The VoIP gateway-router first converts the analog voice signal to digital form,
encapsulates the digitized voice within IP packets, then encrypts the digitized voice using
IPSec, and finally routes the encrypted voice packets securely through a VPN tunnel. At
the remote site, another VoIP router decrypts the voice packets and converts the digital voice to an
analog signal for delivery to the phone.
Security is not the only reason to pass Voice over IP through a Virtual Private Network,
however. Session Initiation Protocol, a commonly used VoIP protocol, is notoriously
difficult to pass through a firewall because it uses random port numbers to establish
connections. A VPN is one solution to avoid a firewall issue when configuring remote
VoIP clients. The VPN virtually moves users inside the same local network as the VoIP
server.
A VoIP VPN solution can be built with free Open Source software by using a
Linux distribution or BSD as the operating system, a VoIP server, and an IPsec server.
Retrieved from "http://en.wikipedia.org/wiki/VoIP_VPN"
A Voice over IP (VoIP) phone is an entity used to make telephone calls over the Internet,
or to leverage network wiring within an office for carrying phone conversations to a
PBX. VoIP phones use one of several competing communication standards to send their
calls through a network. VoIP is a way of taking analog audio signals (the current
medium for speech) and converting them to digital data that can be transmitted over
IP-based networks such as private LANs or the Internet.
This can be done in several ways:
1. A computer application (often called a softphone): a very basic way to make a call. All
that is required is an Internet connection, speakers, a microphone (and/or headset) and a
2. ATA (analog telephony adapter): this device plugs into an existing home analog
telephone, a computer, and a LAN or Internet connection. The ATA acts as an
analog-to-digital converter, after which the phone is ready to make calls using VoIP technology.
3. IP phones: these look identical to a regular telephone, but instead of connecting to the
normal POTS phone line jack on the wall, they connect to a router or wall jack using an
RJ-45 Ethernet connector. The device then becomes a fully operational phone with all software
onboard, provided by the switch or system.
Typical protocols are standardized ones such as SIP and H.323, or proprietary ones such as Skype's.
VoIP spam is an as-yet non-existent problem which has nonetheless received a great deal
of attention from marketers and the trade press. Some pundits have taken to referring to it
as SPIT (for "Spam over Internet Telephony").
Voice over IP systems, like e-mail and other Internet applications, are susceptible to
abuse by malicious parties who initiate unsolicited and unwanted communications.
Telemarketers, prank callers, and other telephone system abusers are likely to target VoIP
systems increasingly, particularly if VoIP tends to supplant conventional telephony.
The underlying technology driving this threat is SIP (Session Initiation Protocol, IETF –
Internet Engineering Task Force, RFC 3261). This technology has received significant
support from most major telecommunication vendors, and is showing signs of becoming
the industry standard for voice, video and other interactive forms of communication such
as instant messaging and gaming.
Rules similar to those that today’s email systems use to block unsolicited email can also prevent
unsolicited voice and video communication. This can also be compared to the way
today’s chat applications prevent unwanted users from viewing your availability or state
of presence by using “privacy” options.
SIP has been designed to support presence natively. This potentially
means that callers will know your availability before even attempting to call or
make contact with you. So, just as with e-mail today, the benefits of communicating with
trusted parties electronically far exceed the pitfalls of spam, particularly when
using preventative technologies to minimize the impact of the issue.
Voice chat is a modern form of communication used on the Internet. The means of
communicating with voice chat is through any of the messengers, mainly Yahoo!
Messenger, AOL Instant Messenger or Windows Live Messenger. Voice chat has led to a
significant increase in distant communications where two or more people from opposite
ends of the world can talk almost free of cost.
Rocket Messenger and AOL were among the first to offer voice chat facilities. They were
followed by Paltalk which became a quick hit. Later Yahoo! Messenger became the most
dominant voice chat service as it provided unique features. These included individual
voice chat with another person, as well as conference call type voice chat facilities
categorized in Yahoo! Rooms.
Many video games with online multiplayer allow players to communicate via voice
chatting. In 2001, Sony released the Network adapter for their PlayStation 2 video game
console, which allowed voice chatting with a headset. In 2002, Microsoft launched the
Xbox Live service, which supports voice chatting through a headset bundled with the
Xbox 360 premium package and the official starter kit. In 2005, Nintendo launched the
Nintendo Wi-Fi Connection, an online multiplayer service for both the Nintendo DS and
for the Wii. In March 2006, Metroid Prime Hunters was released, making it the first game
to allow voice chatting through the Nintendo DS's microphone. Also, Nintendo released a
Nintendo DS headset for voice chat alongside the release of Pokemon Diamond and
Voice over Internet Protocol, also called VoIP, IP Telephony, Internet telephony,
Broadband telephony, Broadband Phone and Voice over Broadband, is the routing of
voice conversations over the Internet or through any other IP-based network.
Companies providing VoIP service are commonly referred to as providers, and protocols
which are used to carry voice signals over the IP network are commonly referred to as
Voice over IP or VoIP protocols. They may be viewed as commercial realizations of the
experimental Network Voice Protocol (1973) invented for the ARPANET providers.
Some cost savings are due to utilizing a single network - see attached image - to carry
voice and data, especially where users have existing underutilized network capacity that
can carry VoIP at no additional cost. VoIP to VoIP phone calls are sometimes free, while
VoIP to public switched telephone networks, PSTN, may have a cost that's borne by the
There are two types of PSTN to VoIP services: Direct Inward Dialing (DID) and access
numbers. DID connects the caller directly to the VoIP user, while access numbers
require the caller to input the extension number of the VoIP user.
• 1 Functionality
• 2 Implementation
o 2.1 Reliability
o 2.2 Quality of Service
o 2.3 Difficulty with sending faxes
o 2.4 Emergency calls
o 2.5 Integration into global telephone number system
o 2.6 Single point of calling
o 2.7 Mobile phones & Hand held Devices
o 2.8 Security
o 2.9 Pre-Paid Phone Cards
o 2.10 Caller ID
o 2.11 VoIM
• 3 Adoption
o 3.1 Mass-market telephony
o 3.2 Corporate and telco use
o 3.3 Use in Amateur Radio
o 3.4 Click to call
• 4 Legal issues in different countries
o 4.1 IP telephony in Japan
4.1.1 Telephone number for IP telephony in Japan
• 5 Technical details
• 6 See also
• 7 References
• 8 External links
VoIP can facilitate tasks that may be more difficult to achieve using traditional networks:
• Ability to transmit more than one telephone call down the same broadband-
connected telephone line. This can make VoIP a simple way to add an extra
telephone line to a home or office.
• Many VoIP packages include PSTN features that most telcos (telecommunication
companies) normally charge extra for, or that may be unavailable from your local
telco, such as 3-way calling, call forwarding, automatic redial, and caller ID.
• VoIP can be secured with existing off-the-shelf protocols such as Secure Real-
time Transport Protocol. Most of the difficulties of creating a secure phone over
traditional phone lines, like digitizing and digital transmission are already in place
with VoIP. It is only necessary to encrypt and authenticate the existing data
• VoIP is location independent: only an Internet connection is needed to reach a
VoIP provider. For instance, call center agents using VoIP phones
can work from anywhere with a sufficiently fast and stable Internet connection.
• VoIP phones can integrate with other services available over the Internet,
including video conversation, message or data file exchange in parallel with the
conversation, audio conferencing, managing address books and passing
information about whether others (e.g. friends or colleagues) are available online
to interested parties.
Because UDP does not guarantee that data packets are delivered in sequential order, and
provides no Quality of Service guarantees, VoIP implementations face problems dealing
with latency and jitter. This is especially true when satellite circuits are involved, due
to the long round trip propagation delay (400 to 600 milliseconds for a geostationary
satellite). The receiving node must reorder IP packets that may be out of order, delayed
or missing, while ensuring that the audio stream maintains proper time consistency. This
functionality is usually accomplished by means of a jitter buffer.
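The reordering role of a jitter buffer described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a production design: the function name reorder and the depth parameter are hypothetical, packets are modelled as (sequence number, payload) pairs, and the 16-bit wrap-around of real RTP sequence numbers is ignored.

```python
import heapq

def reorder(packets, depth=3):
    """Return (seq, payload) pairs in sequence order.

    Up to `depth` packets are held back so that late arrivals can be slotted
    into place. A missing packet is reported as (seq, None) so that a decoder
    could conceal the loss. Sequence-number wrap-around is not handled.
    """
    heap, out, expected = [], [], None

    def release():
        nonlocal expected
        seq, payload = heapq.heappop(heap)
        if expected is None:
            expected = seq
        if seq < expected:
            return  # duplicate of a packet that was already released
        while expected < seq:
            out.append((expected, None))  # gap: packet lost or far too late
            expected += 1
        out.append((seq, payload))
        expected = seq + 1

    for seq, payload in packets:
        if expected is not None and seq < expected:
            continue  # arrived after its slot was already played out
        heapq.heappush(heap, (seq, payload))
        if len(heap) > depth:
            release()
    while heap:
        release()
    return out

# Packets 2 and 4 arrive late; packet 6 never arrives.
arrived = [(1, "a"), (3, "c"), (2, "b"), (5, "e"), (4, "d"), (7, "g")]
print(reorder(arrived, depth=2))
```

A larger depth tolerates more reordering but holds audio back longer, which is the delay trade-off inherent in any jitter buffer.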
Another challenge is routing VoIP traffic through firewalls and address translators.
Private Session Border Controllers are used along with firewalls to enable VoIP calls to
and from a protected enterprise network. Skype uses a proprietary protocol to route calls
through other Skype peers on the network, allowing it to traverse symmetric NATs and
firewalls. Other methods to traverse firewalls involve using protocols such as STUN or
• Available bandwidth
• Delay/Network Latency
• Packet loss
• Pulse dialing to DTMF translation
Many VoIP providers do not translate pulse dialing from older phones to DTMF. The
VoIP user may use a VoIP Pulse to Tone Converter, if needed.
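As a rough sketch of what a pulse-to-tone conversion involves: each DTMF key is signalled by two simultaneous tones, one from a low-frequency row group and one from a high-frequency column group. The function names below are hypothetical, and the pulse mapping assumes the common convention in which ten pulses represent the digit 0.

```python
import math

ROW_HZ = (697, 770, 852, 941)        # DTMF low-frequency group
COL_HZ = (1209, 1336, 1477, 1633)    # DTMF high-frequency group
KEYPAD = ("123A", "456B", "789C", "*0#D")

def pulses_to_digit(pulse_count):
    """Map a counted pulse train to a digit (assumes ten pulses mean 0)."""
    if not 1 <= pulse_count <= 10:
        raise ValueError("expected between 1 and 10 pulses")
    return "0" if pulse_count == 10 else str(pulse_count)

def dtmf_frequencies(key):
    """Return the (low, high) tone pair in Hz for one keypad key."""
    if len(key) == 1:
        for r, row in enumerate(KEYPAD):
            c = row.find(key)
            if c != -1:
                return ROW_HZ[r], COL_HZ[c]
    raise ValueError("not a DTMF key: %r" % (key,))

def dtmf_samples(key, duration=0.1, rate=8000):
    """Synthesize the two-tone signal as floats in the range [-1, 1]."""
    low, high = dtmf_frequencies(key)
    count = round(duration * rate)
    return [
        0.5 * math.sin(2 * math.pi * low * n / rate)
        + 0.5 * math.sin(2 * math.pi * high * n / rate)
        for n in range(count)
    ]

digit = pulses_to_digit(10)
print(digit, dtmf_frequencies(digit))
```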
Fixed delays cannot be controlled, but some delays can be minimized by marking voice
packets as being delay-sensitive (see, for example, DiffServ).
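As an illustration of such marking, an application can request a DiffServ code point on its outgoing packets through the IP_TOS socket option. The sketch below assumes the Expedited Forwarding code point (DSCP 46), which is commonly used for voice. The helper names are hypothetical, some operating systems ignore or restrict this option, and routers along the path are free to re-mark or disregard the field.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding code point (assumed choice for voice)

def dscp_to_tos(dscp):
    """The 6-bit DSCP occupies the upper bits of the former IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2

def open_marked_udp_socket(dscp=DSCP_EF):
    """Create a UDP socket whose outgoing packets request the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock

print(hex(dscp_to_tos(DSCP_EF)))
```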
The principal cause of packet loss is congestion, which can be controlled by congestion
management and avoidance. Carrier VoIP networks avoid congestion by means of
Variation in delay is called jitter. The effects of jitter can be mitigated by storing voice
packets in a buffer (called a play-out buffer) upon arrival, before playing them out. This
avoids a condition known as buffer underrun, in which the playout process runs out of
voice data to play because the next voice packet has not yet arrived, but increases delay
by the length of the buffer.
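The trade-off between buffer length and underruns can be made concrete with a small simulation. It is a sketch under simplifying assumptions: one packet per fixed interval, playout starting a fixed time after the first packet arrives, and made-up arrival times. The function name count_underruns is hypothetical.

```python
def count_underruns(arrival_ms, interval_ms=20, buffer_ms=0):
    """Count packets that arrive after their scheduled playout time.

    Playout of packet i is scheduled at
    (arrival of packet 0) + buffer_ms + i * interval_ms.
    """
    if not arrival_ms:
        return 0
    start = arrival_ms[0] + buffer_ms
    return sum(
        1
        for i, arrival in enumerate(arrival_ms)
        if arrival > start + i * interval_ms
    )

# Packets sent every 20 ms with network delays of 50, 50, 75, 55 and 90 ms.
arrivals = [50, 70, 115, 115, 170]
for depth in (0, 20, 40):
    print(depth, count_underruns(arrivals, buffer_ms=depth))
```

With these illustrative numbers, a 40 ms buffer absorbs all of the delay variation, at the price of 40 ms of extra delay on every packet.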
Common causes of echo include impedance mismatches in analog circuitry, and acoustic
coupling of the transmit and receive signal at the receiving end.
Reliability
Conventional phones are connected directly to telephone company phone lines, which in
the event of a power failure are kept functioning by back-up generators or batteries
located at the telephone exchange. However, household VoIP hardware uses broadband
modems and other equipment powered by household electricity, which may be subject to
outages in the absence of an uninterruptible power supply or generator. Early adopters of
VoIP may also be users of other phone equipment, such as PBX and cordless phone
bases, that rely on power not provided by the telephone company. Even with local power
still available, the broadband carrier itself may experience outages as well. While the
PSTN has matured over decades and is typically extremely reliable, most broadband
networks are less than 10 years old, and even the best are still subject to intermittent
outages. Furthermore, consumer network technologies such as cable and DSL often are
not subject to the same restoration service levels as the PSTN or business technologies
such as T-1 connections.
Quality of Service
Some broadband connections may have less than desirable quality. Where IP packets are
lost or delayed at any point in the network between VoIP users, there will be a
momentary drop-out of voice. This is more noticeable in highly congested networks and/or
where there are long distances and/or interworking between end points. Technology
has improved the reliability and voice quality over time and will continue to improve
VoIP performance as time goes on.
It has been suggested to rely on the packetized nature of media in VoIP communications
and transmit the stream of packets from the source phone to the destination phone
simultaneously across different routes (multi-path routing). In this way, temporary
failures have less impact on the communication quality. In capillary routing it has been
suggested to use fountain codes at the packet level, in particular Raptor codes, to
transmit extra redundant packets, making the communication more reliable.
A number of protocols have been defined to support the reporting of QoS/QoE for VoIP
calls. These include RTCP XR (RFC 3611), SIP RTCP Summary Reports, H.460.9 Annex
B (for H.323), H.248.30 and MGCP extensions. The RFC 3611 VoIP Metrics block is
generated by an IP phone or gateway during a live call and contains information on
packet loss rate, packet discard rate (due to jitter), packet loss/discard burst metrics (burst
length/density, gap length/density), network delay, end system delay, signal / noise / echo
level, MOS scores and R factors, and configuration information related to the jitter buffer.
RFC 3611 VoIP metrics reports are exchanged between IP endpoints on an occasional
basis during a call, and an end-of-call message is sent via SIP RTCP Summary Report or
one of the other signaling protocol extensions. RFC 3611 VoIP metrics reports are
intended to support real time feedback related to QoS problems, the exchange of
information between the endpoints for improved call quality calculation and a variety of
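The R factors and MOS scores carried in these reports are related. A commonly used conversion comes from the ITU-T G.107 E-model, which maps an R factor between 0 and 100 to an estimated MOS between 1.0 and 4.5. The sketch below shows only that mapping. The function name is hypothetical, and the clamp at 1.0 for very small R values is an added safeguard rather than part of the quoted formula.

```python
def r_factor_to_mos(r):
    """Estimate a MOS from an E-model R factor (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
    return max(1.0, mos)  # the polynomial dips slightly below 1 for tiny R

for r in (50, 70, 80, 93.2):
    print(r, round(r_factor_to_mos(r), 2))
```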
Difficulty with sending faxes
Support for sending faxes over VoIP is still limited. Existing voice codecs are not
designed for fax transmission. An effort is underway to remedy this by defining an
alternate IP-based solution for delivering Fax-over-IP, namely the T.38 protocol. Another
possible solution to overcome the drawback is to treat the fax system as a message
switching system which does not need real time data transmission - such as sending a fax
as an email attachment (see Fax) or remote printout (see Internet Printing Protocol). The
end system can completely buffer the incoming fax data before displaying or printing the
Emergency calls
The nature of IP makes it difficult to locate network users geographically. Emergency
calls, therefore, cannot easily be routed to a nearby call center, and are impossible on
some VoIP systems. Sometimes, VoIP systems may route emergency calls to a non-
emergency phone line at the intended department. In the US, at least one major police
department has strongly objected to this practice as potentially endangering the public.
Moreover, in the event that the caller is unable to give an address, emergency services
may be unable to locate them in any other way. Following the lead of mobile phone
operators, several VoIP carriers are already implementing a technical work-around.
For instance, one large VoIP carrier requires the registration of the physical address
where the VoIP line will be used. When you dial the emergency number for your country,
the carrier routes the call to the appropriate local system. It also maintains its own
emergency call center that takes non-routable emergency calls (made, for example,
from a software based service that is not tied to any particular physical location) and
manually routes your call once it has learned your physical location.
Integration into global telephone number system
While the traditional Plain Old Telephone Service (POTS) and mobile phone networks
share a common global standard (E.164) which allocates and identifies any specific
telephone line, there is no widely adopted similar standard for VoIP networks. Some
providers allocate an E.164 number which can be used for VoIP as well as for
incoming/external calls. However, calls between VoIP providers often use different,
incompatible schemes based on provider-specific short codes.
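For reference, an E.164 number in international notation is a "+" followed by at most 15 digits, beginning with a country code that does not start with 0. The check below is a minimal sketch of that format rule only. The helper names are hypothetical, and it does not verify that a country code is actually assigned or that the number is in service.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is not 0.
_E164 = re.compile(r"\+[1-9][0-9]{1,14}")

def normalize(raw):
    """Strip common visual separators (spaces, dots, hyphens, parentheses)."""
    return re.sub(r"[\s().-]", "", raw)

def is_e164(number):
    """Return True if the string has the shape of an E.164 number."""
    return _E164.fullmatch(number) is not None

print(is_e164(normalize("+1 (415) 555-0123")))
```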
Single point of calling
With hardware VoIP solutions it is possible to connect the VoIP router into the existing
central phone box in the house and have VoIP at every phone already connected.
Software based VoIP services require the use of a computer, so they are limited to single
point of calling, though telephone sets are now available, allowing them to be used
without a PC. Some services provide the ability to connect WiFi SIP phones so that
service can be extended throughout the premises, and off-site to any location with an
open hotspot. However, note that many hotspots require browser-based authentication,
which most SIP phones do not support.
Mobile phones & Hand held Devices
Telcos and consumers have invested billions of dollars in mobile phone equipment. In
developed countries, mobile phones have achieved nearly complete market penetration,
and many people are giving up landlines and using mobiles exclusively. Given this
situation, it is not entirely clear whether there would be a significantly higher demand for
VoIP among consumers until either public or community wireless networks have similar
geographical coverage to cellular networks (thereby enabling mobile VoIP phones, so
called WiFi phones or VoWLAN) or VoIP is implemented over legacy 3G networks.
However, "dual mode" telephone sets, which allow for the seamless handover between a
cellular network and a WiFi network, are expected to help VoIP become more popular.
Phones like the NEC N900iL, and later the Nokia E60 and E61, have been the first "dual
mode" telephone sets capable of delivering mobile VoIP. With more and more mobile
phones and hand held devices using VoIP, the nicknames "MoIP" and "MVoIP" (Mobile
VoIP) have been attributed to these mobile applications.
Hand held devices are another medium through which you can use VoIP services.
Since most of these devices are otherwise limited to GSM/GPRS-type communication,
almost all of the hand held devices use WiFi of some sort for VoIP.
Another addition to hand held devices are ruggedized bar code type devices that are used
in warehouses and retail environments. These types of devices rely on "inside the 4 walls"
VoIP services that do not connect to the outside world and are used solely
for employee-to-employee communications.
Security
Many consumer VoIP solutions do not yet support encryption, although a secure
phone is much easier to implement with VoIP than with traditional phone lines. As a
result, it is relatively easy to eavesdrop on VoIP calls and even change their content.
There are several open source solutions that facilitate sniffing of VoIP conversations. A
modicum of security is afforded by patented audio codecs that are not easily available
for open source applications; however, such security through obscurity has not proven
effective in the long run in other fields. Some vendors also use compression to make
eavesdropping more difficult. However, real security requires encryption and
cryptographic authentication, which are not widely available at a consumer level. The
existing secure standard SRTP and the new ZRTP protocol are available on Analog
Telephone Adapters (ATAs) as well as various softphones. It is possible to use IPsec to
secure P2P VoIP by using opportunistic encryption. Skype does not use SRTP, but uses
encryption which is transparent to the Skype provider.
The Voice VPN solution provides secure voice for enterprise VoIP networks by applying
IPsec encryption to the digitized voice stream.
Pre-Paid Phone Cards
VoIP has become an important technology for phone services to travelers, migrant
workers and expatriates, who, either because they lack a fixed or mobile phone or because
of high overseas roaming charges, choose instead to use VoIP services to make their phone calls.
Pre-paid phone cards can be used either from a normal phone or from Internet cafes that
have phone services. Developing countries and areas with high tourist or immigrant
communities generally have a higher uptake.
Caller ID
Caller ID support among VoIP providers varies, although the majority of VoIP providers
now offer full Caller ID with name on outgoing calls. When calling a traditional PSTN
number from some VoIP providers, Caller ID is not supported.
In a few cases, VoIP providers may allow a caller to spoof the Caller ID information,
making it appear as though they are calling from a different number. Business grade
VoIP equipment and software often make it easy to modify caller ID information.
Although this can provide many businesses great flexibility, it is also open to abuse.
Voice over Instant Messenger(VoIM) is one kind of general VoIP that was based on an
A major development starting in 2004 has been the introduction of mass-market VoIP
services over broadband Internet access services, in which subscribers make and receive
calls as they would over the PSTN. Full phone service VoIP phone companies provide
inbound and outbound calling with Direct Inbound Dialing. Many offer unlimited calling
to the U.S., and some to Canada or selected countries in Europe or Asia as well, for a flat
These services take a wide variety of forms which can be more or less similar to
traditional POTS. At one extreme, an analog telephone adapter (ATA) may be connected
to the broadband Internet connection and an existing telephone jack in order to provide
service nearly indistinguishable from POTS on all the other jacks in the residence. This
type of service, which is fixed to one location, is generally offered by broadband Internet
providers such as cable companies and telephone companies as a cheaper flat-rate
traditional phone service. Often the phrase "VoIP" is not used in selling these services,
but instead the industry has marketed the phrases "Internet Phone", "Digital Phone" or
"Softphone" which is aimed at typical phone users who are not necessarily tech-savvy.
Typically, the provider touts the advantage of being able to keep one's existing phone
At the other extreme are services like Gizmo Project and Skype which rely on a software
client on the computer in order to place a call over the network, where one user ID can be
used on many different computers or in different locations on a laptop. In the middle lie
services which also provide a telephone adapter for connecting to the broadband
connection similar to the services offered by broadband providers (and in some cases also
allow direct connections of SIP phones) but which are aimed at a more tech-savvy user
and allow portability from location to location. One advantage of these two types of
services is the ability to make and receive calls as one would at home, anywhere in the
world, at no extra cost. No additional charges are incurred, as they would be with call
diversion via the PSTN, and the called party does not have to pay for the call. For example, if a subscriber
with a home phone number in the U.S. or Canada calls someone else within his local
calling area, it will be treated as a local call regardless of where that person is in the
world. Often the user may elect to use someone else's area code as his own to minimize
phone costs to a frequently called long-distance number.
For some users, the broadband phone complements, rather than replaces, a PSTN line,
due to a number of inconveniences compared to traditional services. VoIP requires a
broadband Internet connection and, if a telephone adapter is used, a power adapter is
usually needed. In the case of a power failure, VoIP services will generally not function.
Additionally, a call to the U.S. emergency services number 9-1-1 may not automatically
be routed to the nearest local emergency dispatch center, and would be of no use for
subscribers outside the U.S. This is potentially true for users who select a number with an
area code outside their area. Some VoIP providers offer users the ability to register their
address so that 9-1-1 services work as expected.
Another challenge for these services is the proper handling of outgoing calls from fax
machines, TiVo/ReplayTV boxes, satellite television receivers, alarm systems,
conventional modems or FAXmodems, and other similar devices that depend on access
to a voice-grade telephone line for some or all of their functionality. At present, these
types of calls sometimes go through without any problems, but in other cases they will
not go through at all. And in some cases, this equipment can be made to work over a
VoIP connection if the sending speed can be changed to a lower bits per second rate. If
VoIP and cellular substitution become very popular, some ancillary equipment makers
may be forced to redesign equipment, because it would no longer be possible to assume a
conventional voice-grade telephone line would be available in almost all homes in North
America and Western Europe. The TestYourVoIP website offers a free service to test the
quality of or diagnose an Internet connection by placing simulated VoIP calls from any
Java-enabled Web browser, or from any phone or VoIP device capable of calling the
Corporate and telco use
Although few office environments and even fewer homes use a pure VoIP infrastructure,
telecommunications providers routinely use IP telephony, often over a dedicated IP
network, to connect switching stations, converting voice signals to IP packets and back.
The result is a data-abstracted digital network which the provider can easily upgrade and
use for multiple purposes.
Corporate customer telephone support often uses IP telephony exclusively to take
advantage of the data abstraction. The benefit of using this technology is the need for
only one class of circuit connection and better bandwidth use. Companies can acquire
their own gateways to eliminate third-party costs, which is worthwhile in some situations.
VoIP is widely employed by carriers, especially for international telephone calls. It is
commonly used to route traffic starting and ending at conventional PSTN telephones.
Many telecommunications companies are looking at the IP Multimedia Subsystem (IMS)
which will merge Internet technologies with the mobile world, using a pure VoIP
infrastructure. It will enable them to upgrade their existing systems while embracing
Internet technologies such as the Web, email, instant messaging, presence, and video
conferencing. It will also allow existing VoIP systems to interface with the conventional
PSTN and mobile phones.
Electronic Numbering (ENUM) uses standard phone numbers (E.164), but allows
connections entirely over the Internet. If the other party uses ENUM, the only expense is
the Internet connection. Virtual PBX (or IP PBX) allows companies to control their
internal phone network over an existing LAN and server without needing to wire a
separate telephone network. Users within this environment can then use standard
telephones coupled with an FXS, IP Phones connected to a data port or a Softphone on
their PC. Internal VoIP phone networks allow outbound and inbound calling on standard
PSTN lines through the use of FXO adapters.
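To illustrate how ENUM ties an E.164 number to the Internet: the number is turned into a DNS domain name by reversing its digits, separating them with dots, and appending the e164.arpa suffix. That name is then queried for NAPTR records that point to services such as a SIP address. The sketch below covers only the name construction. The function name is hypothetical, and the DNS lookup itself is left out.

```python
def enum_domain(e164_number, suffix="e164.arpa"):
    """Build the ENUM domain name for an E.164 number such as '+442079460148'."""
    digits = [ch for ch in e164_number if ch in "0123456789"]
    if not digits:
        raise ValueError("no digits found in %r" % (e164_number,))
    return ".".join(reversed(digits)) + "." + suffix

print(enum_domain("+44 20 7946 0148"))
```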
Use in Amateur Radio
Sometimes called Radio Over Internet Protocol or RoIP, Amateur radio has adopted
VoIP by linking repeaters and users with Echolink, IRLP, D-STAR, Dingotel and EQSO.
Echolink and IRLP are programs/systems based upon the Speak Freely VoIP open source
software. In fact, Echolink allows users to connect to repeaters via their computer (over
the Internet) rather than by using a radio. By using VoIP Amateur Radio operators are
able to create large repeater networks with repeaters all over the world where operators
can access the system with actual ham radios.
Ham Radio operators using radios are able to tune to repeaters with VoIP capabilities and
use DTMF signals to command the repeater to connect to various other repeaters, thus
allowing them to talk to people all around the world, even with "line of sight" VHF
Click to call
Main article: Click-to-call
Click-to-call is a service which lets users click a button and immediately speak with a
customer service representative. The call can either be carried over VoIP, or the customer
may request an immediate call back by entering their phone number. One significant
benefit to click-to-call providers is that it allows companies to monitor when online
visitors change from the website to a phone sales channel.
Legal issues in different countries
As the popularity of VoIP grows, and PSTN users switch to VoIP in increasing numbers,
governments are becoming more interested in regulating VoIP in a manner similar to
legacy PSTN services, especially with the encouragement of the state-mandated
telephone monopolies/oligopolies in a given country, who see this as a way to stifle the
In the U.S., the Federal Communications Commission now requires all VoIP operators
who do not support Enhanced 911 to attach a sticker warning that traditional 911 services
aren't available. The FCC recently required VoIP operators to support CALEA wiretap
functionality. The Telecommunications Act of 2005 proposes adding more traditional
PSTN regulations, such as local number portability and universal service fees. Other
future legal issues are likely to include laws against wiretapping and network neutrality.
Some Latin American and Caribbean countries, fearful for their state owned telephone
services, have imposed restrictions on the use of VoIP, including in Panama where VoIP
is taxed. In Ethiopia, where the government is monopolizing telecommunication service,
it is a criminal offense to offer services using VoIP. The country has installed firewalls to
prevent international calls being made using VoIP. These measures were taken after the
popularity of VoIP reduced the income generated by the state owned telecommunication
In the European Union, the treatment of VoIP service providers is a decision for each
Member State's national telecoms regulator, which must use competition law to define
relevant national markets and then determine whether any service provider on those
national markets has "significant market power" (and so should be subject to certain
obligations). A general distinction is usually made between VoIP services that function
over managed networks (via broadband connections) and VoIP services that function
over unmanaged networks (essentially, the Internet).
VoIP services that function over managed networks are often considered to be a viable
substitute for PSTN telephone services (despite the problems of power outages and lack
of geographical information); as a result, major operators that provide these services (in
practice, incumbent operators) may find themselves bound by obligations of price control
or accounting separation.
VoIP services that function over unmanaged networks are often considered to be too poor
in quality to be a viable substitute for PSTN services; as a result, they may be provided
without any specific obligations, even if a service provider has "significant market
The relevant EU Directive is not clearly drafted concerning obligations which can exist
independently of market power (e.g., the obligation to offer access to emergency calls),
and it is impossible to say definitively whether VoIP service providers of either type are
bound by them. A review of the EU Directive is under way and should be complete by
In India, it is legal to use VoIP, but it is illegal to have VoIP gateways inside India. This
effectively means that people who have PCs can use them to make a VoIP call to any
number, but if the remote side is a normal phone, the gateway that converts the VoIP call
to a POTS call should not be inside India.
In the UAE, it is illegal to use any form of VoIP, to the extent that websites of Skype and
Gizmo Project don't work.
In the Republic of Korea, only providers registered with the government are authorized to
offer VoIP services. Unlike many VoIP providers, most of whom offer flat rates, Korean
VoIP services are generally metered and charged at rates similar to terrestrial calling.
Foreign VoIP providers such as Vonage encounter high barriers to government
registration. This issue came to a head in 2006 when internet service providers providing
personal internet services by contract to United States Forces Korea members residing on
USFK bases threatened to block off access to VoIP services used by USFK members
as an economical way to keep in contact with their families in the United States, on the
grounds that the service members' VoIP providers were not registered. A compromise
was reached between USFK and Korean telecommunications officials in January 2007,
wherein USFK service members arriving in Korea before June 1, 2007 and subscribing to
the ISP services provided on base may continue to use their U.S.-based VoIP
subscription, but later arrivals must use a Korean-based VoIP provider, which by contract
will offer pricing similar to the flat rates offered by U.S. VoIP providers. 
IP telephony in Japan
In Japan, IP telephony (IP電話, IP Denwa) is regarded as a service that applies VoIP
technology to all or part of the telephone line. Since 2003, IP telephony services
with assigned telephone numbers have been provided. There are not only voice services
but also videophone services. Under the Telecommunication Business Law, the service
category for IP telephony also covers services provided via the Internet that are not
assigned any telephone number. IP telephony is regulated by the Ministry of
Internal Affairs and Communications (MIC) as a telecommunication service.
Operators have to disclose necessary information on quality, etc., before making a
contract with customers, and are obliged to respond to their complaints cordially.
Many Internet service providers (ISPs) provide IP telephony services. A provider
of IP telephony service is called an "ITSP (Internet Telephony Service
Provider)". Recently, competition among ITSPs has intensified through optional or
bundled sales tied to ADSL or FTTH services.
The tariff system normally applied for Japanese IP telephony tends to be described as
• The call between IP telephony subscribers, limited to the same group, is mostly
free of charge.
• The call from IP telephony subscribers to fixed line or PHS is mostly fixed rate,
uniformly, all over the country.
Between ITSPs, interconnection is mostly maintained at the VoIP level.
• For IP telephony assigned a normal telephone number (0AB-J), the conditions
for interconnection are considered the same as for normal telephony.
• For IP telephony assigned a specific telephone number (050), the conditions
for interconnection tend to be as follows:
o Interconnection is sometimes charged and sometimes free. When it is
free, the traffic is mostly exchanged via a P2P connection using the
same VoIP standard. Otherwise, conversion is needed at a VoIP
gateway, which incurs running costs.
Telephone number for IP telephony in Japan
Since September 2002, the MIC has assigned IP telephony telephone numbers on the
condition that the service falls into certain required categories of quality. IP telephony
that meets the quality requirements is assigned a telephone number. Normally the number starts with 050. But,
when its quality is so high that customer almost could not tell the difference between it
and a normal telephone and when the provider relates its number with a location and
provides the connection with emergency call capabilities, the provider is allowed to
assign a normal telephone number, which is a so-called "0AB-J" number.
Technical details
The two major competing standards for VoIP are the IETF standard SIP and the ITU
standard H.323. Initially H.323 was the most popular protocol, though in the "local loop"
it has since been surpassed by SIP. This was primarily due to the latter's better traversal
of NAT and firewalls, although recent changes introduced for H.323 have removed this
However, in backbone voice networks where everything is under the control of the
network operator or telco, H.323 is the protocol of choice. Many of the largest carriers
use H.323 in their core backbones, and the vast majority of callers have little or
no idea that their POTS calls are being carried over VoIP.
Where VoIP travels through multiple providers' softswitches, the concepts of Full Media
Proxy and Signalling Proxy are important. In H.323, a call is made up of three streams:
1) H.225.0 call signaling; 2) H.245; 3) media. So if you are in London, your
provider is in Australia, and you wish to call America, then in full proxy mode all three
streams will go halfway around the world, and the delay (up to 500-600 ms) and packet
loss will be high. In signaling proxy mode, however, only the signaling flows
through the provider, and the delay is reduced to a more user-friendly 120-150 ms.
One of the key issues with all traditional VoIP protocols is the bandwidth wasted on
packet headers. Typically, sending a G.723.1 5.6 kbit/s compressed audio stream requires 18
kbit/s of bandwidth based on standard sampling rates. The difference between the 5.6
kbit/s and the 18 kbit/s is packet headers. A number of bandwidth optimization
techniques are used, such as silence suppression and header compression. These can typically
save 35% on bandwidth usage.
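The header overhead described above can be estimated with a short calculation. The
following is a minimal sketch rather than a definitive model: it assumes one packet every
30 ms (the G.723.1 frame length) and 40 bytes of uncompressed IPv4, UDP and RTP headers
per packet (20 + 8 + 12). The function name ip_bandwidth_bps is illustrative only.
Link-layer framing is ignored, so the result lands somewhat below the 18 kbit/s quoted
above.

```python
def ip_bandwidth_bps(codec_bps, packet_interval_ms=30.0, header_bytes=40):
    """Estimate the bit rate of one voice stream including packet headers.

    Assumptions (not from the source text): one packet per interval and a
    fixed header of `header_bytes` (IPv4 20 + UDP 8 + RTP 12 by default).
    """
    packets_per_second = 1000.0 / packet_interval_ms
    header_bps = header_bytes * 8 * packets_per_second
    return codec_bps + header_bps


if __name__ == "__main__":
    codec_bps = 5600  # codec rate quoted in the text
    total = ip_bandwidth_bps(codec_bps)
    print(f"codec payload : {codec_bps / 1000:.1f} kbit/s")
    print(f"with headers  : {total / 1000:.1f} kbit/s")
    print(f"header share  : {(total - codec_bps) / total:.0%}")
```

Under these assumptions the headers account for roughly two thirds of the stream, which
is why header compression and silence suppression are worthwhile.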
VoIP trunking techniques such as TDMoIP can reduce bandwidth overhead even further
by multiplexing multiple conversations that are heading to the same destination and
wrapping them up inside the same packets. Because the packet header overhead is shared
between many simultaneous streams, TDMoIP can offer near toll quality audio with a
per-stream packet header overhead of only about 1 kbit/s.
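The effect of sharing one packet header among several conversations can be illustrated
with a simplified model. This sketch does not reproduce the actual TDMoIP framing; it
assumes a single 40-byte header per packet, a 30 ms packet interval, and an even split of
the header cost among the multiplexed streams. The function name per_stream_header_bps
is hypothetical.

```python
def per_stream_header_bps(n_streams, packet_interval_ms=30.0, header_bytes=40):
    """Header overhead per stream when n_streams share each packet header.

    Simplified model: one shared header per packet, cost divided evenly.
    """
    if n_streams < 1:
        raise ValueError("n_streams must be at least 1")
    packets_per_second = 1000.0 / packet_interval_ms
    shared_header_bps = header_bytes * 8 * packets_per_second
    return shared_header_bps / n_streams


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        overhead = per_stream_header_bps(n) / 1000
        print(f"{n:2d} streams: {overhead:.2f} kbit/s header overhead each")
```

With these assumed numbers, about ten or eleven multiplexed conversations bring the
per-stream header cost down to roughly 1 kbit/s.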
Voice peering, also called VoIP peering, refers to the forwarding of calls from one ITSP
to another ITSP directly using VoIP technology.
The call is not forwarded over the PSTN, which leads to cost savings (no call charges)
and better quality because there is no transcoding between the VoIP cloud and the PSTN,
and then back from the PSTN to the next VoIP cloud. VoIP peering may occur on a Layer
2 basis, i.e. a private network is provided, and the carriers connected to it manage peering
between one another, or on a Layer 5 basis, i.e. peering occurs on open networks, with
routing and signalling managed by a central provider.
Voice peering or VoIP peering can occur on a bilateral or multilateral basis. Bilateral
peering does not scale when many service providers seek to interconnect and peer with
one another. Standards for multilateral Layer 5 peering are under development by the
IETF working group on VoIP peering, SPEERMINT.