0
Introduction to VoIP, RTP and SIP
Archana Kesavan
Product Marketing Manager
1
About ThousandEyes
ThousandEyes delivers visibility into every network your organization relies on.
Founded by network experts; strong investor backing
Relied on for critical operations by leading enterprises
Recognized as an innovative new approach
27 of the Fortune 500
5 of the top 5 SaaS companies
4 of the top 6 US banks
2
Telephony – A Brief History
Mr. Watson – Come here – I want to see you
Manual Switchboard
3
Telephony – A Brief History
Business Office Callers
Local Exchange
PBX
Individual Callers
International Gateway Tandem Junction/Exchange
PSTN Network
4
Voice-Over-IP
• Set of protocols designed to deliver communication services over the IP network
• Analog voice is converted into data packets to be sent over the Internet
• Two phases
  • Phase 1: Signaling (SIP)
  • Phase 2: Audio transport (RTP)
5
IP Telephony or Voice-Over-IP
Local Exchange
PBX
Individual Callers
International Gateway Tandem Junction
PSTN Network
IP-PBX
VoIP Servers
Ethernet (IP Network)
Internet
SIP Trunk
Mobile Voice
Mobile Data
6
Type of VoIP Services
On-Prem
Branch Office
Branch Office
DMZ
CRM
Web
Data Center
IP-PBX
VoIP server
• All hardware and software is owned and managed by the enterprise.
• The IP-PBX and adjoining systems reside in the data center or on premises.
• Voice packets go through the LAN and WAN.
7
Type of VoIP Services
Hosted
• IP phones are owned by the enterprise
• All other equipment and software is located in the service provider data center and provided as a service
• VoIP packets travel over the Internet or dedicated WAN connectivity to the hosted site
[Figure: Enterprise A and Enterprise B, each with a data center (DMZ, CRM, Web) and branch offices, connected to the IP-PBX in the VoIP provider's data center.]
8
So how does VoIP work?
• Session Initiation Protocol (SIP)
• Prerequisite for the voice call
– RFC 3261: standard protocol (however, proprietary versions exist to force vendor lock-in)
– Application-level protocol residing above the TCP/IP stack
– Runs over TCP or UDP
– Text-based protocol, like HTTP
– Can be encrypted with TLS
– Response codes indicate the state of the request message
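Phase 1: Signaling
Call flow between VoIP Phone A, the SIP Server/Proxy, and VoIP Phone B, in order:
• SIP REGISTER: Phone A and Phone B each register with the SIP server
• SIP INVITE: Phone A to the server, which answers 100 Trying and forwards the INVITE to Phone B
• 180 Ringing: Phone B to the server, relayed to Phone A
• 200 OK: Phone B to the server, relayed to Phone A
• SIP ACK: Phone A to the server, relayed to Phone B
• AUDIO CALL
• SIP BYE: relayed through the server to the other phone
• 200 OK: relayed back through the server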
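The response codes in the call flow above (100 Trying, 180 Ringing, 200 OK) are carried on a plain-text status line, much like HTTP. The following is a minimal sketch of reading one, assuming the `SIP/2.0 <code> <reason>` form defined by RFC 3261. The function name `parse_status_line` and the class labels are illustrative choices, not part of the deck.

```python
# Response classes by first digit of the status code (RFC 3261).
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def parse_status_line(line):
    """Split a SIP status line into (code, reason, response class)."""
    version, code_text, reason = line.strip().split(" ", 2)
    if version != "SIP/2.0":
        raise ValueError(f"not a SIP/2.0 status line: {line!r}")
    code = int(code_text)
    return code, reason, RESPONSE_CLASSES[code // 100]


if __name__ == "__main__":
    for raw in ("SIP/2.0 100 Trying", "SIP/2.0 180 Ringing", "SIP/2.0 200 OK"):
        print(parse_status_line(raw))
```

A monitoring script could use the class to tell progress updates (1xx) apart from the final answer (2xx) or a failure (4xx to 6xx).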
9
So how does VoIP work?
Phase 1: Signaling
Call flow between VoIP Phone A, the SIP Server/Proxy, and VoIP Phone B, grouped into four stages:
• REGISTER: SIP REGISTER
• INVITE: SIP INVITE, 100 Trying, 180 Ringing
• CONNECT: 200 OK, SIP ACK, AUDIO CALL
• DISCONNECT: SIP BYE, 200 OK
10
So how does VoIP work?
Phase 2: Audio Data
• Real-time Transport Protocol (RTP)
– Analog voice signals are converted into data packets and sent over UDP
– Audio frames are encapsulated in RTP packets
– RTP packets are encapsulated in UDP packets
– UDP packets are encapsulated in IP packets
[Figure: RTP audio stream across the SIP network. Each packet carries an IP header, a UDP header, an RTP header, and audio frames (Frame 1, Frame 2).]
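The encapsulation described above can be sketched by building the innermost layer by hand. This is a minimal illustration, assuming the fixed 12-byte RTP header from RFC 3550 and payload type 0 (G.711 mu-law, per RFC 3551). The helper names, the SSRC value, and the placeholder payload are hypothetical. The UDP and IP headers would be added by the operating system when the bytes are sent on a UDP socket.

```python
import struct

RTP_VERSION = 2
PT_PCMU = 0  # G.711 mu-law payload type (RFC 3551)


def build_rtp_packet(seq, timestamp, ssrc, payload, payload_type=PT_PCMU):
    """Prepend a fixed 12-byte RTP header to one audio frame."""
    first_byte = RTP_VERSION << 6        # no padding, no extension, zero CSRCs
    second_byte = payload_type & 0x7F    # marker bit cleared
    header = struct.pack(
        "!BBHII",
        first_byte,
        second_byte,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + payload


def parse_rtp_header(packet):
    """Decode the fixed RTP header fields from the first 12 bytes."""
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "payload_type": b1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# 20 ms of G.711 at 8 kHz is 160 one-byte samples (placeholder bytes here).
frame = bytes(160)
packet = build_rtp_packet(seq=1, timestamp=160, ssrc=0x1234ABCD, payload=frame)

if __name__ == "__main__":
    print(len(packet), parse_rtp_header(packet))
```

The sequence number and timestamp in the header are what let the receiver detect loss and reordering and measure arrival-time variation.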
11
Key VoIP Concepts
Codecs
• How voice traffic is encoded and decoded
• Determines the quality of the VoIP conversation
• G.711, G.722, SILK
VoIP Metrics
• MOS
• Latency
• Jitter (de-jitter buffer)
• PDV
QoS
• Prioritization of VoIP traffic
• DSCP codes
– Traffic shaping, firewall and load balancer configuration
– The upper 3 bits of the 6-bit DSCP value indicate the class. Common markings: Best Effort, Assured Forwarding, Expedited Forwarding, Voice Admit
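As a rough illustration of DSCP marking, the sketch below maps a few standard code points to the byte an application would set on a socket. The code point values (EF = 46, Voice Admit = 44, AF41 = 34, Best Effort = 0) come from the relevant RFCs; the dictionary and function names are illustrative. `socket.IP_TOS` is assumed to be available, which holds on Linux and macOS but not necessarily elsewhere, and network devices may still rewrite the marking in transit.

```python
import socket

# Standard DSCP code points (6-bit values).
DSCP = {
    "BE": 0,            # Best Effort (default)
    "AF41": 34,         # Assured Forwarding class 4, low drop precedence
    "VOICE-ADMIT": 44,  # Voice Admit
    "EF": 46,           # Expedited Forwarding
}


def dscp_to_tos(dscp):
    """DSCP occupies the upper 6 bits of the former IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def class_bits(dscp):
    """Return the upper 3 bits of the DSCP value (the class)."""
    return dscp >> 3


def mark_socket(sock, dscp):
    """Ask the OS to mark outgoing packets on this socket (platform dependent)."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))


if __name__ == "__main__":
    for name, value in DSCP.items():
        print(name, value, hex(dscp_to_tos(value)), class_bits(value))
```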
12
Latency
Packet delay (from sender to receiver)
[Figure: timeline of Packets 1 to 4 showing when each was sent and when it was received; the gap between the two is that packet's latency.]
13
Jitter
Variation of the latency
[Figure: timeline of Packets 1 to 4 showing sent-at and received-at times, with the minimum and maximum latency marked.]
14
Packet Delay Variation
99.9th percentile of the packet delay variation
PDV = max latency – min latency
The de-jitter buffer should be able to accommodate the PDV.
[Figure: timeline of Packets 1 to 4 showing sent-at, received-at, and played-at times, with delayed playback and the minimum and maximum latency marked.]
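The three metrics above can be worked through on a small set of timestamps. This is a sketch under stated assumptions: the timestamps are made up, the clocks at both ends are assumed to be synchronized, jitter is taken as the mean absolute change in latency between consecutive packets (one common reading of "variation of the latency"), and PDV uses the slide's max-minus-min form rather than a percentile.

```python
from statistics import mean


def latency_stats(sent_ms, received_ms):
    """Return (mean latency, jitter, PDV) in milliseconds."""
    latencies = [r - s for s, r in zip(sent_ms, received_ms)]
    # Jitter (assumed definition): mean absolute change between
    # consecutive packet latencies.
    changes = [abs(b - a) for a, b in zip(latencies, latencies[1:])]
    jitter = mean(changes) if changes else 0.0
    # PDV as on the slide: max latency minus min latency.
    pdv = max(latencies) - min(latencies)
    return mean(latencies), jitter, pdv


# Hypothetical send and receive times for four packets sent 20 ms apart.
sent = [0, 20, 40, 60]
received = [50, 75, 95, 130]
avg_latency, jitter, pdv = latency_stats(sent, received)

if __name__ == "__main__":
    print(avg_latency, jitter, pdv)
```

With these numbers the per-packet latencies are 50, 55, 55, and 70 ms, so a de-jitter buffer would need to absorb about 20 ms of variation to play every packet.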
15
Mean Opinion Score (MOS)
E-Model (ITU-T Recommendation G.107, 1998-2014)
Based on a mathematical model in which the individual transmission parameters are transformed into different individual "impairment factors" such as codec characteristics, delay, loss ratio, discard ratio, etc., to obtain a quality metric called the R factor.

Terms of the R factor:
• Ro: basic signal-to-noise ratio
• Is: simultaneous impairment
• Id: delay impairment
• Ie-eff: equipment impairment
• A: advantage factor (expectation)

Inputs called out on the slide:
• Network latency
• De-jitter buffer size
• Ie (codec)
• Packet loss robustness (codec)
• Packet loss probability

Excerpt from G.107:

7.1 Calculation of the transmission rating factor, R
According to the equipment impairment factor method, the fundamental principle of the E-model is based on a concept given in the description of the OPINE model (see [b-ITU-T P-Sup.3]): psychological factors on the psychological scale are additive.
The result of any calculation with the E-model in a first step is a transmission rating factor R, which combines all transmission parameters relevant for the considered connection. This rating factor R is composed of:

$$R = R_o - I_s - I_d - I_{e\text{-eff}} + A \tag{7-1}$$

Ro represents in principle the basic signal-to-noise ratio, including noise sources such as circuit noise and room noise. Factor Is is a combination of all impairments which occur more or less simultaneously with the voice signal. Factor Id represents the impairments caused by delay, and the effective equipment impairment factor Ie-eff represents impairments caused by low bit-rate codecs. It also includes impairment due to randomly distributed packet losses. The advantage factor A allows for compensation of impairment factors when the user benefits from other types of access to the user. The term Ro and the Is and Id values are subdivided into further specific impairment values. The following clauses give the equations used in the E-model.
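To connect the R factor to a MOS value, the sketch below applies equation (7-1) and then the R-to-MOS conversion given in Annex B of G.107. The input values in the example are hypothetical, chosen only to show the arithmetic; a real calculation would derive Ro, Is, Id, and Ie-eff from the detailed equations in the Recommendation. The function names are illustrative.

```python
def r_factor(ro, i_s, i_d, ie_eff, a=0.0):
    """Equation (7-1): R = Ro - Is - Id - Ie-eff + A."""
    return ro - i_s - i_d - ie_eff + a


def mos_from_r(r):
    """Convert an R factor to an estimated MOS (G.107 Annex B)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


# Hypothetical impairment values, for illustration only.
r = r_factor(ro=94.0, i_s=1.0, i_d=5.0, ie_eff=10.0)
mos = mos_from_r(r)

if __name__ == "__main__":
    print(r, round(mos, 2))
```

Because the impairments are simply subtracted, added delay (Id) or a lossier codec (Ie-eff) lowers R directly, and the conversion then maps that lower R to a lower MOS.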
16
Monitoring Techniques for VoIP
Call Detail Records
• Data records that contain specific information about a call, for example timestamp and call duration
• CDRs are generated at specified triggers
• Record call quality, loss, and latency experienced
• Used for billing and law enforcement
Packet Capture
• Packet sniffer that can record every SIP and RTP transaction
• Can typically decode speech and replay it for call quality analysis
• Detects MOS and other voice metrics
• Maximum overhead
Active Monitoring
• Simulates VoIP traffic from strategic vantage points at periodic intervals
• Quickly pinpoints when and where an issue occurs
• Real-time detection of voice quality degradation
• Less overhead
17
Demo
• Dip in MOS score due to DSCP change
• https://earhhpng.share.thousandeyes.com
18
VoIP Metrics
• Average of packet delays
• 99.9th percentile of packet delay variation
• Packets dropped by the de-jitter buffer
• Packets dropped by the network
• MOS score (1-5)
• Audio codec used
• Source
• Destination
19
Thank You!

Editor's Notes

  • #4 PSTN (public switched telephone network) is the world's collection of interconnected voice-oriented public telephone networks, both commercial and government-owned. It's also referred to as the Plain Old Telephone Service (POTS).