2. SIP for geeks!
SIP [acronym] –
Session Initiation Protocol
“a very simple text-based application-
layer control protocol to create, modify
and terminate sessions such as
Internet telephony and multimedia
conferences with one or more
participants.”
geek [slang] – noun
“a computer expert or
enthusiast (a term of pride as
self-reference, but often
considered offensive when
used by outsiders.)”
3. Who am I?
PhD from Columbia University in 2006
on reliable, scalable & interoperable
Internet telephony
Student of Prof. Henning Schulzrinne
who is co-inventor of SIP, RTP, RTSP
Worked at Adobe with Dr. Henry Sinnreich
who is considered the godfather of SIP
Worked on SIP
since 1999, building prototypes, systems.
Worked at Tokbox and 6Connex
on web-based video communication
Worked on open source projects
p2p-sip, videocity, flash-videoio, restlite,
rtmplite, siprtmp
VoIP researcher
Software architect
Systems engineer
4. What will you learn?
History
Competition
IETF, RFC
Ecosystem
References
Dive-in
Syntax
Semantics
Call flow
Extensions
Scripting
Hands-on
iptel.org
jain-sipapi
39peers
siprtmp
Challenges
On Web
NAT traversal
Security
Audio quality
Walled garden
Voice
Sampling
Encoding
Transport
Motivation
What is the big picture?
How does everything fit together?
Where to look for more information?
15 min + 45 min + 30 min + 15 min + 15 min = 2 hr total
5. …
– Digitization (e.g., sampling at 8 kHz, 16 bits per sample, i.e.,
128 kb/s or 320 bytes per 20 ms)
– Real-time compression/encoding (e.g., G.729A at 8 kb/s)
– Transport to remote IP address and port number over UDP
(Why not TCP?)
– Processing on receiver side is the reverse
Audio packet transfer
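The figures above can be checked with a little arithmetic. This is a minimal sketch; the function names are illustrative and not from the slides.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample):
    """Raw (uncompressed) bit rate in kb/s."""
    return sample_rate_hz * bits_per_sample / 1000

def payload_bytes(bitrate_kbps, packet_ms):
    """Bytes of audio carried in one packet covering packet_ms milliseconds."""
    return bitrate_kbps * 1000 / 8 * packet_ms / 1000

raw = pcm_bitrate_kbps(8000, 16)      # 8 kHz, 16 bits per sample
print(raw)                            # 128.0 kb/s
print(payload_bytes(raw, 20))         # 320.0 bytes per 20 ms
print(payload_bytes(8, 20))           # G.729A at 8 kb/s: 20.0 bytes per 20 ms
```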
6. [Figure: sampled waveform quantized to levels between +127 and -127 and encoded as the bit stream 10101111…01101101]
Sample at twice the
highest voice frequency
2 x 4000=8000 Hz
(interval of 125 µsec)
Round off samples to one
of 256 levels (introduces
noise)
Encode each quantized
sample into 8 bit code word
PCM: 8000 x 8 bits = 64 kb/s
Other techniques (differential
coding, linear prediction)
2.4 kb/s to 64 kb/s
What is narrowband/wideband?
What are frame vs sample-based
codecs?
Sampling, Quantization, Encoding
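The rounding step can be sketched with the continuous µ-law companding formula followed by uniform rounding to 256 levels. This is an illustration only, not the bit-exact G.711 encoder; the function names are assumptions.

```python
import math

MU = 255  # companding constant used by µ-law

def mulaw_compress(x):
    """Map a sample in [-1, 1] to [-1, 1] on a logarithmic scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def quantize_8bit(y):
    """Round a value in [-1, 1] to one of 256 levels (0..255)."""
    return min(255, int((y + 1) / 2 * 256))

def encode(x):
    """One 8-bit code word per sample."""
    return quantize_8bit(mulaw_compress(x))

for sample in (-1.0, 0.0, 0.01, 1.0):
    print(sample, encode(sample))

print(8000 * 8)   # PCM bit rate in bits per second: 64000
```

Quiet samples get proportionally more of the 256 levels than loud ones, which is why 8 bits per sample are enough for telephone-quality speech.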
7. Voice codecs
Codec     Bitrate (kb/s)   Use cases
G.711     64               Phone, PSTN
G.729     8                VoIP, carriers
G.723.1   5.3/6.3          VoIP, modem
G.722.2   6-24             Wideband VoIP, mobile
Speex     2-44             Good quality, free, Flash Player
iLBC      13.3/15.2        Free, low bit-rate
SILK      6-40             Skype, open source
iSAC      10-32            Global IP Sound (GIPS), Gtalk
8. Unreliable UDP
a) Packet loss
b) Out-of-order (very rarely)
c) Jitter (delay variation)
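[Figure: timeline; the sender transmits packets 1 2 3 4 5 6 7 and the receiver gets 1 2 3 5 7 6, showing (a) loss of packet 4 and (b) packets 6 and 7 out of order]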
Problems with UDP
9. Playout buffer
Capture and playout loop:
while (true) {
    buf = read(au, 20ms); // blocks
    if (!silence)
        sendto(remote, buf);
    …
    buf = get(20 ms);
    write(au, buf);
}
Receive loop:
while (true) {
    buf = recvfrom(...); // blocks
    put(buf);
}
[Figure: microphone → read → 20 ms packet → sendto(remote IP:port); received packet → recvfrom() → put → playout buffer → get → 20 ms packet → write → speaker]
10. Playout buffer
– What is its purpose?
– What should be the buffer size?
– How to do adaptive delay adjustment?
– Why do you need sequence number?
[Figure: the sender transmits packets 1 2 3 4 5 6 7 8 9 0 1 2 3 4; the receiver gets 1 2 3 5 7 6 8 9 0 2 3 4; the playout buffer contents are shown at each step as packets arrive and are played out]
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.54.9716
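A fixed-delay playout buffer keyed by sequence number can be sketched as follows. This is a minimal illustration under stated assumptions: the class and parameter names are hypothetical, the delay is fixed rather than adaptive, and sequence-number wrap-around is ignored.

```python
class PlayoutBuffer:
    """Fixed-delay playout buffer keyed by sequence number (a sketch)."""

    def __init__(self, delay_packets=2):
        self.delay = delay_packets   # packets to collect before playout starts
        self.packets = {}            # sequence number -> payload
        self.next_seq = None         # next sequence number to play

    def put(self, seq, payload):
        if self.next_seq is not None and seq < self.next_seq:
            return                   # arrived after its playout time: drop
        self.packets[seq] = payload

    def get(self):
        """Called every 20 ms; returns a payload, or None for a gap."""
        if self.next_seq is None:
            if len(self.packets) < self.delay:
                return None          # still filling the initial delay
            self.next_seq = min(self.packets)
        payload = self.packets.pop(self.next_seq, None)  # None = lost or late
        self.next_seq += 1
        return payload

# Arrival order from the slide: 1 2 3 5 7 6 (packet 4 is lost).
buf = PlayoutBuffer(delay_packets=2)
buf.put(1, b"p1")
buf.put(2, b"p2")
played = []
for seq in (3, 5, 7, 6):
    played.append(buf.get())
    buf.put(seq, b"p%d" % seq)
played += [buf.get(), buf.get(), buf.get()]
print(played)
```

The sequence number is what lets the buffer put packet 6 back before packet 7 and notice that packet 4 never arrived.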
11. [Figure: sender and receiver timelines t1 … t9; packets 1 2 3 4 are sent, then silence …, then packets 5 6 7]
Playout time vs packet loss detection
Timestamp vs sequence number
• Silence suppression
• Variable length packets
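The two fields answer different questions, which a small sketch makes concrete. It assumes fixed 20 ms packets at 8 kHz (160 timestamp units each); with variable-length packets the expected timestamp step would have to come from the previous packet's duration instead. The names are illustrative.

```python
SAMPLES_PER_PACKET = 160   # 20 ms at 8 kHz (assumed packetization)

def classify(prev, cur):
    """prev and cur are (sequence number, timestamp) of consecutive arrivals."""
    seq_gap = cur[0] - prev[0]
    ts_gap = cur[1] - prev[1]
    if seq_gap > 1:
        return "lost %d packet(s)" % (seq_gap - 1)   # sequence number detects loss
    if ts_gap > SAMPLES_PER_PACKET:
        return "silence gap"                         # timestamp sets playout time
    return "continuous"

print(classify((1, 160), (2, 320)))     # continuous
print(classify((3, 480), (5, 800)))     # lost 1 packet(s)
print(classify((4, 640), (5, 1600)))    # silence gap
```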
12. Real-time Transport Protocol (RTP)
RTP: media transport
RTCP: QoS feedback
msg = IP header | UDP header | RTP header | Encoded audio
sendto(…, msg, …)
recvfrom(…, msg, …)
RTP header:
V, P, X, CC (8 bits) | M, Payload type (8 bits) | Sequence number (16 bits)
Timestamp (proportional to sampling time)
Source identifier (SSRC)
Optional contributors’ list (CSRC)
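The fixed part of the RTP header is 12 bytes and can be packed with Python's struct module. This is a minimal sketch assuming version 2 with no padding, no header extension and no contributors' list; the function names are illustrative.

```python
import struct

def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (V=2, P=0, X=0, CC=0)."""
    byte0 = 2 << 6                                       # version in the top 2 bits
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)   # M bit + 7-bit payload type
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)

def unpack_rtp_header(data):
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {"version": byte0 >> 6, "cc": byte0 & 0x0F,
            "marker": bool(byte1 >> 7), "payload_type": byte1 & 0x7F,
            "seq": seq, "timestamp": timestamp, "ssrc": ssrc}

header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x1234)
print(len(header), header.hex())
print(unpack_rtp_header(header))
```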
14. RTP-based conference
[Figure: a mixer (Σ) and a transcoder (⇔) handling µ-law and G.729 streams]
Mixer mixes multiple streams, and puts the rtp.ssrc values of contributors in the mixed packet as rtp.csrc.
Transcoder converts one encoding to another, typically to accommodate heterogeneous bandwidth links.
15. RTP FAQ
– Who uses RTP?
– Is RTP a transport or application protocol?
– Is RTP secure? What are SRTP, ZRTP?
– Is RTP header a big overhead?
– Is RTCP needed for two-party voice calls?
http://www.cs.columbia.edu/~hgs/rtp/
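For the header overhead question, a rough calculation helps. The sketch assumes IPv4 without options (20 bytes), UDP (8 bytes) and the fixed RTP header (12 bytes), with 20 ms packets; link-layer framing is ignored and the function name is illustrative.

```python
HEADER_BYTES = 20 + 8 + 12   # IPv4 + UDP + RTP, no options or CSRC entries

def overhead(bitrate_kbps, packet_ms=20):
    """Return (payload bytes, header fraction of packet, kb/s on the wire)."""
    payload = bitrate_kbps * 1000 // 8 * packet_ms // 1000
    total = payload + HEADER_BYTES
    return payload, HEADER_BYTES / total, total * 8 / packet_ms

print(overhead(64))   # G.711: 160-byte payload, 20% headers, 80 kb/s
print(overhead(8))    # G.729: 20-byte payload, about 67% headers, 24 kb/s
```

The lower the codec bit rate, the larger the share of each packet that is header.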
16. Why do you need signaling?
Alice
128.59.19.194
Bob
202.16.49.27
Sam
154.28.32.112
Henry
125.33.2.81
Bob=>192.1.2.3
Sam
154.28.32.112
Henry=>125.33.2.81
Alice=>128.59.19.194
Where is Alice?
128.59.19.194
INVITE for a call using µ-law and G.729 at 202.16.49.27:8000
OK using µ-law at 128.59.19.194
1. Locate destination user
2. Negotiate session parameters
20. columbia.edu yahoo.com
home.com
office.com
Alice
Bob
128.59.19.194
Lookup
Two stage lookup similar to email:
DNS: to locate server for a domain
Database: locate user within a domain
Jane
128.59.19.61
$ dig -t naptr columbia.edu
columbia.edu. 3600 IN NAPTR 1 0 "s" "SIP+D2U" "" _sip._udp.columbia.edu.
columbia.edu. 3600 IN NAPTR 2 0 "s" "SIP+D2T" "" _sip._tcp.columbia.edu.
$ dig -t srv _sip._udp.columbia.edu
_sip._udp.columbia.edu. 3600 IN SRV 10 10 5060 cocoa.cc.columbia.edu.
_sip._udp.columbia.edu. 3600 IN SRV 10 10 5060 eclair.cc.columbia.edu.
$ dig -t a cocoa.cc.columbia.edu
cocoa.cc.columbia.edu. 3600 IN A 128.59.59.199
How to do failover and load sharing?
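One way to get failover and load sharing out of SRV records is to try lower priority values first and pick randomly by weight within a priority. The sketch below is a simplification of the SRV selection rules (a weight of 0 is treated as 1); the backup record and the function name are hypothetical.

```python
import random

def order_srv(records, rng=random):
    """records: list of (priority, weight, port, target). Returns the order to try."""
    ordered = []
    for priority in sorted({r[0] for r in records}):      # failover: low priority first
        group = [r for r in records if r[0] == priority]
        while group:
            weights = [r[1] or 1 for r in group]          # load sharing by weight
            pick = rng.choices(group, weights=weights)[0]
            ordered.append(pick)
            group.remove(pick)
    return ordered

records = [
    (10, 10, 5060, "cocoa.cc.columbia.edu."),
    (10, 10, 5060, "eclair.cc.columbia.edu."),
    (20, 0, 5060, "backup.example."),     # hypothetical backup, not in the slide
]
ordered = order_srv(records)
for r in ordered:
    print(r)
```

With equal weights, cocoa and eclair each come first about half the time; the backup is tried only after both fail.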
21. SIP request and response
INVITE sip:alice@home.com SIP/2.0
Via: SIP/2.0/UDP 202.16.49.27:5060;branch=z9hG4bK74bf9
Max-Forwards: 70
From: "Bob" <sip:bob@office.com>;tag=9fxced76s1
To: "Alice" <sip:alice@home.com>
Call-ID: 384827@202.16.49.27
CSeq: 1 INVITE
Contact: <sip:bob@202.16.49.27>
Subject: How are you?
Content-Type: application/sdp
Content-Length: 151
...
SIP/2.0 200 OK
Via: SIP/2.0/UDP 202.16.49.27:5060;branch=z9hG4bK74bf9
From: "Bob" <sip:bob@office.com>;tag=9fxced76s1
To: "Alice" <sip:alice@home.com>;tag=8321234356
Call-ID: 384827@202.16.49.27
CSeq: 1 INVITE
Contact: <sip:alice@128.59.19.194>
Content-Type: application/sdp
Content-Length: 147
...
Request
Response
What are WS and CRLF?
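Because SIP is text-based, a message can be split into start line, headers and body with plain string operations. This is a minimal sketch, not a full parser: it ignores folded (continued) header lines and compact header names, and the function name is an assumption.

```python
def parse_sip(message):
    """Split a SIP message into (start line, headers, body)."""
    head, _, body = message.partition("\r\n\r\n")   # an empty line ends the headers
    lines = head.split("\r\n")                      # each line ends with CRLF
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return lines[0], headers, body

request = "\r\n".join([
    "INVITE sip:alice@home.com SIP/2.0",
    "Via: SIP/2.0/UDP 202.16.49.27:5060;branch=z9hG4bK74bf9",
    'From: "Bob" <sip:bob@office.com>;tag=9fxced76s1',
    "Call-ID: 384827@202.16.49.27",
    "CSeq: 1 INVITE",
    "Content-Length: 0",
    "",
    "",
])
start, headers, body = parse_sip(request)
print(start)
print(headers["via"], headers["cseq"])
```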
22. Requests
Purpose                                     Method      Reference
Establish a SIP session with offer/answer   INVITE      RFC 3261
Acknowledge a response to INVITE            ACK         RFC 3261
Cancel a pending request                    CANCEL      RFC 3261
Terminate an existing SIP session           BYE         RFC 3261
Query the capabilities of server or UA      OPTIONS     RFC 3261
Temporarily bind AoR to device URI          REGISTER    RFC 3261
Establish a session to receive updates      SUBSCRIBE   RFC 3265
Deliver updates in a subscribed session     NOTIFY      RFC 3265
Upload status to a server                   PUBLISH     RFC 3903
Ask another UA to act upon a URI            REFER       RFC 3515
Transport an instant message (IM)           MESSAGE     RFC 3428
Update session state information            UPDATE      RFC 3311
Acknowledge a provisional response          PRACK       RFC 3262
Transport mid-call signaling data           INFO        RFC 2976
23. Responses
Class            Code   Examples
Provisional      1xx    100 Trying
                        180 Ringing
                        183 Session Progress
Success          2xx    200 OK
                        202 Accepted
Redirection      3xx    300 Multiple Choices
                        302 Moved Temporarily
Client error     4xx    400 Bad Request
                        401 Unauthorized
                        403 Forbidden
                        404 Not Found
                        486 Busy Here
Server error     5xx    501 Not Implemented
                        503 Service Unavailable
Global failure   6xx    600 Busy Everywhere
                        603 Decline
24. Header examples
Name           Purpose                                  Reference
Alert-Info     Customized ringing                       RFC 3261
Record-Route   Proxy will stay in SIP signaling path    RFC 3261
Supported      List of supported extensions             RFC 3261
Replaces       Call control, transfer, pickup           RFC 3891
Join           Conferencing                             RFC 3911
Reason         Reason for failure response              RFC 3326
Event          Associated event package                 RFC 3265
Referred-By    Third-party who initiated the request    RFC 3892
25. Example call flow
SIP trapezoid: UA (bob@office.com) → Proxy (office.com) → Proxy (home.com) → UA (alice@home.com)
INVITE: forwarded over all three hops
100 Trying: returned on the first two hops
180 Ringing: relayed back over all three hops
200 OK: relayed back over all three hops
ACK
RTP media session
BYE
200 OK
What is an outbound proxy?
26. Transport
• UDP
– Most common
– Low state overhead
– But small max packet size
• TCP:
– Use with SSL
– Large message bodies
– NAT/firewall traversal
– Connection setup overhead
– Head of line blocking for
trunks (use SCTP instead)
• Transport reliability
– Request retransmissions
– Exponential back-off
– INVITE vs non-INVITE
Why is ACK needed for INVITE response?
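The exponential back-off can be made concrete with the default RFC 3261 timer values T1 = 0.5 s and T2 = 4 s. The sketch assumes UDP transport and that no response at all is received; the function name is illustrative.

```python
T1, T2 = 0.5, 4.0   # seconds; default timer values in RFC 3261

def retransmit_times(invite):
    """Times (seconds after the first send) at which the request is resent."""
    times, interval, now = [], T1, 0.0
    while now + interval < 64 * T1:      # the transaction times out at 64*T1
        now += interval
        times.append(now)
        # INVITE keeps doubling; other requests stop doubling at T2
        interval = interval * 2 if invite else min(interval * 2, T2)
    return times

print(retransmit_times(invite=True))
print(retransmit_times(invite=False))
```

An INVITE is resent at ever-doubling intervals, while other requests settle at one retransmission every 4 s until the 32 s timeout.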
27. Message routing
Response follows the reverse request path
– Via header in SIP message records the request path
Request routing decision at each hop
– Usually direct end-to-end transport after initial request
– Forcing request path: Record-Route and Route headers
– Request forking: parallel vs sequential (use q-value in Contact)
– Caller and callee info: further govern request routing
[Figure: an INVITE from alice@home.com is routed through bob@example.com and bob@yahoo.com to bob@ip2.yahoo.com; each hop adds a Via header (a.home.com, then b.example.com, then c.yahoo.com) and the response carries the same Via list back in reverse; candidate contacts have q=1.0, q=0.7 and q=0.2; responses shown are 302 moved, 486 busy and 200 OK]
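How the Via list steers the response back can be sketched with a plain list, using the host names from the slide. Real Via headers also carry the transport and a branch parameter, which are left out here; the function names are illustrative.

```python
def forward_request(via_stack, hop):
    """Each element that sends or forwards the request puts its own Via on top."""
    return [hop] + via_stack

def response_path(via_stack):
    """The response visits the elements named in the Via list, top first."""
    path = []
    while via_stack:
        path.append(via_stack[0])    # deliver to the element named in the top Via
        via_stack = via_stack[1:]    # that element strips its own Via and forwards
    return path

via = []
for hop in ("a.home.com", "b.example.com", "c.yahoo.com"):
    via = forward_request(via, hop)
print(via)                  # top-most Via is the most recent hop
print(response_path(via))   # the response retraces the request path in reverse
```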
28. Example call setup
[Figure: Alice calls Bob across the domains @school.edu, @home.com, @yahoo.com, @residence.net, @visiting.com and @lab.school.edu. Numbered steps: (1) invite, (2) moved, (3) invite, (4) moved, (5), (6) shown three times, (7) unavailable, (8), (9) ok, (10), (11) cancel, (12) ok, (13)]
29. Elements
• SIP user agent
– IP phone, PC, conference bridge,…
• SIP redirect server
– returns new location for requests
• SIP stateless proxy
– routes call requests
• SIP (forking) stateful proxy
– routes call requests
• SIP registrar
– accepts name to address mapping
• Location server
– maintains name to address mapping
• Maintaining state
– stateless: each request and each response handled independently
• Fast load balancing proxies; robust
– (transaction) stateful: remember a whole request/response transaction
• Enterprise servers, . . .
– call stateful: remember a call from beginning to end
• Billing, NAT traversal, . . .
Typically implemented in a single software package or box
Other entities: outbound proxy, back-to-back user agent (B2BUA), application-level gateway (ALG), …
30. SIP FAQ
– What is a SIP transaction?
– What is a SIP dialog?
– What is early media?
– What is re-INVITE?
– Why do you need record-route?
– What is request forking?
http://www.cs.columbia.edu/sip/
31. Session negotiation
Bob (202.16.49.27) → Alice (128.59.19.194): INVITE alice@home.com
I can support µ-law and G.729
Send me audio at 202.16.49.27:6780
Alice → Bob: OK; I can support µ-law
Send me audio at 128.59.19.194:8000
Bob → Alice: ACK
RTP from Bob to Alice: to port 8000
RTP from Alice to Bob: to port 6780
How to modify media session?
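The negotiation above amounts to intersecting codec lists and exchanging media addresses. A minimal sketch follows; the function and key names are hypothetical, and PCMU is the RTP name for µ-law.

```python
def make_answer(offer_codecs, offer_addr, my_codecs, my_addr):
    """Answerer's view: shared codecs, where to send media, where to receive it."""
    common = [c for c in offer_codecs if c in my_codecs]   # keep the offerer's order
    if not common:
        return None                  # no shared codec: the session cannot be set up
    return {"codecs": common, "send_to": offer_addr, "receive_at": my_addr}

# Bob offers µ-law (PCMU) and G.729; Alice supports only µ-law.
answer = make_answer(["PCMU", "G729"], ("202.16.49.27", 6780),
                     ["PCMU"], ("128.59.19.194", 8000))
print(answer)
```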
32. Session Description Protocol (SDP)
INVITE sip:alice@home.com SIP/2.0
...
Content-Type: application/sdp
Content-Length: 151

v=0
o=bob 26172 27162 IN IP4 202.16.49.27
s=-
c=IN IP4 202.16.49.27
t=0 0
m=audio 6780 RTP/AVP 0 8 5 97 98
a=rtpmap:97 iLBC/8000
a=rtpmap:98 telephone-event/8000
m=video 6790 RTP/AVP 31
What is offer/answer?
What is telephone-event?
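The m= and a=rtpmap lines can be read with a few lines of string handling. This is a sketch, not a complete SDP parser; the static payload type names come from the RTP audio/video profile, and the function name is an assumption.

```python
STATIC_TYPES = {"0": "PCMU/8000", "5": "DVI4/8000",
                "8": "PCMA/8000", "31": "H261/90000"}

def parse_media(sdp):
    """Return one dict per m= line with its port and payload formats."""
    media = []
    for line in sdp.splitlines():
        kind, _, value = line.partition("=")
        if kind == "m":
            name, port, proto, *formats = value.split()
            media.append({"media": name, "port": int(port), "proto": proto,
                          "formats": {f: STATIC_TYPES.get(f) for f in formats}})
        elif kind == "a" and value.startswith("rtpmap:") and media:
            pt, encoding = value[len("rtpmap:"):].split(None, 1)
            media[-1]["formats"][pt] = encoding   # dynamic type named by rtpmap
    return media

sdp = """v=0
o=bob 26172 27162 IN IP4 202.16.49.27
s=-
c=IN IP4 202.16.49.27
t=0 0
m=audio 6780 RTP/AVP 0 8 5 97 98
a=rtpmap:97 iLBC/8000
a=rtpmap:98 telephone-event/8000
m=video 6790 RTP/AVP 31"""
for m in parse_media(sdp):
    print(m)
```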
33. SIP is…, SIP is not …
• Core protocol for establishing sessions
• Allows transport of session description
• Allows change of parameters in mid-session
• Terminate session
• NOT for distribution of multimedia data
• NOT suitable for media gateway control
• . . .
SIP applications typically fall into the following categories:
setting up VoIP calls
setting up multimedia conferences
event notification => IM and presence
text and general messaging
signaling transport
34. What is the real value in SIP?
Open system
Advanced services
35. Telephony
Call routing
speed dial, call forwarding,
“follow me”, filtering/blocking
(in/out), do-not-disturb,
distinctive ringing,…
Call handling
auto-answer, auto-attendant,
voice-mail, …
Multi-party
call waiting, call transfer
(blind/consultative),
conference call, park, pickup,
music-on-hold, monitoring, …
What the phone has is not enough!
Internet
Presence enabled
place call only if callee is
available, invite participants
when all are online and not
busy, …
Unified messaging
receive email, IM alert for new
voice mail, or when someone
joins your conference, …
Web enabled
click-to-call, web conference,
view conference status and
voice-mails on web, …
Programmable services
36. Programming services
If somebody is calling for the third time, allow mobile.
Try office and home in parallel; if that fails, try home.
Allow call to mobile if I’ve talked to the person before.
If on tele-marketing list, forward to dial-a-joke.
Try office during the day, and mobile in the evening.
…
37. Where do the services reside?
Make call
when boss is
online …
B2BUA
Double ringing
sound when
boss calls…
Endpoint
Proxy/registrar
Endpoint
Service control in endpoint vs network?
Forward to office phone
during day, and home
phone during evening…
Enter your
authentication
PIN for billing…
Use finger for
locating user…
38. Endpoint call control
• Language for End System Services (LESS)
• Direct user interaction, direct media control
• Handle converged information, e.g., call, presence, email
Example: call a friend when he comes online
<less name="online_call" require="generic presence ui">
<notification status="online" priority="0.5">
<address-switch field="origin">
<address is="alice@home.com">
<call />
<alert sound="ring.au" text="Calling …" />
</address>
</address-switch>
</notification>
</less>
41. B2BUA and 3pcc
Back-to-back UA
– Incoming call
triggers outgoing
call
Services
– Calling card
– Anonymizer
INVITE
A
B
C
SIP
SIP
OK (SDP1)
ACK
INVITE (SDP1)
OK (SDP2)
ACK
INVITE (SDP2)
OK
ACK
43. VoiceXML
[Figure: a telephone reaches a voice gateway over the PSTN, and an Internet user reaches a web server over the Internet. The gateway provides (1) voice and telephony functions and (2) a VoiceXML browser. The web server runs the service logic (CGI, servlet, JSP) and serves VXML to the gateway's VXML browser and HTML to the Internet user.]
IVR platform
• Voice and telephony functions (ASR, TTS, DTMF)
• Service logic (application specific)
44. VoiceXML contd.
<form action="url">
Enter your Id:
<input name="id">
<input type="submit">
</form>
<form>
<field name="id">
<prompt>
Your ID, please.
</prompt>
</field>
<block>
<submit next="url"/>
</block>
</form>
Telephony, speech synthesis or audio output, user input and
grammar, program flow, variable and properties, error
handling, …
45. Interworking with telephone
• Translating audio (µ-law/A-law)
• Translating signaling (PRI/T1,ISUP)
– Overlap signaling
– Advanced features in SIP are lost in PSTN
• Translating identifiers (phone number)
• Determining transition points
[Figure: telephone subscriber (+1-415-123-4567) → telephone network → SIP/PSTN gateway → SIP server → IP endpoint (sip:bob@home.com)]
IP to telephone
Static mapping
– 1-212854xxxx=>@gw1.columbia.edu
Gateway information is dynamic:
– Overlapping networks, multiple providers,
Load balancing
Telephony routing over IP (TRIP)
– Route advertisement, can be
implemented in outbound proxy, suitable
for current hierarchical network
+1 @service.mci.com at 4¢/min
+1212 @nyc.gw.com at 1¢/min
+1212939 @itgw1.columbia.edu free
Telephone to IP
Gateway knows the SIP server
– <sip:4567@gateway2.example.com;user=phone>
ENUM – E.164 numbering (using DNS)
– +1 212 9397042 => 2.4.0.7.9.3.9.2.1.2.1.e164.arpa => sip:hgs@cs.columbia.edu
– Suitable for relatively “static” contacts
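Both directions of number translation can be sketched briefly. The ENUM mapping reverses the digits under e164.arpa as in the example above. The gateway choice uses a longest-prefix match over the three advertised routes, which is an assumed selection rule for illustration and not something TRIP itself prescribes. Function names are hypothetical.

```python
def enum_domain(e164):
    """Turn an E.164 number into the DNS name queried by ENUM."""
    digits = [c for c in e164 if c.isdigit()]
    return ".".join(reversed(digits)) + ".e164.arpa"

# Routes from the slide: prefix -> gateway domain
GATEWAYS = {"+1": "service.mci.com",
            "+1212": "nyc.gw.com",
            "+1212939": "itgw1.columbia.edu"}

def pick_gateway(number):
    """Choose the gateway with the longest matching prefix (assumed policy)."""
    digits = "+" + "".join(c for c in number if c.isdigit())
    matches = [p for p in GATEWAYS if digits.startswith(p)]
    return GATEWAYS[max(matches, key=len)] if matches else None

print(enum_domain("+1 212 9397042"))     # 2.4.0.7.9.3.9.2.1.2.1.e164.arpa
print(pick_gateway("+1 212 939 7042"))   # itgw1.columbia.edu
print(pick_gateway("+1 415 123 4567"))   # service.mci.com
```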
46. Summary of services
• Call forwarding: basic INVITE behavior
• Call transfer: REFER method
• Call hold: set media to 0.0.0.0
• Caller id: From, plus extensions
• DTMF carriage: carry as RTP (RFC 2833)
• Calling card: B2BUA + voice server
• Voicemail: UA, proxy, media server
• Programming: CGI, CPL, servlet, LESS,
CCXML, VoiceXML, SECE, …
48. Hands-on exercises
• Experiment with iptel.org to understand
SIP message format
• Walk-through of JAIN SIP API reference
implementation
• Programming server with SIP express
router (SER)
• Walk-through of 39peers Python stack
• Experiment with siprtmp for Flash-to-SIP
call
52. Browser Cloud
HTML
× Device capture
× Real-time codecs
× E2E UDP media
Flash Player
× H.264 Video encoder
× Echo cancellation
× UDP media
× Server socket
Hosted & elastic
– Utility billing
– Programmable IVR
– Conferencing
– Recording/playback
– Accounting
– Tracking
Phone
– Voice, SMS, …
53. Peer-to-peer ≠ cloud computing
• Self management
• Free resource sharing
• No central co-ordination
• …
• Self management
• Utility computing
• Central co-ordination
• …
managed
54. When to do P2P?
if
– most of the peers do not trust each other,
AND
– There is no incentive to help peers
then
– P2P does not evolve naturally to work
See http://p2p-sip.blogspot.com/2009/10/security-in-p2p-sip.html
55. NAT traversal
Problem: Solutions:
Smart servers
– SER allows detecting nodes
behind a NAT
– Use application level gateway and
media-proxy
SIP Signaling
– Symmetric response routing for
UDP (rport)
– Connection reuse for TCP/TLS
(sip-outbound)
Media
– STUN: Simple traversal of UDP
through NAT
– TURN: Traversal using relay NAT
– ICE: Interactive connectivity
establishment
L=10.1.2.3:5060
REGISTER alice@iptel.org
Contact: sip:alice@10.1.2.3:5060
. . .
E=128.59.19.194:8123
iptel.org
INVITE alice@
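The problem in the figure is that the registered Contact (L) is a private address, while the server sees the packet coming from the NAT's external address (E). A server-side heuristic for spotting this can be sketched as below. It only illustrates the idea and is not how SER or any particular server implements it; the function names are hypothetical.

```python
import ipaddress

def behind_nat(contact_host, source_host):
    """Heuristic: Contact is a private address or differs from the packet source."""
    contact = ipaddress.ip_address(contact_host)
    return contact.is_private or contact != ipaddress.ip_address(source_host)

def reachable_contact(contact_host, contact_port, source_host, source_port):
    """Address to use for later requests such as INVITE."""
    if behind_nat(contact_host, source_host):
        return source_host, source_port   # send to where the REGISTER came from
    return contact_host, contact_port

# L = 10.1.2.3:5060 (in Contact), E = 128.59.19.194:8123 (seen by the server)
print(behind_nat("10.1.2.3", "128.59.19.194"))
print(reachable_contact("10.1.2.3", 5060, "128.59.19.194", 8123))
```

Sending to the observed source address only works while the NAT binding stays open, which is why keep-alives or connection reuse are needed as well.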
58. SIP is just a tool!
How do you use it?
Open house vs walled garden
60. IETF’s SIP
End-to-end principle
Inspired by Internet & web
One task for one protocol
Protocol stack (bottom to top): IP and lower layers; Transport Layer; SIP, SDP, RTP, RTCP; Codecs; Terminal Control/Devices
ITU-T’s H.323
Managed “services”
Legacy of telecom network
Complex integrated spec
Protocol stack (bottom to top): IP and lower layers; TCP, UDP; TPKT; Q.931, H.245, RAS, RTP, RTCP; Codecs; Terminal Control/Devices
Jabber/XMPP
Jingle for session initiation
Re-uses many SIP features
Proprietary
Adobe’s RTMFP, Skype, Yahoo, MSN, Cisco’s Skinny, …
Winner among VoIP carriers, mobile operators, and digital voice providers
61. Brief history of SIP
1996: avt to focus on real-time transmission over UDP
1998: mmusic to focus on Internet conferencing
1999: First SIP proposed standard RFC 2543
by M. Handley, H. Schulzrinne, E. Schooler, J. Rosenberg
1999: sip to focus on core development of SIP
2000: iptel to focus on routing and call processing
2000: enum to focus on DNS based phone numbers
2002: New SIP proposed standard RFC 3261, added authors
2002: sipping to focus on applications/extensions to SIP
2004: simple to focus on SIP-based IM/presence
2005: xcon to focus on centralized conferences
2005: behave to focus on NAT traversal for SIP, RTP
2006: p2psip to focus on peer-to-peer rendezvous for SIP
62. Specification in 140+ documents
1. RFC 3261: SIP: Session Initiation Protocol
2. RFC 4566: SDP: Session Description Protocol
3. RFC 3550: RTP: a transport protocol for real-time applications
4. RFC 3264: An offer/answer model with SDP
5. RFC 3840: Indicating UA capabilities in SIP
6. RFC 3263: Locating SIP servers
7. RFC 4474: Enhancements for authenticated identity management in SIP
8. RFC 3265: SIP-specific event notification
9. RFC 3856: A presence event-package for SIP
10. RFC 3863: Presence information data format (PIDF)
11. RFC 3428: SIP extension for instant messaging
12. RFC 3581: An extension of SIP for symmetric response routing
13. RFC 5245: Interactive connectivity establishment (ICE)
14. RFC 5389: Session traversal utilities for NAT (STUN)
15. RFC 5766: Traversal using relays around NAT (TURN)
RFC 5638: Simple SIP usage scenarios for applications in the endpoints
RFC 5411: A hitchhiker’s guide to SIP
Basic
Advanced
Presence/IM
NAT traversal
63. Why was SIP invented?
Invented as a “rendezvous” function
To locate another SIP device
To locate a SIP server to find another SIP device
To establish (separate) media session
To modify existing session
To express SIP device’s capabilities
To request 3rd-party call control
To find status, capability, availability of another SIP device or user
To subscribe to receive future updates of status/availability
To exchange mid-session signaling data
To exchange short instant messages
To support provider’s closed network of “managed” services