SIP for geeks


  • RFC 3550, 3551
  • SIP for geeks

    1. 1. Kundan Singh Nov 2010 SIP for geeks!
    2. 2. SIP for geeks! SIP [acronym] – Session Initiation Protocol “a very simple text-based application-layer control protocol to create, modify and terminate sessions such as Internet telephony and multimedia conferences with one or more participants.” geek [slang] – noun “a computer expert or enthusiast (a term of pride as self-reference, but often considered offensive when used by outsiders.)”
    3. 3. Who am I? VoIP researcher, software architect, systems engineer. PhD from Columbia University in 2006 on reliable, scalable and interoperable Internet telephony. Student of Prof. Henning Schulzrinne, who is co-inventor of SIP, RTP and RTSP. Worked at Adobe with Dr. Henry Sinnreich, who is considered the godfather of SIP. Worked on SIP since 1999, building prototypes and systems. Worked at Tokbox and 6Connex on web-based video communication. Worked on open source projects p2p-sip, videocity, flash-videoio, restlite, rtmplite, siprtmp.
    4. 4. What will you learn? Motivation: What is the big picture? How does everything fit together? Where to look for more information? Topics: History (Competition, IETF, RFC, Ecosystem, References); Dive-in (Syntax, Semantics, Call flow, Extensions, Scripting); Hands-on (jain-sipapi, 39peers, siprtmp); Challenges (On Web, NAT traversal, Security, Audio quality, Walled garden); Voice (Sampling, Encoding, Transport). Time: 15 min + 45 min + 30 min + 15 min + 15 min = 2 hr total.
    5. 5. Audio packet transfer – Digitization (e.g., sampling at 8 kHz, 16 bits per sample, i.e., 128 kb/s or 320 bytes per 20 ms) – Real-time compression/encoding (e.g., G.729A at 8 kb/s) – Transport to remote IP address and port number over UDP (Why not TCP?) – Processing on receiver side is the reverse
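The figures on this slide follow from simple arithmetic. A minimal sketch, assuming 20 ms packets as on the slide; the function names are illustrative only:

```python
def pcm_frame_size(sample_rate_hz, bits_per_sample, frame_ms):
    """Return (bitrate in kb/s, bytes per frame) for uncompressed audio."""
    bitrate_kbps = sample_rate_hz * bits_per_sample / 1000
    bytes_per_frame = sample_rate_hz * bits_per_sample // 8 * frame_ms // 1000
    return bitrate_kbps, bytes_per_frame


def encoded_frame_bytes(bitrate_bps, frame_ms):
    """Bytes produced per frame by a constant-bitrate codec."""
    return bitrate_bps * frame_ms // 8000


# 8 kHz sampling, 16 bits per sample, 20 ms packets
print(pcm_frame_size(8000, 16, 20))
# G.729A at 8 kb/s, 20 ms packets
print(encoded_frame_bytes(8000, 20))
```

With these inputs the raw audio is 128 kb/s (320 bytes per packet), while the 8 kb/s codec needs only 20 bytes per packet.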
    6. 6. Sampling, Quantization, Encoding. [Figure: waveform quantized between +127 and -127 and encoded as the bit stream 10101111…01101101.] Sample at twice the highest voice frequency: 2 x 4000 = 8000 Hz (interval of 125 µsec). Round off samples to one of 256 levels (introduces noise). Encode each quantized sample into an 8-bit code word. PCM: 8000 x 8 bits = 64 kb/s. Other techniques (differential coding, linear prediction): 2.4 kb/s to 64 kb/s. What is narrowband/wideband? What are frame-based vs sample-based codecs?
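The sampling and quantization steps can be sketched as follows. This is a simplified linear quantizer onto the range -127..+127 shown in the figure; it is an assumption for illustration, since real G.711 PCM uses logarithmic (µ-law or A-law) companding rather than linear rounding.

```python
import math

SAMPLE_RATE = 8000  # Hz, twice the 4000 Hz voice band


def quantize(x):
    """Round a sample in [-1.0, 1.0] to an integer level in [-127, 127]."""
    x = max(-1.0, min(1.0, x))
    return round(x * 127)


def encode_tone(freq_hz, duration_ms):
    """Sample and quantize a sine tone; one 8-bit code word per sample."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [quantize(math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
            for i in range(n)]


samples = encode_tone(1000, 20)
print(len(samples))   # samples in one 20 ms packet
print(samples[:8])    # first period of the 1 kHz tone
```

One byte per sample at 8000 samples per second gives the 64 kb/s PCM rate quoted on the slide. The rounding error in `quantize` is the quantization noise.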
    7. 7. Voice codecs
           Codec | Bitrate (kb/s) | Use cases
           G.711 | 64 | Phone, PSTN
           G.729 | 8 | VoIP, carriers
           G.723.1 | 5.3/6.3 | VoIP, modem
           G.722.2 | 6-24 | Wideband VoIP, mobile
           Speex | 2-44 | Good quality, free, Flash Player
           iLBC | 13.3/15.2 | Free, low bit-rate
           SILK | 6-40 | Skype, open source
           iSAC | 10-32 | Global IP Sound (GIPS), Gtalk
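    8. 8. Problems with UDP. Unreliable UDP: a) packet loss, b) out-of-order delivery (very rarely), c) jitter (delay variation). [Figure: timeline in which the sender transmits packets 1 2 3 4 5 6 7 and the receiver gets 1 2 3 5 7 6; (a) and (b) mark the loss and the reordering.]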
    9. 9. Playout buffer. [Figure: microphone, read, 20 ms packet, sendto(remote IP:port); received packet, recvfrom(), put, playout buffer, get, 20 ms packet, write, speaker.]
           Audio loop:
               while (true) {
                   buf = read(au, 20ms);   // blocks
                   if (!silence) sendto(remote, buf);
                   …
                   buf = get(20 ms);
                   write(au, buf);
               }
           Receive loop:
               while (true) {
                   buf = recvfrom(...);    // blocks
                   put(buf);
               }
    10. 10. Playout buffer – What is its purpose? – What should be the buffer size? – How to do adaptive delay adjustment? – Why do you need a sequence number? [Figure: the sender transmits numbered packets in order; the receiver gets them with loss and reordering (1 2 3 5 7 6 …); snapshots show the playout buffer contents over time.]
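The role of the sequence number in the playout buffer can be illustrated with a small sketch. It assumes a fixed buffer depth and ignores sequence-number wraparound and adaptive delay; the class and method names are hypothetical, loosely following the put()/get() calls in the pseudocode.

```python
class PlayoutBuffer:
    """Fixed-depth playout buffer that reorders packets by sequence number."""

    def __init__(self, depth=3):
        self.depth = depth      # packets to collect before playout starts
        self.packets = {}       # sequence number -> payload
        self.next_seq = None    # next sequence number to play
        self.started = False

    def put(self, seq, payload):
        """Called by the receive loop for every packet from the network."""
        if self.next_seq is None:
            self.next_seq = seq
        if seq >= self.next_seq:            # drop packets that arrive too late
            self.packets[seq] = payload
        if len(self.packets) >= self.depth:
            self.started = True

    def get(self):
        """Called every 20 ms by the audio loop; None means play silence."""
        if not self.started:
            return None
        payload = self.packets.pop(self.next_seq, None)  # None: lost or late
        self.next_seq += 1
        return payload


buf = PlayoutBuffer(depth=3)
for seq in [1, 2, 3, 5, 7, 6]:          # packet 4 lost, 6 and 7 swapped
    buf.put(seq, "p%d" % seq)
print([buf.get() for _ in range(7)])
```

The reordered packets 6 and 7 come out in order, and the missing packet 4 shows up as a gap that the audio loop can conceal.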
    11. 11. Playout time vs packet loss detection. Timestamp vs sequence number: • Silence suppression • Variable length packets. [Figure: sender and receiver timelines t1 to t9 carrying packets 1 2 3 4, a silence gap, then packets 5 6 7.]
    12. 12. Real-time Transport Protocol (RTP). RTP: media transport. RTCP: QoS feedback. Packet layout of msg, as passed to sendto(…, msg, …) and recvfrom(…, msg, …): IP header, UDP header, RTP header, encoded audio. RTP header fields: V, P, X, CC (8 bits); M, payload type (8 bits); sequence number (16 bits); timestamp (proportional to sampling time); source identifier (SSRC); optional contributors’ list (CSRC).
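The header layout above maps directly onto a 12-byte fixed header. A minimal sketch using Python's `struct` module; it assumes no padding and no header extension, and the helper names are illustrative:

```python
import struct


def pack_rtp(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Build an RTP packet with version 2, no padding, no extension, CC=0."""
    byte0 = 2 << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


def unpack_rtp(packet):
    """Split an RTP packet into header fields and payload."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = byte0 & 0x0F                                   # number of CSRC entries
    csrc = list(struct.unpack("!%dI" % cc, packet[12:12 + 4 * cc]))
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrc": csrc,
        "payload": packet[12 + 4 * cc:],
    }


# 20 ms of µ-law audio (payload type 0) is 160 bytes at 8000 Hz
packet = pack_rtp(0, 1, 160, 5263, b"\x00" * 160)
print(len(packet), unpack_rtp(packet)["ssrc"])
```

The 12-byte header on a 160-byte payload is one way to approach the FAQ question about RTP header overhead.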
    13. 13. RTP-based conference. Participants: ssrc=5263, ssrc=7182, ssrc=2639, ssrc=9844. Session identified using receive IP address + port.
    14. 14. RTP-based conference. Mixer (Σ): mixes multiple streams, and puts the rtp.ssrc values of contributors in the mixed packet as rtp.csrc. Transcoder (⇔): converts one encoding to another, typically to accommodate heterogeneous bandwidth links. [Figure: µ-law streams into and out of the mixer; µ-law and G.729 streams on either side of the transcoder.]
    15. 15. RTP FAQ – Who uses RTP? – Is RTP a transport or application protocol? – Is RTP secure? What are SRTP, ZRTP? – Is RTP header a big overhead? – Is RTCP needed for two-party voice calls?
    16. 16. Why do you need signaling? 1. Locate destination user. 2. Negotiate session parameters. [Figure: Alice, Bob, Sam and Henry; a location table with entries Bob=>, Sam, Henry=>, Alice=>; “Where is Alice?”; “INVITE for a call using µ-law and G.729 at”; “OK using µ-law at”.]
    19. 19. Addressing Examples: “Alice Smith” <> sip:alice@;user=phone;transport=tcp tel:12125551234 tel:19172223333 What is address-of-record (AoR)? What is sips URI? What is tel URI? – RFC 3966
    20. 20. Lookup. Two-stage lookup similar to email: DNS to locate the server for a domain; database to locate the user within a domain. [Figure: Alice, Bob, Jane.]
           $ dig -t naptr
           3600 IN NAPTR 1 0 "s" "SIP+D2U" ""
           3600 IN NAPTR 2 0 "s" "SIP+D2T" ""
           $ dig -t srv
           3600 IN SRV 10 10 5060
           3600 IN SRV 10 10 5060
           $ dig -t a
           3600 IN A
           How to do failover and load sharing?
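One possible answer to the failover and load-sharing question uses the priority and weight fields of the SRV records: try lower priority values first, and spread load across records of equal priority in proportion to weight. The sketch below is a rough approximation of that idea, not the exact selection algorithm from the DNS SRV specification; the host names are hypothetical, since the originals are not shown on the slide.

```python
import random


def order_srv(records, rng=random):
    """Order (priority, weight, port, target) records for connection attempts.

    Lower priority values come first (failover); within one priority the
    order is a weighted random shuffle (load sharing).
    """
    ordered = []
    for priority in sorted({r[0] for r in records}):
        group = [r for r in records if r[0] == priority]
        while group:
            # +1 keeps zero-weight records selectable (simplification)
            weights = [r[1] + 1 for r in group]
            pick = rng.choices(group, weights=weights, k=1)[0]
            group.remove(pick)
            ordered.append(pick)
    return ordered


records = [
    (20, 10, 5060, "backup.example.com"),   # hypothetical targets
    (10, 10, 5060, "sip1.example.com"),
    (10, 10, 5060, "sip2.example.com"),
]
for record in order_srv(records):
    print(record)
```

A client would try the targets in the returned order and move to the next one when a server does not respond.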
    21. 21. Request:
           INVITE SIP/2.0
           Via: SIP/2.0/UDP;branch=z9hG4bK74bf9
           Max-Forwards: 70
           From: "Bob" <>;tag=9fxced76s1
           To: "Alice" <>
           Call-ID: 384827@
           CSeq: 1 INVITE
           Contact: <sip:bob@>
           Subject: How are you?
           Content-Type: application/sdp
           Content-Length: 151
           ...
           Response:
           SIP/2.0 200 OK
           Via: SIP/2.0/UDP;branch=z9hG4bK74bf9
           From: "Bob" <>;tag=9fxced76s1
           To: "Alice" <>;tag=8321234356
           Call-ID: 384827@
           CSeq: 1 INVITE
           Contact: <sip:alice@>
           Content-Type: application/sdp
           Content-Length: 147
           ...
           What are WS and CRLF?
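Because SIP is text-based, with CRLF-terminated header lines and an empty line before the body, a message can be split with a few string operations. A minimal sketch, assuming no folded (multi-line) headers and no compact header names; the URIs and host names in the sample message are hypothetical stand-ins, not the values from the slide.

```python
def parse_sip(text):
    """Split a SIP message into (kind, start line, headers, body)."""
    head, _, body = text.partition("\r\n\r\n")      # empty line ends the headers
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # header names are case-insensitive and may repeat (e.g. Via)
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    kind = "response" if start_line.startswith("SIP/2.0 ") else "request"
    return kind, start_line, headers, body


MESSAGE = "\r\n".join([
    "INVITE sip:alice@example.com SIP/2.0",
    "Via: SIP/2.0/UDP client.example.net;branch=z9hG4bK74bf9",
    "Max-Forwards: 70",
    'From: "Bob" <sip:bob@example.net>;tag=9fxced76s1',
    'To: "Alice" <sip:alice@example.com>',
    "Call-ID: 384827@client.example.net",
    "CSeq: 1 INVITE",
    "Content-Length: 0",
    "",
    "",
])

kind, start_line, headers, body = parse_sip(MESSAGE)
print(kind, start_line.split()[0], headers["cseq"])
```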
    22. 22. Requests
           Purpose | Method | Reference
           Establish a SIP session with offer/answer | INVITE | RFC 3261
           Acknowledge a response to INVITE | ACK | RFC 3261
           Cancel a pending request | CANCEL | RFC 3261
           Terminate an existing SIP session | BYE | RFC 3261
           Query the capabilities of server or UA | OPTIONS | RFC 3261
           Temporarily bind AoR to device URI | REGISTER | RFC 3261
           Establish a session to receive updates | SUBSCRIBE | RFC 3265
           Deliver updates in a subscribed session | NOTIFY | RFC 3265
           Upload status to a server | PUBLISH | RFC 3903
           Ask another UA to act upon a URI | REFER | RFC 3515
           Transport an instant message (IM) | MESSAGE | RFC 3428
           Update session state information | UPDATE | RFC 3311
           Acknowledge a provisional response | PRACK | RFC 3262
           Transport mid-call signaling data | INFO | RFC 2976
    23. 23. Responses
           Class | Code | Examples
           Provisional | 1xx | 100 Trying, 180 Ringing, 183 Session Progress
           Success | 2xx | 200 OK, 202 Accepted
           Redirection | 3xx | 300 Multiple Choices, 302 Moved Temporarily
           Client error | 4xx | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 486 Busy Here
           Server error | 5xx | 501 Not Implemented, 503 Service Unavailable
           Global error | 6xx | 600 Busy Everywhere, 603 Decline
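The class of a response is determined by the first digit of its status code, and any response of 200 or above ends the transaction. A small sketch with illustrative function names:

```python
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
    6: "Global error",
}


def response_class(code):
    """Map a SIP status code such as 486 to its class name."""
    return RESPONSE_CLASSES[code // 100]


def is_final(code):
    """1xx responses are provisional; everything else is final."""
    return code >= 200


for code in (100, 180, 200, 302, 486, 503, 603):
    print(code, response_class(code), "final" if is_final(code) else "provisional")
```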
    24. 24. Header examples
           Name | Purpose | Reference
           Alert-Info | Customized ringing | RFC 3261
           Record-Route | Proxy will stay in SIP signaling path | RFC 3261
           Supported | List of supported extensions | RFC 3261
           Replaces | Call control, transfer, pickup | RFC 3891
           Join | Conferencing | RFC 3911
           Reason | Reason for failure response | RFC 3326
           Event | Associated event package | RFC 3265
           Referred-By | Third-party who initiated the request | RFC 3892
    25. 25. Example call flow (SIP trapezoid). Elements: UA, Proxy, Proxy, UA. Messages in the figure: INVITE (three hops), 100 Trying (twice), 180 Ringing (three hops), 200 OK (three hops), ACK, RTP media session, BYE, 200 OK. What is an outbound proxy?
    26. 26. Transport • UDP – Most common – Low state overhead – But small max packet size • TCP: – Use with SSL – Large message bodies – NAT/firewall traversal – Connection setup overhead – Head of line blocking for trunks (use SCTP instead) • Transport reliability – Request retransmissions – Exponential back-off – INVITE vs non-INVITE Why is ACK needed for INVITE response?
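The retransmission bullets can be made concrete with the default timer values from RFC 3261 (T1 = 500 ms, T2 = 4 s, transaction timeout at 64 x T1). The sketch below computes when a client retransmits a request over UDP if no response arrives; the function name is illustrative.

```python
T1 = 0.5   # seconds, round-trip time estimate
T2 = 4.0   # seconds, cap on the non-INVITE retransmit interval


def retransmit_times(invite):
    """Times (seconds after the first send) at which a request is resent."""
    times = []
    elapsed, interval = 0.0, T1
    timeout = 64 * T1                       # transaction gives up after 32 s
    while elapsed + interval < timeout:
        elapsed += interval
        times.append(elapsed)
        interval *= 2                       # exponential back-off
        if not invite:
            interval = min(interval, T2)    # non-INVITE interval is capped
    return times


print("INVITE:    ", retransmit_times(invite=True))
print("non-INVITE:", retransmit_times(invite=False))
```

The INVITE interval keeps doubling, while the non-INVITE interval stops growing at T2.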
    27. 27. Message routing. Response follows the reverse request path – Via header in SIP message records the request path. Request routing decision at each hop – Usually direct end-to-end transport after initial request – Forcing request path: Record-Route and Route headers – Request forking: parallel vs sequential (use q-value in Contact) – Caller and callee info: further govern request routing. [Figure: Via headers accumulate hop by hop on the request; an INVITE is forked to contacts with q=1.0, q=0.7 and q=0.2, which answer 302 moved, 486 busy and 200 OK.]
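How the Via header makes a response retrace the request path can be shown with a list used as a stack. This is a simplified sketch with hypothetical hop names; real Via headers also carry transport, address and branch parameters.

```python
def forward_request(via_stack, hop):
    """Each element that sends or forwards the request adds its Via on top."""
    return [hop] + via_stack


def response_hops(via_stack):
    """Hops visited by the response.

    The response is sent to the topmost Via; that hop removes its own
    entry and passes the response on to the next one.
    """
    stack = list(via_stack)
    hops = []
    while stack:
        hops.append(stack.pop(0))
    return hops


vias = forward_request([], "alice-ua")          # caller adds its own Via
for proxy in ("proxy1", "proxy2"):
    vias = forward_request(vias, proxy)

print(vias)                  # Via list as seen by the callee
print(response_hops(vias))   # path taken by the response
```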
    28. 28. Example call setup. [Figure: call involving Alice and Bob in numbered steps: (1) invite, (2) moved, (3) invite, (4) moved, (5), (6) three parallel branches, one of them unavailable, (7), (8), (9) ok, (10), (11) cancel, (12) ok, (13).]
    29. 29. Elements
           • SIP user agent – IP phone, PC, conference bridge, …
           • SIP redirect server – returns new location for requests
           • SIP stateless proxy – routes call requests
           • SIP (forking) stateful proxy – routes call requests
           • SIP registrar – accepts name-to-address mapping
           • Location server – maintains name-to-address mapping
           Typically implemented in a single piece of software or box.
           Other entities: outbound proxy, back-to-back user agent (B2BUA), application-level gateway (ALG), …
           • Maintaining state
           – stateless: each request and each response handled independently (fast load-balancing proxies; robust)
           – (transaction) stateful: remember a whole request/response transaction (enterprise servers, …)
           – call stateful: remember a call from beginning to end (billing, NAT traversal, …)
    30. 30. SIP FAQ – What is a SIP transaction? – What is a SIP dialog? – What is early media? – What is re-INVITE? – Why do you need record-route? – What is request forking?
    31. 31. Session negotiation (between Alice and Bob). INVITE: “I can support µ-law and G.729. Send me audio at”. OK: “I can support µ-law. Send me audio at”. ACK. Then media flows: RTP to port 8000, RTP to port 6780. How to modify media session?
    32. 32. Session Description Protocol (SDP)
           INVITE SIP/2.0
           ...
           Content-Type: application/sdp
           Content-Length: 151

           v=0
           o=bob 26172 27162 IN IP4
           s=-
           c=IN IP4
           t=0 0
           m=audio 6780 RTP/AVP 0 8 5 97 98
           a=rtpmap:97 iLBC/8000
           a=rtpmap:98 telephone-event/8000
           m=video 6790 RTP/AVP 31
           What is offer/answer? What is telephone-event?
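The offer/answer idea can be sketched by parsing the media lines of an SDP offer and keeping only the payload types the answerer supports. This is a simplified illustration: it reads only `m=` and `a=rtpmap` lines, and the IP address in the sample is a documentation address chosen here as an assumption, since the original is not shown on the slide.

```python
def parse_sdp_media(sdp):
    """Return one dict per m= line with its port, formats and rtpmap entries."""
    media, current = [], None
    for line in sdp.splitlines():
        kind, _, value = line.strip().partition("=")
        if kind == "m":
            name, port, proto, *formats = value.split()
            current = {"media": name, "port": int(port), "proto": proto,
                       "formats": formats, "rtpmap": {}}
            media.append(current)
        elif kind == "a" and value.startswith("rtpmap:") and current is not None:
            payload_type, encoding = value[len("rtpmap:"):].split(None, 1)
            current["rtpmap"][payload_type] = encoding
    return media


def answer_formats(offered, supported):
    """Payload types for the answer: offered ones the answerer also supports."""
    return [pt for pt in offered if pt in supported]


OFFER = """v=0
o=bob 26172 27162 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 6780 RTP/AVP 0 8 5 97 98
a=rtpmap:97 iLBC/8000
a=rtpmap:98 telephone-event/8000
m=video 6790 RTP/AVP 31
"""

audio, video = parse_sdp_media(OFFER)
# Payload type 0 is µ-law (PCMU) and 18 is G.729 in the static RTP/AVP table
print(answer_formats(audio["formats"], {"0", "18"}))
```

Here the offer contains µ-law but not G.729, so an answerer supporting both would reply with µ-law only.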
    33. 33. SIP is…, SIP is not … • Core protocol for establishing sessions • Allows transport of session description • Allows change of parameters in mid-session • Terminate session • NOT for distribution of multimedia data • NOT suitable for media gateway control • . . . SIP applications typically fall into the following categories: • setting up VoIP calls • setting up multimedia conferences • event notification => IM and presence • text and general messaging • signaling transport
    34. 34. What is the real value in SIP? Open system Advanced services
    35. 35. Programmable services. Telephony (what the phone has is not enough!): Call routing – speed dial, call forwarding, “follow me”, filtering/blocking (in/out), do-not-disturb, distinctive ringing, … Call handling – auto-answer, auto-attendant, voice-mail, … Multi-party – call waiting, call transfer (blind/consultative), conference call, park, pickup, music-on-hold, monitoring, … Internet: Presence enabled – place call only if callee is available, invite participants when all are online and not busy, … Unified messaging – receive email, IM alert for new voice mail, or when someone joins your conference, … Web enabled – click-to-call, web conference, view conference status and voice-mails on web, …
    36. 36. Programming services. If somebody is calling for the third time, allow mobile. Try office and home in parallel, if that fails, try home. Allow call to mobile if I’ve talked to the person before. If on telemarketing list, forward to dial-a-joke. Try office during day, and mobile in evening. …
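Rules like these are small pieces of routing logic. As an illustration only, the rule "try office during day, and mobile in evening" could be written as the function below; the URIs and the definition of "day" (09:00 to 17:00) are assumptions, and a real deployment would express this in a service language such as CPL, SIP-CGI or a servlet rather than a standalone function.

```python
from datetime import time

OFFICE = "sip:office@example.com"   # hypothetical contact URIs
MOBILE = "sip:mobile@example.com"


def route_call(now, day_start=time(9, 0), day_end=time(17, 0)):
    """Return the list of targets to try for a call arriving at time `now`."""
    if day_start <= now < day_end:
        return [OFFICE]
    return [MOBILE]


print(route_call(time(10, 30)))
print(route_call(time(20, 0)))
```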
    37. 37. Where do the services reside? Make call when boss is online … B2BUA Double ringing sound when boss calls… Endpoint Proxy/registrar Endpoint Service control in endpoint vs network? Forward to office phone during day, and home phone during evening… Enter your authentication PIN for billing… Use finger for locating user…
    38. 38. Endpoint call control
           • Language for End System Services (LESS)
           • Direct user interaction, direct media control
           • Handle converged information, e.g., call, presence, email
           Example: call a friend when he comes online
           <less name="online_call" require="generic presence ui">
             <notification status="online" priority="0.5">
               <address-switch field="origin">
                 <address is="">
                   <call />
                   <alert sound="" text="Calling …" />
                 </address>
               </address-switch>
             </notification>
           </less>
    39. 39. Network call control: SIP-CGI (RFC 3050), CPL, servlets. [Figure: the SIP proxy passes SIP_FROM, SIP_TO and stdin to the script and reads CGI-PROXY-REQUEST from its stdout; labels: Urgent, Low-priority, Voicemail, Phone.]
           if (defined $ENV{SIP_FROM} && $ENV{SIP_FROM} =~ / {
               foreach $reg (get_regs()) {
                   print "CGI-PROXY-REQUEST $reg SIP/2.0\n";
                   print "Priority: urgent\n\n";
               }
           }
    40. 40. Call transfer Blind/consultation/attended active call REFER C Referred-By: B INVITE C Referred-By: B BYE A A B C active call
    41. 41. B2BUA and 3pcc Back-to-back UA – Incoming call triggers outgoing call Services – Calling card – Anonymizer INVITE A B C SIP SIP OK (SDP1) ACK INVITE (SDP1) OK (SDP2) ACK INVITE (SDP2) OK ACK
    42. 42. Voicemail SIP_FROM SIP_TO stdin CGI-PROXY-REQUEST stdout If no response Proxy controls accept after 15s Voicemail acts like a phone Redirect after 10s Endpoint based
    43. 43. VoiceXML. IVR platform: • Voice and telephony functions (ASR, TTS, DTMF) • Service logic (application specific). [Figure: a telephone reaches a voice gateway over the PSTN; the gateway provides (1) voice and telephony functions and (2) a VoiceXML browser, and exchanges VXML over the Internet with a web server running the service logic (CGI, servlet, JSP); an Internet user with a browser exchanges HTML with the same web server.]
    44. 44. VoiceXML contd.
           HTML form:
           <form action="url"> Enter your Id: <input name='id'> <input type='submit'> </form>
           VoiceXML form:
           <form> <field name='id'> <prompt> Your ID, please. </prompt> </field> <block> <submit next="url"/> </block> </form>
           Telephony, speech synthesis or audio output, user input and grammar, program flow, variable and properties, error handling, …
    45. 45. Interworking with telephone
           • Translating audio (µ-law/A-law)
           • Translating signaling (PRI/T1, ISUP) – Overlap signaling – Advanced features in SIP are lost in PSTN
           • Translating identifiers (phone number)
           • Determining transition points
           [Figure: telephone subscriber +1-415-123-4567, telephone network, SIP/PSTN gateway, SIP server, IP endpoint.]
           IP to telephone:
           – Static mapping: 1-212854xxxx=>
           – Gateway information is dynamic: overlapping networks, multiple providers, load balancing
           – Telephony routing over IP (TRIP): route advertisement, can be implemented in outbound proxy, suitable for current hierarchical network. Advertised routes in the figure: +1 at 4¢/min, +1212 at 1¢/min, +1212939 free
           Telephone to IP:
           – Gateway knows the SIP server: <sip:4567@gateway2.example.com;user=phone>
           – ENUM: E.164 numbering (using DNS), +1 212 9397042 => =>
           – Suitable for relatively “static” contacts
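The first step of an ENUM lookup is mechanical: the digits of the E.164 number are reversed, separated by dots, and placed under the `e164.arpa` domain, which is then queried in DNS for NAPTR records. A minimal sketch of that conversion (the function name is illustrative, and the DNS query itself is not shown):

```python
def e164_to_enum(number, suffix="e164.arpa"):
    """Convert an E.164 phone number to its ENUM domain name."""
    digits = [ch for ch in number if ch.isdigit()]   # drop '+', spaces, dashes
    return ".".join(reversed(digits)) + "." + suffix


print(e164_to_enum("+1 212 9397042"))
```

For the number on the slide this yields `2.4.0.7.9.3.9.2.1.2.1.e164.arpa`.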
    46. 46. Summary of services • Call forwarding: basic INVITE behavior • Call transfer: REFER method • Call hold: set media to • Caller id: From, plus extensions • DTMF carriage: carry as RTP (RFC 2833) • Calling card: B2BUA + voice server • Voicemail: UA, proxy, media server • Programming: CGI, CPL, servlet, LESS, CCXML, VoiceXML, SECE, …
    48. 48. Hands-on exercises • Experiment with to understand SIP message format • Walk-through of JAIN SIP API reference implementation • Programming server with SIP express router (SER) • Walk-through of 39peers Python stack • Experiment with siprtmp for Flash-to-SIP call
    50. 50. JavaScript API
    51. 51. in action
    52. 52. Browser Cloud HTML × Device capture × Real-time codecs × E2E UDP media Flash Player × H.264 Video encoder × Echo cancellation × UDP media × Server socket Hosted & elastic – Utility billing – Programmable IVR – Conferencing – Recording/playback – Accounting – Tracking Phone – Voice, SMS, …
    53. 53. Peer-to-peer ≠ cloud computing • Self management • Free resource sharing • No central co-ordination • … • Self management • Utility computing • Central co-ordination • … managed
    54. 54. When to do P2P? if – most of the peers do not trust each other, AND – There is no incentive to help peers then – P2P does not evolve naturally to work See
    55. 55. NAT traversal. Problem: [Figure: a client with local address L= and external address E= sends REGISTER with Contact: sip:alice@ . . . ; an incoming INVITE alice@ must then reach it through the NAT.] Solutions: Smart servers – SER allows detecting nodes behind a NAT – Use application-level gateway and media proxy. SIP signaling – Symmetric response routing for UDP (rport) – Connection reuse for TCP/TLS (sip-outbound). Media – STUN: Simple traversal of UDP through NAT – TURN: Traversal using relay NAT – ICE: Interactive connectivity establishment
    56. 56. Interactive connectivity establishment (ICE): 1. Address gathering 2. Negotiation 3. Connectivity check. [Figure: each endpoint gathers its local (L=), external (E=, learned from the STUN server) and relay (R=) addresses; the gathered addresses (L, E, R) are exchanged in the INVITE (offer) and OK (answer).]
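After address gathering, ICE ranks the candidates so that direct paths are checked before relayed ones. The sketch below uses the priority formula and the recommended type preferences from the ICE specification; the candidate labels L, E and R follow the figure, and a single component with the maximum local preference is assumed.

```python
# Recommended type preferences: host (local), peer-reflexive,
# server-reflexive (external, learned via STUN), relayed (via TURN)
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, component_id=1, local_preference=65535):
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component id)"""
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_preference << 8)
            + (256 - component_id))


candidates = [("relay", "R"), ("host", "L"), ("srflx", "E")]
ordered = sorted(candidates, key=lambda c: candidate_priority(c[0]), reverse=True)
for cand_type, label in ordered:
    print(label, cand_type, candidate_priority(cand_type))
```

The local address is tried first, then the external (STUN-discovered) address, and the relay only as a last resort.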
    57. 57. Security Spoofing “From” Snooping signaling Snooping media Billing confusion Denial-of-service attacks SPAM on IP telephony TLS, IPsec, auth, S/MIME, … SRTP, ZRTP, …
    58. 58. SIP is just a tool! How do you use it? Open house vs walled garden
    60. 60. IETF’s SIP: End-to-end principle. Inspired by Internet & web. One task for one protocol. Stack: SIP, SDP, RTP, RTCP, codecs and terminal control/devices over the transport layer. ITU-T’s H.323: Managed “services”. Legacy of telecom network. Complex integrated spec. Stack: TPKT, Q.931, H.245, RAS, RTCP, RTP, codecs and terminal control/devices over TCP and UDP, on top of IP and lower layers. Jabber/XMPP: Jingle for session initiation. Re-uses many SIP features. Proprietary: Adobe’s RTMFP, Skype, Yahoo, MSN, Cisco’s Skinny, … Winner among VoIP carriers, mobile operators, and digital voice providers
    61. 61. Brief history of SIP • 1996: avt to focus on real-time transmission over UDP • 1998: mmusic to focus on Internet conferencing • 1999: First SIP proposed standard RFC 2543 by M. Handley, H. Schulzrinne, E. Schooler, J. Rosenberg • 1999: sip to focus on core development of SIP • 2000: iptel to focus on routing and call processing • 2000: enum to focus on DNS based phone numbers • 2002: New SIP proposed standard RFC 3261, added authors • 2002: sipping to focus on applications/extensions to SIP • 2004: simple to focus on SIP-based IM/presence • 2005: xcon to focus on centralized conferences • 2005: behave to focus on NAT traversal for SIP, RTP • 2006: p2psip to focus on peer-to-peer rendezvous for SIP
    62. 62. Specification in 140+ documents 1. RFC 3261: SIP: Session Initiation Protocol 2. RFC 4566: SDP: Session Description Protocol 3. RFC 3550: RTP: a transport protocol for real-time applications 4. RFC 3264: An offer/answer model with SDP 5. RFC 3840: Indicating UA capabilities in SIP 6. RFC 3263: Locating SIP servers 7. RFC 4474: Enhancements for authenticated identity management in SIP 8. RFC 3265: SIP-specific event notification 9. RFC 3856: A presence event-package for SIP 10. RFC 3863: Presence information data format (PIDF) 11. RFC 3428: SIP extension for instant messaging 12. RFC 3581: An extension of SIP for symmetric response routing 13. RFC 5245: Interactive connectivity establishment (ICE) 14. RFC 5389: Session traversal utilities for NAT (STUN) 15. RFC 5766: Traversal using relays around NAT (TURN) RFC 5638: Simple SIP usage scenarios for applications in the endpoints RFC 5411: A hitchhiker’s guide to SIP Basic Advanced Presence/IM NAT traversal
    63. 63. Why was SIP invented? • Invented as a “rendezvous” function • To locate another SIP device • To locate a SIP server to find another SIP device • To establish (separate) media session • To modify existing session • To express SIP device’s capabilities • To request 3rd party call control • To find status, capability, availability of another SIP device or user • To subscribe to receive future updates of status/availability • To exchange mid-session signaling data • To exchange short instant messages • To support provider’s closed network of “managed” services
    64. 64. Protocol jungle
    65. 65. VoIP activities IETF working groups – sip, sipping, mmusic, xcon, p2psip, simple, impp, iptel, enum, ecrit, avt, sigtran, midcom, … Elsewhere – 3GPP, ITU-T, W3C, Jabber/XSF, ETSI-Tiphon, IMTC, sip-forum, VON, …