SIP (Session Initiation Protocol) Introduction

SIP (Session Initiation Protocol) is a signaling protocol used to create, manage and terminate sessions in an IP based network. A session could be a simple two-way telephone call or it could be a collaborative multi-media conference session. This makes it possible to implement services like voice-enriched e-commerce, web page click-to-dial or Instant Messaging with buddy lists in an IP based environment. Don't worry if you don't know about these services. You don't need to know them before you learn about SIP.

SIP has been the choice for services related to Voice over IP (VoIP) in the recent past. It is a standard (RFC 3261) put forward by the Internet Engineering Task Force (IETF). SIP is still growing and being modified to take into account all relevant features as the technology expands and evolves. But it should be noted that the job of SIP is limited to only the setup and control of sessions. The details of the data exchange within a session, e.g. the encoding or codec used for audio/video media, are not controlled by SIP and are taken care of by other protocols. For an overview of the major SIP functions, see the Functions of SIP section.

This introduction is meant for beginners. It gives a brief introduction to SIP before one ventures into the long RFC documents. However, if you are a veteran please go through this short tutorial and suggest modifications.

Here on this site the aim is not to make you an expert in SIP based applications. I doubt whether any site can do that. You need hands-on experience to master the aspects related to Internet multimedia or IP telephony. Here I am proposing nothing new. The whole job is to introduce a newcomer to the facets of the Session Initiation Protocol (SIP) so that a nearly 200 page RFC document does not intimidate you.
However, I strongly recommend that you go through RFC 3261 once you have completed this tutorial.

If you need a book that you can use to start with SIP, SIP Demystified is a good option. It starts with standard telephony systems and gradually guides you into the Session Initiation Protocol.

We shall start with a little background history of SIP. If you are in a hurry, you can skip to the functions of SIP.
After going through the online tutorial, I recommend that you go through some of the books, as your needs and interests dictate. You can visit the books section or directly check those available on amazon.com.

A Brief History of SIP

Initially the traditional switch-based telephone system was the main medium for transmitting messages. However, with the advent of the Internet, the need was felt for a system that connects people over the IP based network. Different communities put forward different solutions, but the solution presented by the IETF was finally accepted as the most general one. However, the development of SIP in the IETF was not a one-step process.

February 1996
Initial Internet drafts were produced in the form of -
  Session Invitation Protocol (SIP) - M. Handley, E. Schooler
  Simple Conference Invitation Protocol (SCIP) - H. Schulzrinne
SIP was originally intended to create a mechanism for inviting people to large-scale multipoint conferences on the Internet Multicast Backbone (Mbone). At this stage, IP telephony didn't really exist. The first draft was known as "draft-ietf-mmusic-sip-00". It included only one request type, which was a call setup request. (Wondering what music is doing in SIP??? Well, it is an acronym for Multiparty Multimedia Session Control. IETF people are not that music crazy after all.)

December 1996
A newer version, "draft-ietf-mmusic-sip-01", was proposed as a modification to the first draft. Still it was yet to take the shape of SIP as we know it now.

January 1999
The IETF published the draft called "draft-ietf-mmusic-sip-12". It contained the six requests that SIP has today.

March 1999
SIP was published as RFC 2543, a proposed standard. It was later revised into the more modern version, RFC 3261.
Let's leave the history to get older and concentrate on perhaps the most important part of this tutorial. Let's learn about the functions of SIP.

Functions of SIP

SIP is limited to only the setup, modification and termination of sessions. It serves four major purposes -
 • SIP allows for the establishment of user location (i.e. translating from a user's name to their current network address).
 • SIP provides for feature negotiation so that all of the participants in a session can agree on the features to be supported among them.
 • SIP is a mechanism for call management - for example adding, dropping, or transferring participants.
 • SIP allows for changing features of a session while it is in progress.

All of the other key functions are done with other protocols.

Yes! This does indeed mean that SIP is not a session description protocol, and that SIP does not do conference control. SIP is not a resource reservation protocol and it has nothing to do with quality of service (QoS). SIP can work in a framework with other protocols to make sure these roles are played out - but SIP does not do them. SIP can function with SOAP, HTTP, XML, VXML, WSDL, UDDI, SDP and others. Everyone has a role to play!

With all that said, SIP is still one of the most important protocols. Next, learn about the SIP components.

Components of SIP

Entities interacting in a SIP scenario are called User Agents (UA). User Agents may operate in two fashions -
 • User Agent Client (UAC): It generates requests and sends them to servers.
 • User Agent Server (UAS): It receives requests, processes them and generates responses.

Note: A single UA may function as both.
Clients:
In general we associate the notion of clients with the end users, i.e. the applications running on the systems used by people. It may be a softphone application running on your PC or a messaging device in your IP phone. It generates a request when you try to call another person over the network and sends the request to a server (generally a proxy server). We will go through the format of requests and proxy servers in more detail later.

Servers:
Servers are in general part of the network. They possess a predefined set of rules to handle the requests sent by clients. Servers can be of several types -
 • Proxy Server: These are the most common type of server in a SIP environment. When a request is generated, the exact address of the recipient is not known in advance. So the client sends the request to a proxy server. The server, on behalf of the client (as if giving a proxy for it), forwards the request to another proxy server or to the recipient itself.
 • Redirect Server: A redirect server redirects the request back to the client, indicating that the client needs to try a different route to get to the recipient. It generally happens when a recipient has moved from its original position either temporarily or permanently.
 • Registrar: As you might have guessed already, one of the prime jobs of the servers is to detect the location of a user in a network. How do they know the location? If you are thinking that users have to register their locations with a Registrar server, you are absolutely right. Users refresh their locations from time to time by registering (sending a special type of message) with a Registrar server.
 • Location Server: The addresses registered with a Registrar are stored in a Location Server.

Now that the components are ready, we need the SIP commands to make them work.

Commands of SIP

 • INVITE: Invites a user to a call.
 • ACK: Acknowledgement, used to facilitate reliable message exchange for INVITEs.
 • BYE: Terminates a connection between users.
 • CANCEL: Terminates a request, or search, for a user. It is used if a client sends an INVITE and then changes its decision to call the recipient.
 • OPTIONS: Solicits information about a server's capabilities.
 • REGISTER: Registers a user's current location.
 • INFO: Used for mid-session signaling (defined in a separate extension, RFC 2976, rather than in RFC 3261 itself).

If you don't see exactly how the commands work, don't worry. We will discuss the format of some of the above SIP commands in more detail shortly.

It's time to go through a typical SIP session so that you can appreciate what we have learnt so far and what follows in our journey through SIP.

A Typical Example of SIP session

SIP signaling follows the client-server paradigm used widely in the Internet by protocols like HTTP or SMTP. The following picture presents a typical exchange of requests and responses. Please note that it is only a typical case and doesn't include all possible cases.

If you are unfamiliar with terms like SIP phone or softphone, read about VoIP phones first.

Before understanding the methods, first you should understand the pictorial diagram. User1 uses his softphone to reach the SIP phone of user2. Server1 and server2 help to set up the session on behalf of the users. This common arrangement of the proxies and the end-users is called the "SIP Trapezoid", as depicted by the dotted line. The messages appear vertically in the order they are sent, i.e. the message on top (INVITE M1) comes first, followed by the others. The direction of the arrows shows the sender and recipient of each message. Each response carries a 3-digit number followed by a name, and every message is labeled with M and a serial number. The 3-digit number is the numerical code of the response, easily understood by machines. Human users use the name to identify the message.
Figure: SIP session example with SIP trapezoid

The transaction starts with user1 making an INVITE request for user2. But user1 doesn't know the exact location of user2 in the IP network. So it passes the request to server1. Server1, on behalf of user1, forwards an INVITE request for user2 to server2. It sends a TRYING response to user1, informing it that it is trying to reach user2. The response could have been different, but we will discuss the other types of responses later. If you are wondering how server1 knows that it has to forward the request to server2, just hold on for a moment. We will discuss that while going through the registration process of SIP.

On receiving INVITE M2 from server1, server2 works in a similar fashion to server1. It forwards an INVITE request to user2. (Note: here server2 knows the location of user2. If it didn't know the location, it would have forwarded the request to another proxy server. So an INVITE request may travel through several proxies before reaching the recipient.) After forwarding INVITE M3, server2 issues a TRYING response to server1.

The SIP phone, on receiving the INVITE request, starts ringing to inform user2 that a call request has come. It sends a RINGING response back to server2, which reaches user1 through server1. So user1 gets feedback that user2 has received the INVITE request.

User2 at this point has a choice to accept or decline the call. Let's assume that he decides to accept it. As soon as he accepts the call, a 200 OK response is sent by the phone to server2. Retracing the route of the INVITE, it reaches user1. The softphone of user1 sends an ACK message to confirm the setup of the call. This three-way handshake
(INVITE + 200 OK + ACK) is used for reliable call setup. Note that the ACK message does not use the proxies to reach user2, as by now user1 knows the exact location of user2.

Once the connection has been set up, media flows between the two endpoints. Media flow is controlled using protocols different from SIP, e.g. RTP.

When one party in the session decides to disconnect, it (user2 in this case) sends a BYE message to the other party. The other party sends a 200 OK message to confirm the termination of the session.

Was that a bit long? Need a break? Go, get it! You deserve a break after going through such a long SIP session :-) When you get back, we will dive inside a SIP request message.

Request Message Format of SIP

Back already! Well, let's continue.

In the previous SIP session example we have seen that requests are sent by clients to servers. We will now discuss what such a request actually contains. The following is the format of the INVITE request as sent by user1.

    INVITE sip:firstname.lastname@example.org SIP/2.0
    Via: SIP/2.0/UDP pc33.server1.com;branch=z9hG4bK776asdhds
    Max-Forwards: 70
    To: user2 <sip:email@example.com>
    From: user1 <sip:firstname.lastname@example.org>;tag=1928301774
    Call-ID: email@example.com
    CSeq: 314159 INVITE
    Contact: <sip:firstname.lastname@example.org>
    Content-Type: application/sdp
    Content-Length: 142

    ---- User1 Message Body Not Shown ----

The first line of the text-encoded message is called the Request-Line. It identifies the message as a request.

Request-Line

    Method SP Request-URI SP SIP-Version CRLF

[SP = single space, CRLF = Carriage Return + Line Feed (i.e. the characters inserted when you press the "Enter" or "Return" key of your computer)]

Here the method is INVITE, the Request-URI is "email@example.com" and the SIP version is 2.0.

The following lines are a set of header fields.
 • Via:
   It contains the local address of user1, i.e. pc33.server1.com, where it is expecting the responses to come.
 • Max-Forwards:
   It is used to limit the number of hops that this request may take before reaching the recipient. It is decreased by one at each hop. It is necessary to prevent the request from traveling forever in case it is trapped in a loop.
 • To:
   It contains a display name "user2" and a SIP or SIPS URI <firstname.lastname@example.org>.
 • From:
   It also contains a display name "user1" and a SIP or SIPS URI <email@example.com>. It also contains a tag, which is a pseudo-random sequence inserted by the SIP application. It works as an identifier of the caller in the dialog.
 • Call-ID:
   It is a globally unique identifier of the call, generated as the combination of a pseudo-random string and the softphone's host name or IP address. The Call-ID is unique for a call. A call may contain several dialogs. Each dialog is uniquely identified by the combination of the From tag, the To tag and the Call-ID. If this is confusing, see the section on the relation among Call, Dialog, Transaction and Message.
 • CSeq:
   It contains an integer and a method name. The first request is given an arbitrary CSeq number. After that it is incremented by one with each new request within the dialog. It is used to tell new requests from retransmissions and to detect out-of-order delivery of messages.
 • Contact:
   It contains a SIP or SIPS URI that is a direct route to user1. It contains a username and a fully qualified domain name (FQDN). It may also have an IP address. The Via field is used to send the response to the request. The Contact field is used to send future requests. That is why the 200 OK response from user2 goes to user1 through the proxies. But when user2 generates a BYE request (a new request and not a response to INVITE), it goes directly to user1, bypassing the proxies.
 • Content-Type:
   It contains a description of the message body (not shown).
 • Content-Length:
   It is an octet (byte) count of the message body.

The header may contain other header fields also. However, those fields are optional.

Please note that the body of the message is not shown here. The body is used to convey information about the media session, written in the Session Description Protocol (SDP). You may continue your journey through SIP without worrying about SDP right now. However, it doesn't hurt to take a peek.

Your SIP request is waiting to get a SIP response message.

Response Message Format of SIP

Here is what the SIP response of user2 will look like.

    SIP/2.0 200 OK
    Via: SIP/2.0/UDP site4.server2.com;branch=z9hG4bKnashds8;received=192.0.2.3
    Via: SIP/2.0/UDP site3.server1.com;branch=z9hG4bK77ef4c2312983.1;received=192.0.2.2
    Via: SIP/2.0/UDP pc33.server1.com;branch=z9hG4bK776asdhds;received=192.0.2.1
    To: user2 <sip:firstname.lastname@example.org>;tag=a6c85cf
    From: user1 <sip:email@example.com>;tag=1928301774
    Call-ID: firstname.lastname@example.org
    CSeq: 314159 INVITE
    Contact: <sip:email@example.com>
    Content-Type: application/sdp
    Content-Length: 131

    ---- User2 Message Body Not Shown ----

Status Line

The first line in a response is called the Status-Line.

    SIP-Version SP Status-Code SP Reason-Phrase CRLF

[SP = single space, CRLF = Carriage Return + Line Feed (i.e. the characters inserted when you press the "Enter" or "Return" key of your computer)]

Here the SIP version is 2.0, the Status-Code is 200 and the Reason-Phrase is OK.

The header fields that follow the status line are similar to those in a request. I will just mention the differences.
 • Via:
   There is more than one Via field. This is because each element through which the INVITE request has passed has added its identity in a Via field. The three Via fields were added by the softphone of user1, by server1 (the first proxy) and by server2 (the second proxy). The response retraces the path of the INVITE using the Via fields. On its way back, each element removes the corresponding Via field before forwarding the response towards the caller.
 • To:
   Note that the To field now contains a tag. This tag is used to represent the callee in a dialog.
 • Contact:
   It contains the exact address of user2. So user1 doesn't need to use the proxy servers to find user2 in the future.

It is a 2xx response. However, responses can be different depending on the particular situation. Learn about the different types of SIP responses.

Response Types of SIP

The first digit of a Status-Code defines the category of the response. So any response between 100 and 199 is termed a "1xx" response, and likewise for the other categories. SIP/2.0 allows six types of response. They are similar to those of HTTP.
 • 1xx: Provisional -- request received, continuing to process the request;
 • 2xx: Success -- the action was successfully received, understood, and accepted;
 • 3xx: Redirection -- further action needs to be taken in order to complete the request;
 • 4xx: Client Error -- the request contains bad syntax or cannot be fulfilled at this server;
 • 5xx: Server Error -- the server failed to fulfill an apparently valid request;
 • 6xx: Global Failure -- the request cannot be fulfilled at any server.

If a response is received with a Status-Code of the form yxx which is not understood by the receiving party, it treats the response as a y00 response, i.e. if a client receives an unknown response 345, it treats that as a 300 response. An unknown 1xx is treated as 183 (Session Progress). So each UA must know how to react to 100, 183, 200, 300, 400, 500 and 600.

In SIP we talk about calls, dialogs, transactions and messages.
Frankly, I was pretty confused initially about how they are related. The next page clarifies their inter-relation.
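Looking back at the response types for a moment: the rule that the first digit of the Status-Code gives the category, and that an unknown code falls back to y00 (or to 183 for an unknown 1xx), can be sketched in a few lines of Python. This is only an illustration. The function names and the KNOWN_CODES set are made up for this example and do not come from any SIP library.

```python
# Minimal sketch of Status-Line parsing and response classification.
# All names here are hypothetical, chosen only for this illustration.

CLASS_NAMES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}

# Assumption: the codes this particular UA knows how to handle.
KNOWN_CODES = {100, 180, 183, 200, 300, 400, 500, 600}


def parse_status_line(line):
    """Split 'SIP/2.0 200 OK' into (version, code, reason)."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


def response_class(code):
    """Return the category name given by the first digit of the code."""
    if not 100 <= code <= 699:
        raise ValueError("not a SIP status code: %d" % code)
    return CLASS_NAMES[code // 100]


def effective_code(code, known=KNOWN_CODES):
    """Map an unknown code to the one the UA should act on instead."""
    if code in known:
        return code
    response_class(code)  # raises ValueError for out-of-range codes
    if code // 100 == 1:
        return 183  # an unknown 1xx is treated as 183
    return (code // 100) * 100  # any other unknown yxx is treated as y00
```

With these definitions, parse_status_line("SIP/2.0 200 OK") yields the version, the integer 200 and the reason phrase, and effective_code(345) gives 300, matching the example in the text.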
Relation among Call, Dialog, Transaction & Message

If you are confused about the relation among Call, Dialog, Transaction & Message, you are not alone. I think quite a good number of people get confused about it in the beginning.

Messages are the individual textual bodies exchanged between a server and a client. There can be two types of messages. Bingo! You already know them ... Requests and Responses.

A Transaction occurs between a client and a server and comprises all messages from the first request sent from the client to the server up to a final (non-1xx) response sent from the server to the client. If the request is INVITE and the final response is a non-2xx, the transaction also includes an ACK to the response. The ACK for a 2xx response to an INVITE request is a separate transaction.

A Dialog is a peer-to-peer SIP relationship between two UAs that persists for some time. A dialog is identified by a Call-ID, a local tag and a remote tag. A dialog used to be referred to as a call leg.

A Call, from the point of view of one participant, comprises all the dialogs it is involved in. I think a Call is the same as a Session.
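To make the dialog identifier concrete, here is a small Python sketch that builds the (Call-ID, local tag, remote tag) triple from the header values of a message. It is a simplified illustration and not a full SIP parser. The names DialogId, tag_of and dialog_id are invented for this example. Note that the same dialog looks mirrored from the two ends: the caller's local tag is the From tag, while the callee's local tag is the To tag.

```python
from typing import NamedTuple, Optional


class DialogId(NamedTuple):
    """Hypothetical container for the three values that identify a dialog."""
    call_id: str
    local_tag: str
    remote_tag: str


def tag_of(header_value: str) -> Optional[str]:
    """Return the tag parameter of a To/From header value, if present.

    Simplification: only parameters after the closing '>' (or, when there
    are no angle brackets, after the first ';') are examined.
    """
    _, _, params = header_value.rpartition(">")
    for param in params.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name.strip().lower() == "tag":
            return value.strip()
    return None


def dialog_id(call_id: str, from_value: str, to_value: str,
              is_caller: bool) -> Optional[DialogId]:
    """Build the dialog identifier as seen by the caller or by the callee.

    Returns None while the To tag is still missing, i.e. before the callee
    has answered and the dialog exists.
    """
    from_tag, to_tag = tag_of(from_value), tag_of(to_value)
    if from_tag is None or to_tag is None:
        return None
    if is_caller:
        return DialogId(call_id, from_tag, to_tag)
    return DialogId(call_id, to_tag, from_tag)
```

Fed with the From and To values of a 200 OK response, the caller's view and the callee's view come out with the two tags swapped, which is why the text speaks of a local tag and a remote tag rather than of From and To.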
The following figure will make the relation clearer. (RINGING is a 1xx response and OK is a 2xx response.)

A caller may have connections to a number of callees at a time, forming a number of dialogs. All these dialogs make a single call.

Well, time to reveal an old secret! If you want to know how server1 knew the location of user2 during the call setup, the page about SIP registration will help you.

Registration in SIP

While going through a typical SIP session you have already seen that the caller doesn't know the address of the callee initially. The proxy servers do the job of finding out the exact location of the recipient. What actually happens is that every user registers its current location with a Registrar server. The application sends a message called REGISTER, informing the server of its present location. The Registrar stores this binding (between the user and its present address) in a location server, which is used by other proxies to locate the user.
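The binding kept by the Registrar and location server can be pictured as a small table from user to current address. The Python sketch below is a toy model under several assumptions: one contact per user, an in-memory dictionary, and a registration lifetime given in seconds. The class and method names are hypothetical and only serve this illustration.

```python
import time


class LocationStore:
    """Toy location service: maps a user to its current contact address."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the sketch is easy to test
        self._bindings = {}      # user -> (contact, expiry time)

    def register(self, user, contact, expires=3600):
        """Store or refresh a binding; a lifetime of 0 removes it."""
        if expires <= 0:
            self._bindings.pop(user, None)
            return
        self._bindings[user] = (contact, self._clock() + expires)

    def lookup(self, user):
        """Return the current contact, or None if unknown or expired."""
        entry = self._bindings.get(user)
        if entry is None:
            return None
        contact, expires_at = entry
        if self._clock() >= expires_at:
            del self._bindings[user]   # stale registration, forget it
            return None
        return contact
```

A proxy that receives an INVITE for a user would call something like lookup() to find where to forward it. Because a binding lapses once its lifetime runs out, the user's application has to call register() again from time to time to stay reachable.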
User yy uses the IP 22.214.171.124 as its current location and registers it with the server.

This actually helps in user mobility. Say there is a messaging application. You can log in from different computers. As soon as you log in using your username, the application REGISTERs the username with the IP of that computer. The Expires field reflects the duration for which this registration will be valid. So the user has to refresh its registration from time to time.

Please note that the difference between a proxy server and a registration or a location server is often only logical. Physically they may be situated on the same machine.

Wow!! You have completed the whole of the SIP tutorial. Congratulations! I insist that you go through the conclusion. It has important information to move forward in your SIP education.

Conclusion

I hope by now you have got a basic idea of what SIP is and what it does. You should be able to recognize the major components in a SIP scenario and how different messages are exchanged to establish and terminate sessions. But you must remember that it is just the beginning. You should go through RFC 3261. If you are serious about your learning, better get your hands on a book as recommended in the books section.

You should go through the other sections of the site -
 • Introduction to RTP: RTP manages the real-time transmission of audio/video data in a session.
 • Introduction to SDP: SDP is used for describing a session, as needed for establishing and sustaining it.
 • VoIP: VoIP is the technology to transmit voice over an IP network. It's an emerging area you would like to know about.

I encourage you to go through the resources listed in the internet multimedia resources page.
I intend to include some more pages regarding header fields and proxy servers in the near future. So keep coming back. If you have any query or suggestion and, more importantly, if you have found any mistakes in the tutorial, please feel free to email me firstname.lastname@example.org.

RTP Introduction

This introduction is meant for beginners. It gives a brief introduction to RTP before one ventures into the long RFC documents. However, if you are a veteran please go through this short tutorial and suggest modifications.

What is RTP?

The Real-time Transport Protocol (RTP) provides end-to-end delivery services for data (such as interactive audio and video) with real-time characteristics. It was primarily designed to support multiparty multimedia conferences. However, it is used for different types of applications, which we will go through shortly.

RTP is a standard specified in RFC 1889. More recent versions are RFC 3550 and RFC 3551. For an introduction like this we will stick to RFC 1889.

Real Time aspect of RTP

What is meant by real-time? It refers to the class of methods whose correctness depends not only on whether the result is the correct one, but also on the time at which the result is delivered.

To make things simpler, let's take an example. Say you want to listen to a song. When you are downloading it from a site, you don't care whether it is downloaded at a steady rate or not. You just need a reliable download (preferably fast :-)). But what if you want to listen to the song without downloading it first? Then you care not only about getting the whole data but also about the rate at which you receive it, otherwise the song loses its charm. Here you need a real-time transmission.

Note that the example is given only to show how the time factor is important in the transmission of data. Real-time transmission is more important in multimedia conferences.

RTP gives No Guarantee for timely Delivery
Confused?? I bet you are!

Well, the point is that RTP itself does not provide any mechanism to ensure timely delivery or provide other quality-of-service guarantees. It relies on lower-layer services to do so (RTP itself typically runs on top of UDP). The dependence will be clearer when we discuss the RTP packet structure.

So how come it is called a real-time protocol? RTP provides suitable functionality for carrying real-time content, e.g. a timestamp and control mechanisms for synchronizing different streams with timing properties. We will discuss those in more detail soon.

Components of RTP

Before going into the detailed structure of RTP, you should know that RTP is basically a combination of two parts -
 • Real-time Transport Protocol (RTP): It carries real-time data.
 • RTP Control Protocol (RTCP): It monitors the quality of service and conveys information about the participants.

We will go through RTP first and then discuss RTCP. Both play important roles in the transmission. Here you should note that the data and control messages are separated in the forms of RTP and RTCP.

Applications of RTP

The applications in which RTP plays an important role can be classified as follows -

Simple Multicast Audio Conference

Initially the owner of the conference (say the leader of a group) obtains a multicast group address and a pair of ports through some allocation mechanism. One port is used for audio data, and the other is used for control (RTCP) packets. This address and port information is distributed to the intended participants. If privacy is desired, the data and control packets may be encrypted, in which case an encryption key must also be generated and distributed.
Each participant sends the audio data in small chunks (say 20 ms) or packets. The structure of the packets will be discussed later.

Each instance of the audio application (i.e. each participant) in the conference periodically multicasts a reception report plus the name of its user on the RTCP (control) port. This helps to monitor the quality of transmission and also to determine who the present participants are.

Audio and Video Conference

If both audio and video media are used in a conference, they are transmitted as separate RTP sessions, and RTCP packets are transmitted for each medium using two different UDP port pairs and/or multicast addresses. The canonical name or CNAME of the individual participants is used to match the audio and video sessions. We will discuss CNAME when we go through the functions of RTCP.

The sessions are kept separate so that a participant may choose only one of them. If there is a lecture going on, you can just listen to the professor without having to see his face :-).

Mixers and Translators

So far, we have assumed that all sites want to receive media data in the same format. However, this may not always be appropriate. Users who have connections of different bandwidths, or who work behind a firewall that won't allow the IP packets to pass, need some extra processing. This is done in the form of mixers and translators. We will discuss them briefly in the next two pages.

Mixer in RTP

It may so happen that not all participants in a conference have a connection of the same bandwidth. So how do they take part simultaneously?

One solution is that all of them use a lower bandwidth. But this leads to reduced-quality audio encoding for everyone.

A smarter solution is to use an RTP-level relay called a mixer. A mixer may be placed near the low-bandwidth area.
This mixer resynchronizes incoming audio packets to reconstruct the constant 20 ms spacing generated by the sender, mixes these reconstructed audio streams into a single stream, translates the audio encoding to a lower-bandwidth one and forwards the lower-bandwidth packet stream across the low-speed link. The following figure gives a graphical representation -
The mixer puts its own identification as the source (SSRC) of the packet and puts the contributing sources in the CSRC fields. If you don't know about SSRC and CSRC, come back to this paragraph after going through the RTP header structure.

Mixers have other uses too. An example is a video mixer that scales the images of individual people in separate video streams and composites them into one video stream to simulate a group scene.

Translator in RTP

A problem occurs if one or more participants of a conference are behind a firewall which won't allow an IP packet containing the RTP message to pass. For this situation translators are used.

Two translators are installed, one on either side of the firewall, with the outside one funneling all multicast packets received through a secure connection to the translator inside the firewall. The translator inside the firewall sends them again as multicast packets to a multicast group restricted to the site's internal network. The following picture illustrates it -
Translators, unlike mixers, do not change the SSRC or CSRC fields. If you don't know about SSRC and CSRC, come back to this paragraph after going through the RTP header structure.

Translators can be used for other purposes too, e.g. to connect a group of hosts speaking only IP/UDP to a group of hosts that understand only ST-II.

Packet Structure of RTP

The structure of an RTP packet is shown below.

The real-time media that is being transferred forms the RTP payload. The RTP header contains information related to the payload, e.g. the source, encoding type, sequence and timing. We will go through the header structure in the next page.

However, the RTP packet can't be transferred as it is over the network. For transferring it we use a transport protocol called the User Datagram Protocol (UDP). We won't discuss the UDP header.

To transfer the UDP packet over the IP network, we need to encapsulate it in an IP packet. We won't discuss the IP header either. To transfer the IP packet over the physical network, even the IP packet is sent within other packets. Those are not shown here.

Header Structure of RTP

The following figure shows the RTP header structure -
 • version (V): 2 bits
   This field identifies the version of RTP. The version defined by RFC 1889 is 2.
 • padding (P): 1 bit
   If the padding bit is set, the packet contains one or more additional padding octets at the end which are not part of the payload. The last octet of the padding contains a count of how many padding octets should be ignored. Padding may be needed by some encryption algorithms with fixed block sizes or for carrying several RTP packets in a lower-layer protocol data unit.
 • extension (X): 1 bit
   If the extension bit is set, the fixed header is followed by exactly one header extension.
 • CSRC count (CC): 4 bits
   The CSRC count contains the number of CSRC identifiers that follow the fixed header.
 • marker (M): 1 bit
   The marker bit is used by specific applications to serve a purpose of their own. We will discuss this in more detail when we study Application Level Framing.
 • payload type (PT): 7 bits
   This field identifies the format (e.g. encoding) of the RTP payload and determines its interpretation by the application. This field is not intended for multiplexing separate media.
 • sequence number: 16 bits
   The sequence number increments by one for each RTP data packet sent, and may be used by the receiver to detect packet loss and to restore packet sequence. The initial value of the sequence number is random (unpredictable).
 • timestamp: 32 bits
   The timestamp reflects the sampling instant of the first octet in the RTP data packet. The sampling instant must be derived from a clock that increments monotonically and linearly in time to allow synchronization and jitter calculations.
 • SSRC: 32 bits
   The SSRC field identifies the synchronization source. This identifier is chosen randomly, with the intent that no two synchronization sources within the same RTP session will have the same SSRC identifier.
 • CSRC list: 0 to 15 items, 32 bits each
   The CSRC list identifies the contributing sources for the payload contained in this packet. The number of identifiers is given by the CC field. If there are more than 15 contributing sources, only 15 may be identified. CSRC identifiers are inserted by mixers, using the SSRC identifiers of contributing sources.

Synchronization in RTP

The receiver needs three key pieces of information for synchronization - the synchronization source, the order of the packets and the sampling instant of each packet - which it gets from three header fields. You must know about the header fields first.

Synchronization Source (SSRC)

The receiver may be receiving data from several sources. So for proper arrangement it needs to identify the source of individual packets, which is possible from the SSRC field.

Sequence Number

It is not enough to identify the source; the order is important too. The sequence number increments by one for each RTP data packet sent, and may be used by the receiver to detect packet loss and to restore packet sequence. Loss or out-of-order delivery occurs due to network problems.

Timestamp

For media delivery not just the order of the packets but also the sampling instant of individual packets is important. Please go through the following paragraph carefully.

Several consecutive RTP packets may have equal timestamps if they are (logically) generated at once, e.g. belong to the same video frame. Consecutive RTP packets may contain timestamps that are not monotonic if the data is not transmitted in the order it was sampled, as in the case of MPEG interpolated video frames. (The sequence numbers of the packets as transmitted will still be monotonic.) So the sequence number is not enough for synchronization.

You already know that in an audio/video session audio and video data are transmitted using separate channels (if you don't know this, please go through applications of RTP). The receiver matches the video data with the corresponding audio data using the timestamp.
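To see how the header fields described above sit in the bytes of a packet, here is a short Python sketch that unpacks the 12-octet fixed header and the CSRC list using the standard struct module. It is a minimal illustration under stated assumptions: the function name and the dictionary keys are my own, and a header extension or padding, if present, is only flagged and not processed.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Unpack the fixed RTP header and CSRC list from a raw packet."""
    if len(packet) < 12:
        raise ValueError("an RTP packet has at least a 12-octet header")

    # Network byte order: two single octets, a 16-bit sequence number,
    # a 32-bit timestamp and a 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])

    csrc_count = b0 & 0x0F                 # CC: low 4 bits of first octet
    header_len = 12 + 4 * csrc_count
    if len(packet) < header_len:
        raise ValueError("packet too short for its CSRC count")

    csrc = []
    if csrc_count:
        csrc = list(struct.unpack("!%dI" % csrc_count,
                                  packet[12:header_len]))

    return {
        "version": b0 >> 6,                # V: top 2 bits
        "padding": bool(b0 & 0x20),        # P: 1 bit
        "extension": bool(b0 & 0x10),      # X: 1 bit (not processed here)
        "csrc_count": csrc_count,
        "marker": bool(b1 & 0x80),         # M: top bit of second octet
        "payload_type": b1 & 0x7F,         # PT: low 7 bits
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrc": csrc,
        "header_length": header_len,       # fixed header plus CSRC list
    }
```

A receiver would group packets by the "ssrc" value, reorder them by "sequence_number" and schedule playout from "timestamp", which are exactly the three pieces of information listed under Synchronization in RTP.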
Application Level Framing in RTP

RTP is a protocol framework that is deliberately not complete. It does not fix certain structures rigidly, so it can be modified to suit a specific application. RTP is intended to be malleable enough to provide adequate functionality. This characteristic is known as Application Level Framing and is an important aspect of RTP.

So a profile specification document is needed for each application to specify the way RTP is used, e.g. to define extensions or modifications to RTP that are specific to a particular class of applications. Participants in an RTP session should agree on a common format. Several header fields can be adapted to a specific application.

The extension bit may be set to indicate that the fixed header is followed by exactly one header extension. The extra fields may carry additional information useful to the application.

The interpretation of the marker is defined by a profile. It is intended to allow significant events such as frame boundaries to be marked in the packet stream. A profile may define additional marker bits or specify that there is no marker bit by changing the number of bits in the payload type field.

A profile also specifies a default static mapping of payload type codes to payload formats.

RTCP

What is RTCP?

The RTP control protocol (RTCP) is based on the periodic transmission of control packets to all participants in the session, using the same distribution mechanism as the data packets.

Functions of RTCP

• It provides feedback on the quality of the data distribution. Different types of packets are used for this; they are listed below.
• It carries a persistent transport-level identifier for an RTP source called the canonical name or CNAME. The SSRC may change from time to time but the CNAME remains the same. It is used to identify a participant during the session. RTCP may also carry extra information about the participants, such as an email address.
• By having each participant send its control packets to all the others, each can independently observe the number of participants. This number is used to calculate the rate at which the packets are sent. More users in a session means an individual source may send packets less frequently.

Types of RTCP packets

• SR: Sender report, for transmission and reception statistics from participants that are active senders
• RR: Receiver report, for reception statistics from participants that are not active senders
• SDES: Source description items, including CNAME
• BYE: Indicates end of participation
• APP: Application specific functions

Conclusion of RTP

You should understand that this is only the tip of the iceberg. If you just needed an introduction, it is OK to stop here. But for bigger things you must go through RFC 1889, and even that is not enough. You have to work on it yourself to master applications employing RTP.

RFC 1889 has been superseded by RFC 3550. Thanks to John York for pointing it out.

At this point, I strongly recommend that, if you are serious about the subject, you go through some of the books listed in the books section.

If you have any suggestion, correction or query, just mail to