LiSP: A layered P2PSIP-based architecture for live video streaming with flexible
application logic placement
Victor Pascual, Carlos Macián
Networking Technology and Strategies Research Group (NeTS)
Universitat Pompeu Fabra
Passeig de Circumval·lació, 3 08003 Barcelona (Spain)
Abstract

Internet live video streaming is a blooming technology. To accommodate its bandwidth consumption, several architectures based on P2P principles exist. An important drawback is their incompatibility with SIP, the standard in multimedia transport control, mainly due to its client-server nature. Recently the IETF has started the design of a P2P version of the SIP protocol, P2PSIP. In this paper, a live video streaming architecture based on P2PSIP and standard SIP is presented. The architecture is divided into three layers (Users, Peers and Applications) and can be integrated with any other SIP-based service. Furthermore, the application logic can be distributed across the layers. This permits an implementation in which the operator provides and bills for the service, or a more endpoint-centric implementation. Both cases are presented here, together with an estimation of the system's complexity and the signalling load involved.

1. Introduction

In spite of the growing salience of computer games, web browsing and other network-centric and PC-centric experiences, TV is still the most prominent element in household leisure. In recent years, however, video streaming over IP networks has emerged as a technologically feasible, albeit still immature, contender for classical TV broadcasting. This advance is of tremendous commercial relevance, for it opens the door for true multimedia integration in the home around the PC (or other networked appliances similar in their capabilities) and the displacement of classical TV broadcasting as a second-best experience. For network operators and service providers alike, eager to augment the value added of their networks and networked products, this is a huge opportunity.

For technological as well as commercial reasons, then, video streaming has been a very active research topic lately. Live streaming in particular, with its additional real-time constraints, has been a very challenging topic, for a number of reasons.

First, video streaming, if it is to compete with classical TV broadcasting, has to scale to very large audiences. In spite of tremendous advances in video compression and coding, an average-quality, fullscreen video stream needs anywhere from 512 Kbps to 2 Mbps as a minimum; much more if it is HDTV quality. Given the lack of support for native multicast in IP networks, and particularly in the Internet, bandwidth consumption at the source for unicast distribution becomes prohibitive. Second, managing large numbers of dynamically joining and leaving customers, especially if some form of video relaying or forwarding for bandwidth saving is employed, implies a tremendous management burden and potentially has a severe impact on the quality of experience. Although buffering can alleviate the problem, its use in live streaming is necessarily very restricted. Third, as a side effect of the impact of churn, zapping (which implies unsubscribing from a certain channel and subscribing to another one) is very slow. The management burden associated with it and the initialization of the system for the new channel together push zapping times to anywhere from several seconds to several tens of seconds. Last but not least, the perceived quality of video streaming over the Internet is still poor, with chunky video and even loss of continuity. Although this might be a worthy price to pay for free TV, especially when accessing content unavailable otherwise, it is certainly not ripe for mass commercial [...]

Peer-to-peer technology has been proposed to solve many of those problems. By using the video consumers also as relay elements, the bandwidth consumption at the source can be drastically reduced and the scalability of the architecture improved: With every new viewer contributing
its upload bandwidth to the overall system, the capacity of the network actually grows with the number of users, which is a requirement for large audiences. Besides, users can be somewhat protected from the effect of churn, provided that every viewer can contact more than one relaying station simultaneously. Should one of the stations change to another channel, the total download bandwidth will decrease momentarily, but not be cut off altogether.

However, P2P does not solve the issue of the long startup and zapping delays, and can actually make it even worse. Since the stream now reaches a viewer after traversing a number of relaying stations, additional delay is introduced. Furthermore, a number of additional problems appear in the P2P context. Assuming that a number of relay stations exist, an algorithm is needed to decide from which of them to (simultaneously and cooperatively) download the video, and also which parts of the video to download from each of them. The reverse is also true: Since upload bandwidth is typically scarce in ADSL environments, an algorithm is also needed to decide to whom to stream the requested video pieces, if more requests arrive than can be accepted. All in all, P2P live streaming solutions generally also present high management complexity and burden in order to keep the information about sources, channels, videos, relays, etc. updated and distributed among the participating nodes.

Nevertheless, P2P technology is a very promising avenue, which is producing a number of breakthroughs in live streaming. But to date, all the proposed architectures have failed to address the issue of true integration with the de facto standard protocol for multimedia transmission in the Internet: SIP. To the best of our knowledge, all P2P live streaming architectures to date resort to proprietary protocols for overlay management and, in some cases, also for the management of video sessions. We think that this is a bad idea for two reasons: First, video streaming in the Internet based on standard multimedia distribution and management protocols opens the door to a new dimension of converged multimedia services. Although many platforms exist for concurrent text and audio interchange, like Microsoft's Messenger or Skype, which even support videoconferencing, they remain isolated communication devices. So far they lack integration with other web-based services and ways of expression like blogs, social communities, etc., and also with non-interactive media distribution, like, precisely, video broadcasting, Internet radio, etc. Since most audio and chat communication platforms rely on SIP, it would be highly advantageous to integrate video streaming on the same set of protocols.

It can be argued that SIP follows the client/server paradigm and hence cannot be integrated with P2P technologies in video streaming solutions. Besides, SIP only deals with session negotiation and not with seed localization, channel information distribution, etc. But the second reason why it seems a bad idea to resort to proprietary protocols for video streaming is the advances being made in the IETF P2PSIP WG in designing a fully compatible, P2P version of SIP. Although the protocols themselves are still in the making, their principles and main characteristics are already known: P2PSIP will make use of DHTs for resource storage and localization, beginning with Chord but remaining open to other implementations. The DHT will store location and identification information for members of a certain community (e.g., viewers of a video channel), as well as information concerning their capabilities and status. The protocols will have built-in NAT traversal capabilities and service discovery mechanisms. A number of drafts have been published based on [...], where the details of the forthcoming protocol can already be found. Hence, it is possible to use those drafts, together with the original SIP protocol, as a basis to develop a fully IETF-compliant architecture for live video streaming around SIP.

In this paper, LiSP (Live Streaming over P2PSIP), a novel live video streaming architecture built around the latest developments in the P2PSIP WG and SIP, is presented. It is composed of a structured overlay (i.e., based on a DHT) for user and video management and a partial mesh of video sessions for data delivery, in which all viewers also act as media relays. The P2PSIP Peer protocol is used for the former and SIP for the latter. The emphasis of the paper lies in a detailed description of the protocols involved, together with a complexity analysis of the overall architecture. Furthermore, the architecture can be adapted to different functional distributions at the service level, without changing its basic structure. Here, we present two extreme examples. The first is an endpoint-centric approach, in which all application logic resides at the end nodes themselves, and the overlay provides only very basic support to the video streaming application. This is the "classic" P2P approach. The second is a network-centric approach, in which the overlay takes a dominant role in the application. This mode of operation is especially appealing for network operators eager to increase the value of their networks by providing a link to the final applications. However, in this paper only the first scenario will be presented in detail.

The rest of the paper is structured as follows: Section 2 reviews recent advances in the field of live streaming over IP networks. Section 3 presents the main elements of our architecture. Section 4 describes the different scenarios relative to functional distribution that are possible. Section 5 focuses on the endpoint-centric scenario dealt with in this article, which is evaluated in Section 6. Finally, Section 7 summarizes the findings of the article and gives some hints for further work.
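
As a back-of-the-envelope illustration of the bandwidth argument made in this introduction, the following Python sketch compares the upload needed at a unicast source with the aggregate upload capacity of a P2P system in which every viewer also relays. The function names and the sample figures (10,000 viewers, a 2 Mbps source uplink, 512 or 256 Kbps of upload per viewer) are assumptions chosen for illustration only; the model ignores signalling overhead, churn and chunk scheduling.

```python
"""Rough capacity model: unicast source versus viewers acting as relays."""


def source_upload_unicast_kbps(viewers: int, stream_kbps: int) -> int:
    """Upload needed at the source when it sends one full copy per viewer."""
    return viewers * stream_kbps


def p2p_capacity_kbps(viewers: int, source_upload_kbps: int,
                      viewer_upload_kbps: int) -> int:
    """Aggregate upload capacity when every viewer contributes its uplink."""
    return source_upload_kbps + viewers * viewer_upload_kbps


def p2p_is_sustainable(viewers: int, stream_kbps: int,
                       source_upload_kbps: int,
                       viewer_upload_kbps: int) -> bool:
    """True if the aggregate upload capacity covers the total demand."""
    demand_kbps = viewers * stream_kbps
    capacity_kbps = p2p_capacity_kbps(viewers, source_upload_kbps,
                                      viewer_upload_kbps)
    return capacity_kbps >= demand_kbps


if __name__ == "__main__":
    viewers = 10_000      # hypothetical audience size
    stream_kbps = 512     # lower bound quoted in the text
    print("unicast source upload (Kbps):",
          source_upload_unicast_kbps(viewers, stream_kbps))
    # Hypothetical uplinks: 2 Mbps at the source, 512 or 256 Kbps per viewer.
    for viewer_upload_kbps in (512, 256):
        print(viewer_upload_kbps, "Kbps per viewer, sustainable:",
              p2p_is_sustainable(viewers, stream_kbps, 2_048,
                                 viewer_upload_kbps))
```

Under these assumptions the unicast source needs an uplink proportional to the audience, whereas the P2P system only keeps up when the average viewer uplink is at least roughly the stream rate, which is why scarce ADSL upload bandwidth makes relay and chunk selection important.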
2. Related Work

Live video streaming based on P2P presents a number of challenges. Some of them are general to broadcasting a broadband video signal to a large audience [...], while others are specific to live streaming [...]. In general, most initiatives so far try to leverage P2P technology with the goal of making the system more scalable by reducing bandwidth consumption at the source, by relaying the content to successive watchers. The first such works used application-level multicast for data transmission [...]. Multicast, however, is based on building trees to distribute data from a source to many destinations. A tree is inherently fragile, for the loss of an interior node impacts all subsequent subtrees. To increase system robustness in the presence of churn, multi-trees have also been explored, so that leaves in one tree are also interior nodes in different trees [...]. The information is divided into different layers, and every layer is sent through a different tree. Only by subscribing to all trees can the original signal be fully recovered, but the loss of a layer only degrades the quality, without preventing viewing. Together with Forward Error Correction mechanisms [...], streaming becomes much more robust in the face of errors and churn. Alternatively, additional branches can be introduced in the tree in order to increase its robustness [...]. Other proposals eliminate trees altogether and build partial or full meshes for data distribution [...]. These proposals typically divide the information into small pieces called chunks, so that parallel downloading from multiple sources can be scheduled.

A related question is how to manage the distribution of information about membership, video availability, relaying, etc. This overlay management is typically done in one of two ways: Either with the help of a DHT (the structured approach), or by relying on gossip-based protocols (the unstructured approach). [...] routing efficiency, they are also more complex.

By introducing multiple layers and/or chunks, and by turning intermediate viewers into relaying stations, a live streaming design must find an answer to a number of new questions: How to choose from whom to download? How to choose what to download from each chosen relay? How to choose to whom to relay? These questions require the design of appropriate selection and scheduling algorithms [...].

In spite of the diversity of approaches taken, they all have something in common: The use of proprietary protocols for both the overlay management and the data distribution. Specifically, none of the referenced works uses SIP for session management. MPSS and SOSIMPLE [...], on the other hand, do use SIP as the central protocol in their designs. However, their approach seems questionable, since they use SIP not only for end-to-end data session management, but also for overlay management: Since SIP was developed neither for intra-DHT communication nor for overlay management, they have to severely twist the SIP procedures and the structure and meaning of its messages, substantially altering its spirit. Clearly, for overlay management a protocol compatible with SIP is necessary, but using SIP itself seems odd. The IETF P2PSIP WG is providing an answer to this.

In our design, we follow the P2PSIP WG use of the structured approach for overlay management, combined with a partial mesh for data distribution. We rely on existing algorithms for chunk scheduling and relay and request selection. The introduction of layering is fully compatible with our architecture, but it is left for further study. Figure 1 summarizes the main architectural properties of our design, compared with the works of our predecessors.

Architecture    Overlay Management   Data Distribution   Protocol
                Structure            Mechanism
AnySee          Unstructured         Multitree           Proprietary
CoolStreaming   Unstructured         Mesh                Proprietary
PULSE           Unstructured         Mesh                Proprietary
Chainsaw        Unstructured         Multicast           Proprietary
SplitStream     Structured           Multitree           Proprietary
MPSS            Structured           Multitree           SIP
SOSIMPLE        Structured           Multitree           SIP
LiSP            Structured           Mesh                P2PSIP Peer Protocol + SIP

Figure 1. Main architectural properties of relevant works

3. Architecture

The main emphasis of this work lies in the use of P2P principles for the transmission of live video streams over the Internet, with the clear goal of maximizing the scalability [...]mizing the bandwidth usage (from the point of view of the emitting source). The particular constraints of live broadcasting also impose a strong interest in advanced video coding and chunk selection techniques that minimize delay and provide enhanced robustness against packet loss and node churn. LiSP is based on the joint use of the Session Initiation Protocol (SIP) by the viewers and the Service Extensible Protocol (SEP) inside the operator's network, hence serving as a use case description for the P2PSIP peer protocol proposed to the IETF P2PSIP WG. Since the basic properties of the SEP draft are similar to those of the other proposals being discussed, our architecture and the subsequent analysis and conclusions can safely be generalized to them, too.

The architecture presented here supposes a core network formed by peer nodes at the service level. Peers participate in the overlay network as overlay-routing nodes, and at the same time a peer contributes its storage capacity to any
other peer in the overlay network. Peers must support at least the overlay maintenance, routing, and storage functions. These nodes are responsible for storing and managing information about the channels being emitted as well as the nodes currently (re-)transmitting them. Other important information will also be stored in the core network depending on the application scenario, as will be explained shortly. The peer nodes communicate among themselves by using the SEP peer protocol, which conforms to the definition of the P2PSIP Peer protocol in [...].

Like the P2PSIP Peer Protocol definition, SEP is based on a DHT algorithm and not only maintains the overlay topology, but also provides a distributed database service. SEP uses a flexible packet forwarding mechanism so that peers can choose the best peer to route a packet further. It also provides a common method for service discovery, i.e. to discover which peers can provide a specific service. Some of these additional services may be required to allow the overlay to form and operate, while others may be enhancements to the basic P2PSIP functionality. The routing modes used by SEP attempt to complete each transaction with lower latency and a higher success rate even if intermediate peers fail or NATs lie between the source and the destination peers.

In short, overlay peers form a ring of nodes, synchronizing their information based on SEP and a DHT algorithm. However, every peer shall retrieve and keep locally the information relevant to the users connected to it.

LiSP users are non-overlay nodes which implement a SIP client (UA). They use SIP as defined in RFC 3261 and its associated extensions and are not aware whether, behind their responsible peer (named SIPeer since it acts as a SIP Server), lookup operations are performed using a P2P or a C/S topology. A LiSP user can adopt three different and non-exclusive roles: Consumer, Seed and Media Relay. Consumers are the viewers of the video channels. They express their desire to watch a channel to their responsible SIPeer, which will communicate to the consumer the information about which nodes are currently (re-)transmitting the channel. The consumer will then be in a position to connect to one or more of those relays to download the video data. Media Relays, for their part, are users which may be acting as consumers but are also relaying the video session to other consumers. Seed nodes are the original and unique media sources, which may be a television camera, a video server or even a webcam with SIP capabilities.

As said, LiSP users use a standard SIP interface to communicate with their SIPeer, hence creating a structured, two-layered architecture, in which consumers use SIP to communicate with the overlay to locate nodes transmitting their desired content, and also among themselves for the purpose of session establishment, while the peers use SEP among themselves for overlay management and channel information distribution and retrieval.

Figure 2. LiSP layered architecture

However, the amount of information stored and processed by the overlay versus the users varies, depending on the functional distribution among peers and consumers. Quite obviously, this distribution presents a number of trade-offs in terms of processing, storage and signalling load for both kinds of nodes. Different arrangements may be desirable, depending on the application scenario, number of participating nodes (both consumers and peers), node capabilities, etc. Three different arrangements have been identified: One in which consumers are responsible for most of the processing, while the overlay provides only a location service (named Consumer does most), and another one in which the overlay is charged with most of the management of the video streaming platform (named Overlay does most). In between, a third hybrid scenario tries to balance the load supported by both kinds of nodes. In this work, only the first arrangement is described in detail, leaving the others for further work.

4. Scenarios

4.1. Consumer does most

In the Consumer does most scenario, the overlay is responsible only for storing and managing information related to the location of the participating peers and the list of available channels in the platform, as well as which nodes are relaying their content. Hence, most of the tasks associated with live streaming, like the decision processes about which chunks to download, and from whom, reside in the
consumers themselves. This architecture presents, as a consequence, a lightly loaded overlay, both in terms of signalling and processing. The consumers, on the other hand, must have substantial processing capacity and participate heavily in the signalling process. As stated, the peers are organized according to the SEP protocol using an underlying DHT based on Chord. Hence, for communication purposes they form a logical ring.

On the one hand, one of the main advantages of the Consumer does most architecture is that the main logic of the application (which chunks to retrieve, from whom, etc.) resides at the consumer. This allows for seamless application upgrades, including new source coding methods, etc. As long as the channels are registered in the overlay, the network can continue to operate. A further benefit is that the overlay supports the application by locating nodes and keeping channel and relaying node lists, but does not participate in the application itself. Since most of the tasks are performed directly by the consumers and most of the signalling travels end to end, the overlay scales very well with the number of viewers and channels.

On the other hand, there are some disadvantages. Since most of the tasks are performed directly by the consumers, resource consumption at the consumers is high. While this might not be a problem for desktop PCs, mobile users with less powerful devices (PDAs, laptops, etc.) might suffer from excessive processing burden, bandwidth waste through signalling and, worst of all, energy consumption. Platform control may also become an issue: since the overlay only keeps information about relaying nodes and channels being emitted, the possibilities for performing adequate accounting and billing, should the overlay wish to do so (typical when it is controlled by a network operator), would be greatly reduced.

4.2. Overlay does most

In the Overlay does most scenario, the node roles are somewhat reversed. Should a network operator decide to support or (through partnership, direct provision or any other arrangement) directly provide the video streaming service itself on top of its network, it will very probably have a strong interest in controlling the information interchange across its network much more tightly, for billing, accounting and security purposes. Operators are very jealous of the reliability of their networks, which has reached unprecedented levels for other forms of data networking. Hence, it is vital for this scenario that the overlay is aware of, and furthermore can control, the whole signalling and data interchange. Furthermore, the overlay concentrates all the information regarding the state of the participating nodes' Buffer Maps (albeit in a distributed manner, since the information for every channel and relay/viewer is actually stored only at the corresponding node for every seed/relay). The overlay will perform, in a distributed manner, all the processing functions previously done at the consumer: It will decide which nodes shall be chosen as relays for every new viewer, it will schedule the transmission of chunks and it will keep the list of channels and viewers updated. Hence the denomination Overlay does most (of the work). By keeping the state relative to all broadcast media, the overlay takes over a burden which reduces its scalability, due to the increase in signalling. It must be highlighted, however, that every peer only keeps track of channels being viewed by at least one consumer for which it plays the role of corresponding node. It also keeps the Buffer Map of every active consumer connected to it. In this way, the overlay is in a position to perform the scheduling of chunks and nodes mentioned above, and potentially even to do so more efficiently than every consumer on its own, since it has an overall view of the resources in use, how many connections every relay is carrying, etc. If the consumers' registration message also carried information regarding the consumers' capabilities, such as processing power, storage capacity and link bandwidth, it would be theoretically possible (albeit mathematically very complex, if not impossible in real time) to compute an optimum distribution of connections to every relay. However, the possibility of using heuristics is worth further exploration for certain scenarios.

The architecture presented here has a number of advantages, most of them from the point of view of the operator. Since all decision algorithms are in the hands of the overlay, there is centralized control, which allows better billing, free-riding surveillance and resource management. This structure permits light consumers: devices which, due to the low processing power and signalling load that they will need, can respect the tight battery and bandwidth constraints of today's mobile devices. Since there are no perfect solutions, a number of drawbacks also apply. The main issue concerns the platform scalability. Since the signalling burden at the overlay increases considerably, a scalability study would be necessary to evaluate how many peers are needed, and how powerful they should be, to accommodate this load.

4.3. Hybrid Approach

In between the extremes, a number of hybrid approaches are possible, in which the load is distributed to different degrees between the two kinds of nodes, users and peers. Furthermore, the goal of such a hybrid approach may not be to reduce the load at one or the other kind of node, but to distribute the knowledge about the state of the network and its resources, depending on the scenario. For example, in a hybrid case, an overlay would still receive all the Buffer Maps, which it would forward to the viewers upon request. However, the selection of which relays to contact
and which chunks to request, arguably the central element of the live streaming architecture, would reside at the endpoints. This example highlights the load distribution between consumers and peers. Further equilibrium points exist. As was stated before, if the consumers communicate their capabilities upon registration, the signalling can be unevenly distributed among them: The overlay could take up more of the signalling for battery-constrained or simpler devices, impersonating the consumer to a large extent, while other, more powerful consumers could assume a more intense role in the signalling process. A further example would be an operator that is less interested in participating in the signalling than in being able to perform accurate accounting (and subsequent billing). In that case, the information about the channels being viewed and the duration of the optimal media data assignment sessions might suffice (the extreme Consumer does most case), but maybe the operator also wants to be informed about the amount of data received, if the billing depends upon that. In that case, the operator would also be interested in collecting the different Buffer Maps. It is up to the operator to select which degree of control it wants to have over its customers' service consumption.

5. Consumer does most scenario description

In this scenario, peers are organized according to the SEP protocol, and hence, for communication purposes they form an overlay which makes use of a Distributed Hash Table (DHT) for node and resource location. Chord is one of the most popular DHT algorithms because of its robustness in handling churn. Chord is based on a logical ring topology where lookup is done in O(log N) messages.

When started up, a node needs to either join the existing overlay or create a new overlay. In order to join an existing overlay, the node must first locate some peer that is already participating in the overlay. This is common to any layered architecture and is known as the bootstrap node location problem. A number of possibilities exist: cached or well-known bootstrap peer addresses, broadcast bootstrap peer discovery, manual bootstrap peer address configuration, etc. For the purposes of the present discussion, any such mechanism would work equally well.

After joining the overlay, a node is able to search for other peers and resources and share its own resources with the other peers. The overlay uses the P2PSIP Peer Protocol for these operations. SEP is one of the proposed P2PSIP Peer protocols. Since the bootstrapping and the initialization of the overlay are out of the scope of this document, let us assume that Peers 1, 2, 3 and SIPeers A, B, C and S have already set up an overlay.

LiSP users, denoted as SIP UAs and represented as Seed, Anna, Boris and Carlos, must first of all register with the overlay to make their presence known. Remember that the user-SIPeer interface is standard SIP. All methods used here are as per the standard, while a number of new events will be used for this use case. SIP UAs start the Successful New Registration procedure as described in RFC 3665 (see footnote 1). Users send a SIP REGISTER request to their responsible SIPeer, which acts as a registrar. The sole purpose of this procedure is to establish the Contact: address of the UA and authenticate it as a member of the network.

Footnote 1: A complete message flow of the whole LiSP has been defined, together with the detailed content of every message. These details are omitted from the text, except where highly relevant, for clarity and lack of space.

The SIPeer sends a PUT request to publish, refresh or update the location information of its associated SIP UA in the overlay. When the PUT operation completes, the peer notifies the SIPeer of the completion.

Once the users have been registered and authenticated, some of them decide to subscribe to the Live Video Streaming service. Anna and Boris will subscribe to the 'ListOfChannels' global event in order to be informed about the updated list of published channels. They will send a SIP SUBSCRIBE message to their responsible SIPeer and the corresponding SIP NOTIFY message will contain the list of currently published channels. Once a new channel is published or an existing one is updated or even unpublished, this information will be updated in the overlay and every responsible SIPeer will notify its associated users of this event. At this point Anna and Boris get the current list of existing channels.

The consumer initiates a new subscription to the Event: ListOfChannels at the presence agent (admitting node). The presence agent (admitting node) for ListOfChannels processes the subscription request and creates a new subscription. A 200 OK response is sent to confirm the subscription. In order to complete the process, the presence agent (admitting node) sends the consumer (Anna) a NOTIFY with the current state of the ListOfChannels (i.e. the current list of published channels) using a Content-Type: application/pidf+xml. The consumer confirms receipt of the NOTIFY request.

The seed user, which is a television camera, publishes the channel being emitted (say, Channel X) and any additional information concerning the channel (e.g. the genre, its encoding, technical characteristics, etc.). This information will be stored by the overlay, i.e., the SIPeer responsible for this seed (as per the DHT) will update the current list of channels and will store a new resource record in the overlay on behalf of the seed. Assuming that the SIPeer node is not the responsible peer for the seed, the content of the PUBLISH message will be forwarded through the overlay until it reaches its designated peer. From this moment on, the overlay knows about the seed emitting Channel X and can also answer queries about Channel X and who is
emitting it, coming from potential viewers.

SIPeer nodes will get (by polling or trapping) the updated list of channels and will generate a notification to those associated users which are subscribed to the 'ListOfChannels' event, again using a SIP NOTIFY message.

A Seed's UA initiates a SIP PUBLISH to the admitting node in order to update it with new List of Channels information. The Expires header indicates the desired duration of this soft state. Note that if a Seed decides to go offline (finish the transmission of a channel) it may publish this channel using an Expires header equal to zero. Again, information related to the channel is encoded using a Content-Type: application/pidf+xml payload.

The presence agent (admitting node) receives and accepts the information. The published data is incorporated into the ListOfChannels event document. A 200 OK response is sent to confirm the publication. The 200 OK response contains a SIP-ETag header field with an entity-tag. This is used to identify the published event state in subsequent PUBLISH requests.

At this point the seed is ready to broadcast its content, but there are no viewers as of yet, although Anna and Boris know of the existence of Channel X.

Next, the viewers choose to which channels they would like to SUBSCRIBE (i.e., to watch). Hence, they subscribe to the specific events 'Status: Channel X' and 'ListOfSources: Channel X'.

The first subscription will keep the viewer informed about any changes in the state of the channel, while the NOTIFY message corresponding to the second event contains a complete list of all nodes which are transmitting or relaying that channel at the moment (which, so far, is only the seed). Should a new node start relaying the channel, or an existing one stop doing so, the corresponding NOTIFY (sent by the corresponding node) would update that information. Remember that the channel itself gets a Resource ID from the overlay, and that there is one and only one responsible peer for that Resource ID. Hence, every time that a new node SUBSCRIBEs to a channel through the consumer's corresponding node (different from the channel's corresponding node), the content of that message will be routed through the overlay to the channel's corresponding node, which can then maintain a global list of people watching and relaying the channel, as well as their state. Anna and Boris are subscribed to Channel X and have been inserted as potential media relays in the list of sources for Channel X.

It is now up to the users to implement the local algorithm of their choice (e.g. OTSp2p) to select which relaying peers to connect to in order to receive the data packets. It is also their responsibility to select which chunks of information or which layers to download from each of the selected relaying nodes. As stated before, the consumer implements most of the processing associated with live video streaming. At this point in the example, however, only the seed is transmitting. Hence, the viewers will now SUBSCRIBE to the endpoint events 'BufferMapUpdate: Channel X' and 'Zapping: Channel X' directly at the seed. The overlay does not keep any information about Buffer Maps and hence does not participate in this interchange. With the first subscription, the corresponding NOTIFYs will send the current Buffer Map to every viewer, so that they can choose which chunks to download from every source. The answer to the second event will immediately NOTIFY a viewer that a certain relaying node has changed to viewing (and hence, relaying) another channel and is no longer available as data source. Anna and Boris will receive the buffer map image from the seed.

Viewers, after locally computing the optimum download assignment, start a SIP dialog (via an INVITE transaction) to open a video session with every chosen relay, specifying in the SDP body which chunks they wish to download. These video sessions will be kept open as long as desired, even if no information is being downloaded at the moment. This lets the viewer keep a number of "backup" relays, in case a relay in use either zaps to another channel or simply disconnects. Combining active sessions with on-hold sessions accelerates the activation of a substituting relay by the simple re-negotiation of SDP parameters.

To signal the desire to establish an on-hold session, we follow , which uses the a=inactive parameter. This specifies that the session should be started in inactive mode and no media is sent over an inactive media stream. In order to activate the session, the consumer may send a SIP re-INVITE message with a=recvonly, which reflects its desire to receive media, as explained in .

It is up to the viewer to decide how many backup relays it wants to keep on hold. Obviously, the more backup relays are kept, the more signalling will be interchanged among them, since the Buffer Maps must be constantly interchanged to calculate which chunks are available for download. The overlay, for its part, is not affected by this signalling, which flows end to end and hence keeps the overlay more scalable.

Once the session setup has been accepted, an MSRP media session starts end-to-end from the seed to the two viewers in the figure. MSRP is used for transmitting a series of related chunks in the context of the session, which is negotiated using the Session Description Protocol (SDP), using SIP as a signalling protocol. Considering the Seed, and Anna and Boris as Consumers, the streaming has now truly begun.

Figure 3. MSRP usage for multi-chunk data transport

Should new consumers (remember that Carlos is already registered) join the network, the same steps would be followed: subscribe to get the list of channels, then subscribe to the events 'Status: Channel X' and 'ListOfSources: Channel X'. Once the new viewer gets the list of broadcasting nodes, which now contains not only the seed but also the two previous viewers, Anna and Boris, which can now also act as relaying nodes, it subscribes directly to the nodes of its choice ('BufferMapUpdate: Channel X' and 'Zapping: Channel X').

After the optimum data download assignment has been computed locally, the corresponding INVITE will open the media session with the chosen relays. Should some of the relays be backups, the INVITE will contain an SDP description putting the media stream on hold, signalling in this manner that the session must be kept on hold and no information is to be sent.

So far, it has been assumed that the viewer downloads all available chunks from each chosen relay in order. Should this not be so, additional re-INVITE messages with an SDP body specifying which chunks to download serve to indicate which chunks should be retrieved next. Again, these messages travel end to end, without overlay participation. No SIP provisional responses are considered, for the sake of clarity.

Carlos constructs an SDP description of the chunks that he wants to receive and attaches the SDP offer to a SIP INVITE request addressed to Anna.

    m=message 7654 TCP/MSRP *
    a=accept-types:message/cpim
    a=path:msrp://carlospc.university.edu:7654/jshA7we;tcp
    a=file-selector:name:Fight Club type:video/msvideo
    a=file-transfer-id:1
    a=file-range:y-z

Each file has its own file transfer identifier, which uniquely identifies each file transfer.

Anna receives the SIP INVITE request, inspects the SDP offer, computes the file descriptor and finds a local file whose hash equals the one indicated in the SDP. Anna accepts the file transmission and creates an SDP answer which is transmitted in a SIP 200 OK message.

Carlos acknowledges the reception of the 200 OK message. Carlos opens a TCP connection to Anna. Anna then creates an MSRP SEND request that contains the file. Carlos acknowledges the reception of the SEND request. The process would be repeated with Boris.

One of the advantages of using MSRP is that if a TCP connection towards Carlos is already open and a re-INVITE is sent, Anna re-uses that TCP connection to send an MSRP SEND request that contains the (desired part of the) file.

All the steps described above will be followed recursively whenever new consumers join the network, see Fig. 4.

Figure 4. Recursive relaying to new watchers.

This section has described the fundamental procedures for broadcasting live video content in the Internet based on the P2PSIP Peer protocol draft and SIP. But to demonstrate the validity and not only the feasibility of such an approach, some form of validation is needed. In this paper, we show through a complexity analysis the scalability of our approach for very large audiences, to which the next section is devoted.

6. Evaluation and Results

In order to perform an approximate evaluation of the complexity of our architecture, we will concentrate on the signalling load involved in it. To that end, the SIP and Peer messages necessary for every major operation in the network (e.g. the addition of a new relay or the constant
transmission of Buffer Maps) will be analyzed. Finally, two numerical examples of smaller and larger networks will be given to better grasp the results.

    Variable  Description
    N         Number of Overlay Peers
    P         Number of SIPeers
    M         Number of users subscribing a global event
    J         Number of users subscribing a specific event
    K         Number of users subscribing an endpoint event
    L         Number of users subscribing an endpoint event with on-hold session
    j         Number of users notified when a specific event occurs

Figure 5. Main system variables

    Event                         Session Messages (SIP)   Overlay Messages (SEP)
    Peer Join/Leave               0                        O((log N)^2)
    Peer Put/Get                  0                        O(log N)
    User Registration             4                        O(log N)
    Global Event subscription     4                        0
    Global Event notification     2 + 2M                   O(N log N)
    Specific Event notification   2j                       O(N log N)
    Specific Event subscription   2                        O(log N)
    Endpoint Event subscription   2K                       0
    Endpoint Event notification   2K                       0
    Session Establishment         3K                       0
    Session Teardown              2K                       0
    Session Update                3L                       0

Figure 6. Complexity evaluation of the individual events in the architecture
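As a minimal sketch, the per-event costs of Fig. 6 can be written down as a small Python lookup table. The names `SIP_MESSAGES`, `OVERLAY_ORDER` and `event_cost` are illustrative and not part of LiSP. The overlay column is given in the paper only as an order of growth, so evaluating it with a base-2 logarithm and a constant factor of 1 is an assumption made here purely to get comparable numbers; the example variable values are hypothetical.

```python
import math

# Per-event SIP message counts, transcribed from Fig. 6 as functions of the
# variables in Fig. 5 (passed in as a dict such as {"N": ..., "M": ...}).
SIP_MESSAGES = {
    "Peer Join/Leave": lambda v: 0,
    "Peer Put/Get": lambda v: 0,
    "User Registration": lambda v: 4,
    "Global Event subscription": lambda v: 4,
    "Global Event notification": lambda v: 2 + 2 * v["M"],
    "Specific Event notification": lambda v: 2 * v["j"],
    "Specific Event subscription": lambda v: 2,
    "Endpoint Event subscription": lambda v: 2 * v["K"],
    "Endpoint Event notification": lambda v: 2 * v["K"],
    "Session Establishment": lambda v: 3 * v["K"],
    "Session Teardown": lambda v: 2 * v["K"],
    "Session Update": lambda v: 3 * v["L"],
}


def _log(n):
    # ASSUMPTION: base-2 logarithm; Fig. 6 only states the order of growth.
    return math.log2(n)


# Overlay (SEP) orders of growth from Fig. 6, evaluated with an ASSUMED
# constant factor of 1. Events that never touch the overlay cost 0.
OVERLAY_ORDER = {
    "Peer Join/Leave": lambda v: _log(v["N"]) ** 2,
    "Peer Put/Get": lambda v: _log(v["N"]),
    "User Registration": lambda v: _log(v["N"]),
    "Global Event subscription": lambda v: 0.0,
    "Global Event notification": lambda v: v["N"] * _log(v["N"]),
    "Specific Event notification": lambda v: v["N"] * _log(v["N"]),
    "Specific Event subscription": lambda v: _log(v["N"]),
    "Endpoint Event subscription": lambda v: 0.0,
    "Endpoint Event notification": lambda v: 0.0,
    "Session Establishment": lambda v: 0.0,
    "Session Teardown": lambda v: 0.0,
    "Session Update": lambda v: 0.0,
}


def event_cost(event, variables):
    """Return (SIP messages, overlay message estimate) for one event."""
    return SIP_MESSAGES[event](variables), OVERLAY_ORDER[event](variables)


# Hypothetical values, chosen only to exercise the table.
example = {"N": 1024, "M": 100, "j": 5, "K": 4, "L": 2}
for name in SIP_MESSAGES:
    sip, overlay = event_cost(name, example)
    print(f"{name:30s} SIP={sip:4d}  overlay~{overlay:.0f}")
```

Keeping the two columns in separate tables mirrors the split in Fig. 6 between exact SIP message counts and asymptotic overlay costs, so the assumed constants only affect the overlay side.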
Fig. 5 presents the main variables involved in the evaluation. Fig. 6 shows the complexity, for the SIP as well as for the Peer protocol, of the main events involved in the operation of the architecture, like a new user registration or a Buffer Map update notification. For its part, Fig. 7 presents the cost of the main operations in the architecture, i.e., the concatenation of a series of smaller events that necessarily happen together for the operation to succeed.

    Procedure       Events                                           Signalling Load
    New Channel     Global Event notification                        M + N log N
    New Relay       Specific Event subscription,                     j + N log N
                    Specific Event notification
    New Watcher     User Registration, Global Event subscription,    j + N log N + K + L
                    Specific Event subscription,
                    Specific Event notification,
                    Endpoint Event subscription,
                    Endpoint Event notification,
                    Session Establishment, Session Update
    New BufferMap   Endpoint Event notification                      K
    New Peer        Peer Join/Leave, Peer Put/Get                    (log N)^2 + log N

Figure 7. Complexity evaluation of the typical operations in LiSP

It must be remembered that Chord is taken as the example DHT for the architecture. Hence, the cost of storing or retrieving a piece of information in the DHT (like registering a new watcher or publishing a channel, say) is O(log N). Another important operation, the introduction of a new peer (fully dependent on the DHT complexity), is O((log N)^2). It follows that a global event like the introduction of a new relay station, which must be communicated to all peers to update the ListOfSources event, has a complexity of O(N log N). This is the dominant factor in the overall system's cost, which presents a slightly superlinear growth. However, it must be remembered that only peers perform such operations, and they represent the minority of the nodes. Hence, when looking at the signalling growth between the two chosen scenarios in the last rows of Fig. 8, it can be noticed that the increase in signalling load is strongly sublinear: for an increase in population of a factor 1000, the signalling load increase per node is closer to 100. Consequently, the system scales very well for large numbers of consumers, and not so well for large numbers of peers.

Another critical operation is the periodic interchange of Buffer Maps among watchers and relays. Since every watcher downloads that information from K relays, its cost is proportional to O(K), but K is small (between 1 and 10, typically) and constant with the number of watchers, M. Hence, the cost associated with the only periodic system operation is small, constant and bounded.

As a conclusion, the results show that the system presents very good scalability in terms of signalling load involved, which is the most critical requirement for a live streaming architecture.

To better exemplify the above results, numerical values for two particular cases have been recorded in Fig. 8. These cases have been further subdivided in two: the first is a static environment, where neither the number of peers nor of viewers varies (no churn). In the second, churn is an additional factor, repeatedly triggering a number of additional events, like ListOfSources updates. For this case, it is considered that the estimated watching time is equal to a typical movie's length, and that the rate of arrival and departure of nodes is equal and constant, and set to 1% for the large network and 10% for the smaller one. Furthermore, in all cases it is assumed that the video has only one layer and that all participating nodes are watching the same channel and cooperatively relaying its content to other nodes.

Under these assumptions, it can be seen that the overall signalling involved, for the endpoints as well as for the peer nodes, is large in numerical value but insignificant when considered per node and compared with the data transmission rate. Furthermore, the signalling load remains roughly constant with the size of the network, which proves its scalability. These values are consistent for the static as well as for the dynamic case (in presence of churn).

Using these numerical values, one can roughly estimate how powerful the peers and consumers should be. Considering a 3-way SIP dialog-creating request, i.e. three SIP messages to set up a session, it is possible to roughly estimate the number of simultaneous calls per second the peers
will have to process. For production operation, [?] suggests the following guideline for sizing server hardware to operate at 60% CPU utilization for some of the most common SIP software packages (OpenSER V1.2 and SER V2.0): 1 GHz of CPU processing capacity can manage 60 calls per second. In the small scenario, we can roughly estimate 33 calls per second per peer, a load that could be handled by 1 GHz of CPU processing capacity at about 30% CPU utilization. For the larger scenario, the number of simultaneous calls grows up to 333 calls per second per peer. A peer with two dual-core 3.0 GHz CPUs would effectively have (2 CPUs × 2 cores × 3 GHz per core) 12 GHz of CPU processing capacity. This server, hosting either OpenSER V1.2 or SER V2.0, would be able to manage this number of calls per second at approximately 30% CPU utilization.

For consumers, which are typically less powerful terminals, the situation is similar. Since the number of users subscribing to an endpoint event is constant, the number of simultaneous calls per second per consumer is exactly the same for both large and small scenarios: 3 calls per second. This means a required capacity lower than 100 MHz. Considering that current mobile devices offer at least a capacity of 250-350 MHz, this approach is suitable for existing devices.
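The sizing arithmetic above can be reproduced with a few lines of Python. This is only a sketch under the assumption that the quoted guideline scales linearly, i.e. that 60 calls per second per GHz corresponds to 60% CPU utilization and that utilization grows in proportion to the call rate. The function names `cpu_utilization` and `required_ghz` are illustrative and do not come from the cited guideline.

```python
CALLS_PER_GHZ = 60.0          # calls/s handled by 1 GHz, as quoted in the text
REFERENCE_UTILIZATION = 0.60  # CPU utilization at which that figure holds


def cpu_utilization(calls_per_second, capacity_ghz):
    """Estimated CPU utilization (0..1), ASSUMING linear scaling with load."""
    if capacity_ghz <= 0:
        raise ValueError("capacity_ghz must be positive")
    return REFERENCE_UTILIZATION * calls_per_second / (CALLS_PER_GHZ * capacity_ghz)


def required_ghz(calls_per_second, target_utilization=REFERENCE_UTILIZATION):
    """CPU capacity in GHz needed to stay at the target utilization."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    return REFERENCE_UTILIZATION * calls_per_second / (CALLS_PER_GHZ * target_utilization)


# Small scenario: 33 calls/s per peer on 1 GHz.
small_peer = cpu_utilization(33, 1.0)
# Large scenario: 333 calls/s per peer on two dual-core 3.0 GHz CPUs (12 GHz).
large_peer = cpu_utilization(333, 2 * 2 * 3.0)
# Consumer: 3 calls/s, capacity needed at the reference utilization, in MHz.
consumer_mhz = required_ghz(3) * 1000

print(f"small peer utilization: {small_peer:.1%}")
print(f"large peer utilization: {large_peer:.1%}")
print(f"consumer capacity:      {consumer_mhz:.0f} MHz")
```

Under this linear assumption the small-scenario peer comes out at about 33%, the large-scenario peer at about 28%, and a consumer at about 50 MHz, in line with the rounded figures of roughly 30% and less than 100 MHz quoted in the text.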
2 Leaving out the limitations in battery power, which are another related constraint for networked terminals, yet orthogonal to this discussion.

Figure 8. Signalling load for a small and a large streaming network

As a conclusion, this first estimation of the complexity and signalling load involved in our architecture shows very promising results, which support our belief that it can grow to very large sizes without severe penalty: with commodity, existing software and hardware one can build nodes able to host very large

7. Conclusions and Future work

This work has presented a layered P2PSIP-based architecture for live video streaming with flexible application logic placement.

In this document, a new control plane based on P2PSIP has been designed and particularized for live video streaming with flexible application logic placement. This new control plane is based upon the SEP protocol, a novel draft protocol designed to be used jointly with SIP in P2PSIP scenarios, and which is currently being discussed at the IETF. However, the main architectural characteristics are common to all the protocol proposals at the P2PSIP WG and therefore the results presented here can safely be extrapolated to all other draft protocols under discussion.

SEP presents a layered architecture, with consumers, clients and peers performing increasingly complex roles in the maintenance of the overlay and the management of the network resources: Consumers are unaware of any overlay, and act as classical SIP User Agents. Their interface to the
clients and the peers is standard SIP. Clients do know about the overlay, and act as intermediate nodes between the consumers and the peers, if need be. Furthermore, they provide extra storage capacity to the peers, and basically can change their role from peer to client depending on resource availability. Peers are the members of the overlay, and the only ones that participate in its maintenance. Only the peers communicate through SEP.

The last part of the document has been devoted to describing in detail how SIP and SEP would be combined in an overall architecture to provide the live streaming service. The role of every node, as well as different application scenarios (powerful nodes, mobile nodes with battery and CPU restrictions, large or small groups, etc.) have been explored. The cost analysis shows that the architecture can safely scale to very large sizes, even in the presence of sustained and heavy churn, which is a requirement for commercial success of such platforms.

The proposed control plane and live streaming architecture present a number of advantages: a fully decentralized architecture, as is expected of a P2P-based system, based completely on standardized protocols (or protocols that are being standardized right now, to be more precise). This brings with it the additional advantage of interoperability with existing SIP-based applications and services, like IM, videoconferencing, online gaming, etc. and their corresponding commercial or open-source products.

The next steps will take the direction of prototyping the proposed control plane while exploring different scenarios. In particular, the "Overlay does most" scenario, which gives the network operator a much stronger involvement in the service provision, will be analyzed analogously and compared to the "Consumer does most" case presented here. The subsequent tests and trials will help to refine and improve the architecture presented here, and hopefully pose new questions that will drive our research further.

References

D. Bryan, P. Matthews, E. Shim, and D. Willis. Concepts and terminology for peer to peer sip, Nov. 2007. INTERNET-DRAFT draft-ietf-p2psip-concepts-01 (Work in progress).
D. A. Bryan, B. B. Lowekamp, and C. Jennings. SOSIMPLE: A serverless, standards-based, P2P SIP communication system. In Proceedings of the AAA-IDEA 2005, June 2005.
B. Campbell, R. Mahy, and C. Jennings. The Message Session Relay Protocol (MSRP). RFC 4975 (Proposed Standard), Sept. 2007.
M. Castro, P. Druschel, A. Kermarrec, A. Nandi, A. Rowstron, and A. Singh. Splitstream: High-bandwidth multicast in cooperative environments, 2003.
M. e. a. Castro. Peer-to-peer overlays: Structured, unstructured, or both? Microsoft Research, Tech. Rep. MSR-TR-2004-73, Cambridge, UK, 2004.
Y.-H. Chu, S. G. Rao, and H. Zhang. A case for end system multicast. In Measurement and Modeling of Computer Systems, pages 1–12, 2000.
M. G.-M. et al. A session description protocol (sdp) offer/answer mechanism to enable file transfer, Mar. 2008. INTERNET-DRAFT draft-ietf-mmusic-file-transfer-mech-07 (Work in progress).
J. Jannotti, D. K. Gifford, K. L. Johnson, M. F. Kaashoek, and J. W. O'Toole, Jr. Overcast: Reliable multicasting with an overlay network. pages 197–212.
X. Jiang, H. Zheng, C. Macian, and V. Pascual. Service extensible p2p peer protocol (sep), Feb. 2008. INTERNET-DRAFT draft-jiang-p2psip-sep-01 (Work in progress).
D. Kostic, A. Rodriguez, J. Albrecht, and A. Vahdat. Bullet: High bandwidth data dissemination using an overlay mesh, 2003.
B. Li and H. Yin. Peer-to-peer live video streaming on the internet: issues, existing approaches, and challenges [peer-to-peer multimedia streaming]. Communications Magazine, IEEE, 45(6):94–99, June 2007.
X. Liao, H. Jin, Y. Liu, L. M. Ni, and D. Deng. Anysee: Peer-to-peer live streaming. INFOCOM 2006. 25th IEEE International Conference on Computer Communications. Proceedings, pages 1–10, April 2006.
V. Pai, K. Kumar, K. Tamilmani, V. Sambamurthy, and A. Mohr. Chainsaw: Eliminating trees from overlay multicast, 2005.
F. e. a. Pianese. PULSE: An adaptive, incentive-based, unstructured P2P live streaming system. IEEE Transactions on Multimedia, 9(8), December 2007.
J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler. SIP: Session Initiation Protocol. RFC 3261 (Proposed Standard), June 2002.
E. Setton, P. Baccichet, and B. Girod. Peer-to-peer live multicast: A video perspective. Proceedings of the IEEE, 96(1):25–38, Jan. 2008.
D. Xu, M. Hefeeda, S. Hambrusch, and B. Bhargava. On peer-to-peer media streaming, 2002.
D. e. a. Yang. MPSS: A Multi-agents Based P2P-SIP Real Time Stream Sharing System, volume 4088 of LNCS Series, pages 398–408. Springer Verlag, 2006.
W.-P. Yiu, X. Jin, and S.-H. Chan. Challenges and approaches in large-scale p2p media streaming. Multimedia, IEEE, 14(2):50–59, April-June 2007.
X. Zhang, J. Liu, B. Li, and Y.-S. Yum. Coolstreaming/donet: a data-driven overlay network for peer-to-peer live media streaming. INFOCOM 2005. 24th Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings IEEE, 3:2102–2111 vol. 3, 13-17 March 2005.