IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEB 2011

        Live Streaming with Receiver-based Peer-division Multiplexing
                Hyunseok Chang, Sugih Jamin, Wenjie Wang




   Abstract—A number of commercial peer-to-peer systems for live streaming
have been introduced in recent years. The behavior of these popular systems
has been extensively studied in several measurement papers. Due to the
proprietary nature of these commercial systems, however, these studies have
to rely on a “black-box” approach, where packet traces are collected from a
single or a limited number of measurement points, to infer various
properties of traffic on the control and data planes. Although such studies
are useful for comparing different systems from an end user’s perspective,
it is difficult to intuitively understand the observed properties without
fully reverse-engineering the underlying systems. In this paper we describe
the network architecture of Zattoo, one of the largest production live
streaming providers in Europe at the time of writing, and present a
large-scale measurement study of Zattoo using data collected by the
provider. To highlight, we found that even when the Zattoo system was
heavily loaded with as many as 20,000 concurrent users on a single overlay,
the median channel join delay remained less than 2 to 5 seconds, and that,
for a majority of users, the streamed signal lags the over-the-air broadcast
signal by no more than 3 seconds.
   Index Terms—Peer-to-peer system, live streaming, network architecture

   H. Chang is with Alcatel-Lucent Bell Labs, Holmdel, NJ 07733 USA
(e-mail: hyunseok.chang@alcatel-lucent.com).
   S. Jamin is with EECS Department, University of Michigan, Ann Arbor,
MI 48109 USA (e-mail: jamin@eecs.umich.edu).
   W. Wang is with IBM Research, CRL, Beijing 100193, China (e-mail:
wenjwang@cn.ibm.com).
   This work was done when authors Chang and Wang were at Zattoo Inc.

                         I. INTRODUCTION

   THERE is an emerging market for IPTV. Numerous commercial systems now
offer services over the Internet that are similar to traditional
over-the-air, cable, or satellite TV. Live television, time-shifted
programming, and content-on-demand are all presently available over the
Internet. Increased broadband speed, growth of the broadband subscription
base, and improved video compression technologies have contributed to the
emergence of these IPTV services.
   We draw a distinction between three uses of peer-to-peer (P2P) networks:
delay-tolerant file download of archival material, delay-sensitive
progressive download (or streaming) of archival material, and real-time
live streaming. In the first case, the completion of download is elastic,
depending on available bandwidth in the P2P network. The application buffer
receives data as it trickles in and informs the user upon the completion of
download. The user can then start playing back the file for viewing in the
case of a video file. BitTorrent and its variants are examples of
delay-tolerant file download systems. In the second case, video playback
starts as soon as the application assesses it has sufficient data buffered
that, given the estimated download rate and the playback rate, it will not
deplete the buffer before the end of file. If this assessment is wrong, the
application would have to either pause playback and rebuffer, or slow down
playback. While users would like playback to start as soon as possible, the
application has some degree of freedom in trading off playback start time
against estimated network capacity. Most video-on-demand systems are
examples of delay-sensitive progressive-download applications. The third
case, real-time live streaming, has the most stringent delay requirement.
While progressive download may tolerate initial buffering of tens of
seconds or even minutes, live streaming generally cannot tolerate more than
a few seconds of buffering. Taking into account the delay introduced by
signal ingest and encoding, and network transmission and propagation, the
live streaming system can introduce only a few seconds of buffering time
end-to-end and still be considered “live” [1].
   The Zattoo peer-to-peer live streaming system was a free-to-use network
serving over 3 million registered users in eight European countries at the
time of study, with a maximum of over 60,000 concurrent users on a single
channel. The system delivers live streams using a receiver-based,
peer-division multiplexing scheme as described in Section II. To ensure
real-time performance when peer uplink capacity is below requirement,
Zattoo subsidizes the network’s bandwidth requirement, as described in
Section III. After delving into Zattoo’s architecture in detail, we study
in Sections IV and V large-scale measurements collected during the live
broadcast of the UEFA European Football Championship, one of the most
popular one-time events in Europe, in June 2008 [2]. During the course of
the month of June 2008, Zattoo served more than 35 million sessions to more
than one million distinct users. Drawing from these measurements, we report
on the operational scalability of Zattoo’s live streaming system along
several key issues:
  1) How does the system scale in terms of overlay size and its
     effectiveness in utilizing peers’ uplink bandwidth?
  2) How responsive is the system during channel switching, for example,
     when compared to the 3-second channel switch time of satellite TV?
  3) How effective is the packet retransmission scheme in allowing a peer
     to recover from transient congestion?
  4) How effective is the receiver-based peer-division multiplexing scheme
     in delivering synchronized sub-streams?
  5) How effective is the global bandwidth subsidy system in provisioning
     for flash crowd scenarios?
  6) Would a peer farther away from the stream source experience an
     adversely long lag compared to a peer closer to the stream source?
  7) How effective is the error-correcting code in isolating packet losses
     on the overlay?

We also discuss in Section VI several challenges in increasing the
bandwidth contribution of Zattoo peers. Finally, we describe related works
in Section VII and conclude in Section VIII.

                    II. SYSTEM ARCHITECTURE

   The Zattoo system rebroadcasts live TV, captured from satellites, onto
the Internet. The system carries each TV channel on a separate peer-to-peer
delivery network and is not limited in the number of TV channels it can
carry. Although a peer can freely switch from one TV channel to another,
thereby departing and joining different peer-to-peer networks, it can only
join one peer-to-peer network at any one time. We henceforth limit our
description of the Zattoo delivery network as it pertains to carrying one
TV channel. Fig. 1 shows a typical setup of a single TV channel carried on
the Zattoo network. TV signal captured from satellite is encoded into
H.264/AAC streams, encrypted, and sent onto the Zattoo network. The
encoding server may be physically separated from the server delivering the
encoded content onto the Zattoo network. For ease of exposition, we will
consider the two as logically co-located on an Encoding Server. Users are
required to register themselves at the Zattoo website to download a free
copy of the Zattoo player application. To receive the signal of a channel,
the user first authenticates itself to the Zattoo Authentication Server.
Upon authentication, the user is granted a ticket with limited lifetime.
The user then presents this ticket, along with the identity of the TV
channel of interest, to the Zattoo Rendezvous Server. If the ticket
specifies that the user is authorized to receive the signal of the said TV
channel, the Rendezvous Server returns to the user a list of peers
currently joined to the P2P network carrying the channel, together with a
signed channel ticket. If the user is the first peer to join a channel, the
list of peers it receives contains only the Encoding Server. The user joins
the channel by contacting the peers returned by the Rendezvous Server,
presenting its channel ticket, and obtaining the live stream of the channel
from them (see Section II-A for details).

Fig. 1. Zattoo delivery network architecture. (Figure labels: Encoding
Servers, Demultiplexer, Authentication Server, Rendezvous Server, Feedback
Server, other admin servers.)

   Each live stream is sent out by the Encoding Server as n logical
sub-streams. The signal received from satellite is encoded into a
variable-bit-rate stream. During periods of source quiescence, no data is
generated. During source busy periods, generated data is packetized into a
packet stream, with each packet limited to a maximum size. The Encoding
Server multiplexes this packet stream onto the Zattoo network as n logical
sub-streams. Thus the first packet generated is considered part of the
first sub-stream, the second packet that of the second sub-stream, the n-th
packet that of the n-th sub-stream. The (n+1)-th packet cycles back to the
first sub-stream, etc., such that the i-th sub-stream carries the (mn+i)-th
packets, where m ≥ 0, 1 ≤ i ≤ n, and n is a user-defined constant. We call
a set of n packets with the same index multiplier m a “segment.” Thus m
serves as the segment index, while i serves as the packet index within a
segment. Each segment is of size n packets. Being the packet index, i also
serves as the sub-stream index. The number mn + i is carried in each packet
as its sequence number.
   Zattoo uses the Reed-Solomon (RS) error-correcting code (ECC) for
forward error correction. The RS code is a systematic code: of the n
packets sent per segment, k < n packets carry the live stream data while
the remainder carries the redundant data [3, Section 7.3]. Due to the
variable-bit-rate nature of the data stream, the time period covered by a
segment is variable, and a packet may be of size less than the maximum
packet size. A packet smaller than the maximum packet size is zero-padded
to the maximum packet size for the purposes of computing the (shortened) RS
code, but is transmitted in its original size. Once a peer has received k
packets per segment, it can reconstruct the remaining n − k packets. We do
not differentiate between streaming data and redundant data in our
discussion in the remainder of this paper.
   When a new peer requests to join an existing peer, it specifies the
sub-stream(s) it would like to receive from the existing peer. These
sub-streams do not have to be consecutive. Contingent upon availability of
bandwidth at existing peers, the receiving peer decides how to multiplex a
stream onto its set of neighboring peers, giving rise to our description of
the Zattoo live streaming protocol as a receiver-based, peer-division
multiplexing protocol. The details of peer-division multiplexing are
described in Section II-A while the details of how a peer manages
sub-stream forwarding and stream reconstruction are described in Section
II-B. Receiver-based peer-division multiplexing has also been used by the
latest version of the CoolStreaming peer-to-peer protocol, though it
differs from Zattoo in its stream management (Section II-B) and adaptive
behavior (Section II-C) [4].

A. Peer-Division Multiplexing
   To minimize per-packet processing time of a stream, the Zattoo protocol
sets up a virtual circuit with multiple fan outs at each peer. When a peer
joins a TV channel, it establishes a peer-division multiplexing (PDM)
scheme amongst a set of neighboring peers, by building a virtual circuit to
each of the
neighboring peers. Barring departure or performance degradation of a
neighbor peer, the virtual circuits are maintained until the joining peer
switches to another TV channel. With the virtual circuits set up, each
packet is forwarded without further per-packet handshaking between peers.
We describe the PDM bootstrapping mechanism in this section and the
adaptive PDM mechanism to handle peer departure and performance degradation
in Section II-C.
   The PDM establishment process consists of two phases: the search phase
and the join phase. In the search phase, the new, joining peer determines
its set of potential neighbors. In the join phase, the joining peer
requests peering relationships with a subset of its potential neighbors.
Upon acceptance of a peering relationship request, the peers become
neighbors and a virtual circuit is formed between them.
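
   As a rough illustration of what the two phases accomplish (the
paragraphs that follow give the protocol details), the Python sketch below
divides n sub-streams among potential neighbors. It is not Zattoo's
implementation. It assumes that each SEARCH reply has been reduced to a
single number, the count of sub-streams the replying peer offered to
forward, and that a replying peer can forward any sub-stream up to that
count. The names full_mask and assign_substreams and the round-robin rule
are illustrative assumptions.

```python
def full_mask(n):
    """Bitmask with one bit per sub-stream; all n bits set means the
    joining peer is still looking for every sub-stream."""
    return (1 << n) - 1


def assign_substreams(n, capacities):
    """Divide sub-streams 1..n among potential neighbors.

    capacities maps a peer id to the number of sub-streams that peer
    offered to forward (an assumed summary of its SEARCH reply).
    Returns {peer: bitmask of assigned sub-streams}, or None when the
    offers cannot cover all n sub-streams, in which case the joining
    peer would need another SEARCH round.
    """
    if sum(capacities.values()) < n:
        return None
    remaining = dict(capacities)
    masks = {peer: 0 for peer in capacities}
    peers = sorted(capacities)  # fixed order keeps the example deterministic
    idx = 0
    for sub in range(1, n + 1):
        # Round-robin over peers, skipping any whose offer is used up.
        while remaining[peers[idx % len(peers)]] == 0:
            idx += 1
        peer = peers[idx % len(peers)]
        masks[peer] |= 1 << (sub - 1)  # bit (sub - 1) stands for sub-stream sub
        remaining[peer] -= 1
        idx += 1
    return masks
```

   With two potential neighbors that can each forward the full stream,
assign_substreams(8, {"a": 8, "b": 8}) gives each peer half of the
sub-streams, matching the 1/l split described for the join phase; the
bitmasks it returns are the kind of request a JOIN message would carry
under these assumptions.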
   Search phase. To obtain a list of potential neighbors, a joining peer
sends out a SEARCH message to a random subset of the existing peers
returned by the Rendezvous Server. The SEARCH message contains the
sub-stream indices for which this joining peer is looking for peering
relationships. The sub-stream indices are usually represented as a bitmask
of n bits, where n is the number of sub-streams defined for the TV channel.
In the beginning, the joining peer will be looking for peering
relationships for all sub-streams and have all the bits in the bitmask
turned on. In response to a SEARCH message, an existing peer replies with
the number of sub-streams it can forward. From the returning SEARCH
replies, the joining peer constructs a set of potential neighbors that
covers the full set of sub-streams comprising the live stream of the TV
channel. The joining peer continues to wait for SEARCH replies until the
set of potential neighbors contains at least a minimum number of peers, or
until all SEARCH replies have been received. With each SEARCH reply, the
existing peer also returns a random subset of its known peers. If a joining
peer cannot form a set of potential neighbors that covers all of the
sub-streams of the TV channel, it initiates another SEARCH round, sending
SEARCH messages to peers newly learned from the previous round. The joining
peer gives up if it cannot obtain the full stream after two SEARCH rounds.
To help the joining peer synchronize the sub-streams it receives from
multiple peers, each existing peer also indicates for each sub-stream the
latest sequence number it has received for that sub-stream, and the
existence of any quality problem. The joining peer can then choose
sub-streams with good quality that are closely synchronized.
   Join phase. Once the set of potential neighbors is established, the
joining peer sends JOIN requests to each potential neighbor. The JOIN
request lists the sub-streams for which the joining peer would like to
construct a virtual circuit with the potential neighbor. If a joining peer
has l potential neighbors, each willing to forward it the full stream of a
TV channel, it would typically choose to have each forward only 1/l-th of
the stream, to spread out the load amongst the peers and to speed up error
recovery, as described in Section II-C. In selecting which of the potential
neighbors to peer with, the joining peer gives highest preference to
topologically close-by peers, even if these peers have less capacity or
carry lower-quality sub-streams. The “topological” location of a peer is
defined to be its subnet number, autonomous system (AS) number, and country
code, in that order of precedence. A joining peer obtains its own
topological location from the Zattoo Authentication Server as part of its
authentication process. The lists of peers returned by the Rendezvous
Server and by potential neighbors all come attached with topological
locations. A topology-aware overlay not only allows us to be
“ISP-friendly,” by minimizing inter-domain traffic and thus saving on
transit bandwidth cost, but also helps reduce the number of physical links
and metro hops traversed in the overlay network, potentially resulting in
enhanced user-perceived stream quality.

B. Stream Management
   We represent a peer as a packet buffer, called the IOB, fed by
sub-streams incoming from the PDM constructed as described in Section
II-A.¹ The IOB drains to (1) a local media player if one is running, (2) a
local file if recording is supported, and (3) potentially other peers.
Fig. 2 depicts a Zattoo player application with virtual circuits
established to four peers. As packets from each sub-stream arrive at the
peer, they are stored in the IOB for reassembly to reconstruct the full
stream. Portions of the stream that have been reconstructed are then played
back to the user. In addition to providing a reassembly area, the IOB also
allows a peer to absorb some variabilities in available network bandwidth
and network delay.

   ¹ In the case of the Encoding Server, which we also consider a peer on
the Zattoo network, the buffer is fed by the encoding process.

Fig. 2. Zattoo peer with IOB. (Figure labels: PDM; Zattoo Peer and Player
Application.)

   The IOB is referenced by an input pointer, a repair pointer, and one or
more output pointers. The input pointer points to the slot in the IOB where
the next incoming packet with sequence number higher than the highest
sequence number
received so far will be stored. The repair pointer always points one slot
beyond the last packet received in order and is used to regulate packet
retransmission and adaptive PDM as described later. We assign an output
pointer to each forwarding destination. The output pointer of a destination
indicates the destination’s current forwarding horizon on the IOB. In
accordance with the three types of possible forwarding destinations listed
above, we have three types of output pointers: player pointer, file
pointer, and peer pointer. One would typically have at most one player
pointer and one file pointer but potentially multiple concurrent peer
pointers, referencing an IOB. The Zattoo player application does not
currently support recording.

Fig. 3. Packet map associated with a peer pointer. (Figure labels: columns
1, 4, 9, 14, ..., n; rows segment 0 to segment m-1; slot states SKIP, NEED,
SENT, READY.)
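
   The roles of the input pointer and the repair pointer can be sketched
in sequence-number space. The Python below is a simplified model, not the
Zattoo implementation: it ignores the buffer layout and wraparound, and it
assumes the input pointer can be modeled as the highest sequence number
seen plus one. The class name PointerTracker and its methods are
hypothetical.

```python
class PointerTracker:
    """Tracks an input pointer and a repair pointer over sequence numbers."""

    def __init__(self, first_seq):
        self.received = set()
        # Assumed model: slot for the next packet newer than any seen so far.
        self.input_ptr = first_seq
        # One slot beyond the last packet received in order.
        self.repair_ptr = first_seq

    def on_packet(self, seq):
        """Record an arriving packet and advance both pointers."""
        self.received.add(seq)
        if seq >= self.input_ptr:
            self.input_ptr = seq + 1
        # Move past every packet that is now contiguous with earlier ones.
        while self.repair_ptr in self.received:
            self.repair_ptr += 1

    def gap(self):
        """Distance between the input and repair pointers, in packet slots."""
        return self.input_ptr - self.repair_ptr
```

   After packets 1, 2, 4, and 5 arrive, the repair pointer rests on the
missing packet 3 while the input pointer has moved on to 6. The distance
between the two is the kind of quantity a peer could compare against a
threshold when deciding whether to ask for retransmission.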
                                                                                                                                                         Map
   Since we maintain the IOB as a circular buffer, if the                 segment 0

incoming packet rate is higher than the forwarding rate of a                  ...
                                                                                                                                                Peer0
particular destination, the input pointer will overrun the output                                                                                        Packet
                                                                                                                                                         Map
pointer of that destination. We could move the output pointer           segment m-1

to match the input pointer so that we consistently forward the                                             sub-buffer 0                         Peer1
                                                                                                                                                         Packet
oldest packet in the IOB to the destination. Doing so, however,                                                                                          Map
                                                                          segment 0
requires checking the input pointer against all output pointers
on every packet arrival. Instead, we have implemented the IOB           input
                                                                              ...
                                                                                                                                                Player
                                                                                                                                                         Packet
as a double buffer. With the double buffer, the positions of the        pointer                                                                          Map

                                                                        segment m-1
output pointers are checked against that of the input pointer
                                                                                                           sub-buffer 1
only when the input pointer moves from one sub-buffer to the
other. When the input pointer moves from sub-buffer a to sub-                                  received packet                   empty slot

buffer b, all the output pointers still pointing to sub-buffer b are
                                                                       Fig. 4.      IOB, input/output pointers and packet maps.
moved to the start of sub-buffer a and sub-buffer b is flushed,
ready to accept new packets. When a sub-buffer is flushed
while there are still output pointers referencing it, packets that
have not been forwarded to the destinations associate with             The player pointer behaves the same as a peer pointer except
those pointers are lost to them, resulting in quality degradation.     that all packets in its packet map will always start out marked
To minimize packet lost due to sub-buffer flushing, we would            NEEDed.
like to use large sub-buffers. However, the real-time delay               Fig. 4 shows an IOB consisting of a double buffer, with an
requirement of live streaming limits the usefulness of late            input pointer, a repair pointer, and an output file pointer, an
arriving packets and effectively puts a cap on the maximum             output player pointer, and two output peer pointers referencing
size of the sub-buffers.                                               the IOB. Each output pointer has a packet map associated
   Different peers may request for different numbers of, possi-        with it. For the scenario depicted in the figure, the player
bly non-consecutive, sub-streams. To accommodate the differ-           pointer tracks the input pointer and has skipped over some
ent forwarding rates and regimes required by the destinations,         lost packets. Both peer pointers are lagging the input pointer,
we associate a packet map and forwarding discipline with each          indicating that the forwarding rates to the peers are bandwidth
output pointer. Fig. 3 shows the packet map associated with an         limited. The file pointer is pointing at the first lost packet.
output peer pointer where the peer has requested sub-streams           Archiving a live stream to file does not impose real-time delay
1, 4, 9, and 14. Every time a peer pointer is repositioned             bound on packet arrivals. To achieve the best quality recording
to the beginning of a sub-buffer of the IOB, all the packet            possible, a recording peer always waits for retransmission of
slots of the requested sub-streams are marked NEEDed and               lost packets that cannot be recovered by error correction.
all the slots of the sub-streams not requested by the peer are            In addition to achieving lossless recording, we use re-
marked SKIP. When a NEEDed packet arrives and is stored                transmission to let a peer recover from transient network
in the IOB, its state in the packet map is changed to READY.           congestion. A peer sends out a retransmission request when
As the peer pointer moves along its associated packet map,             the distance between the repair pointer and the input pointer
READY packets are forwarded to the peer and their states               has reached a threshold of R packet slots, usually spanning
changed to SENT. A slot marked NEEDed but not READY,                   multiple segments. A retransmission request consists of an R-
such as slot n + 4 in Fig. 3, indicates that the packet is lost        bit packet mask, with each bit representing a packet, and the
or will arrive out-of-order and is bypassed. When an out-of-           sequence number of the packet corresponding to the first bit.
order packet arrives, its slot is changed to READY and the             Marked bits in the packet mask indicate that the corresponding
peer pointer is reset to point to this slot. Once the out-of-order     packets need to be retransmitted. When a packet loss is
packet has been sent to the peer, the peer pointer will move           detected, it could be caused by congestion on the virtual
forward, bypassing all SKIP, NEEDed, and SENT slots until it           circuits forming the current PDM or congestion on the path
reaches the next READY slot, where it can resume sending.              beyond the neighboring peers. In either case, current neighbor
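The packet map discipline described above can be sketched as a small state machine. This is a minimal illustration, not Zattoo's implementation: the class and method names are hypothetical, the mapping of slot k to sub-stream k mod n (with zero-based sub-stream numbers) is an assumption, and so is the use of the input pointer as the limit beyond which the peer pointer does not advance.

```python
from enum import Enum


class Slot(Enum):
    SKIP = "SKIP"      # sub-stream not requested by this peer
    NEED = "NEEDed"    # requested, packet not yet stored in the IOB
    READY = "READY"    # requested, packet stored in the IOB
    SENT = "SENT"      # packet already forwarded to the peer


class PacketMap:
    """Hypothetical packet map for one output peer pointer."""

    def __init__(self, num_slots, num_substreams, requested):
        # Assumption: slot k carries sub-stream (k % num_substreams).
        self.slots = [
            Slot.NEED if k % num_substreams in requested else Slot.SKIP
            for k in range(num_slots)
        ]
        self.pointer = 0
        self.forwarded = []  # slot indices in the order they were sent

    def on_arrival(self, k):
        """Record that the packet for slot k has been stored in the IOB."""
        if self.slots[k] is Slot.NEED:
            self.slots[k] = Slot.READY
            if k < self.pointer:
                # Out-of-order arrival: reset the pointer to this slot.
                self.pointer = k

    def advance(self, input_pointer):
        """Forward READY packets in slots before input_pointer."""
        while self.pointer < input_pointer:
            if self.slots[self.pointer] is Slot.READY:
                self.slots[self.pointer] = Slot.SENT
                self.forwarded.append(self.pointer)
            # SKIP, NEEDed, and SENT slots are bypassed.
            self.pointer += 1
```

In this sketch a late packet is sent as soon as the pointer is reset to its slot, after which the pointer skips over the slots it has already served.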
peers will not be good sources of retransmitted packets. Hence                  a virtual circuit. To reduce the instability introduced into the
we send our retransmission requests to r random peers that are                  network, a peer closes first the virtual circuit carrying the
not neighbor peers. A peer receiving a retransmission request                   smallest number of sub-streams.
will honor the request only if the requested packets are still                     A peer attempts to increase its available uplink bandwidth
in its IOB and it has sufficient left-over capacity, after serving               estimate periodically if it has fully utilized its current estimate
its current peers, to transmit all the requested packets. Once a                of available uplink bandwidth without triggering any bad
retransmission request is accepted, the peer will retransmit all                quality feedback from neighboring peers. A peer doubles the
the requested packets to completion.                                            estimated available uplink bandwidth if the current estimate is
                                                                                below a threshold, switching to linear increase above the
                                                                                threshold, similar to how TCP maintains its congestion win-
C. Adaptive PDM
                                                                                dow size. A peer also increases its estimate of available uplink
   While we rely on packet retransmission to recover from                       bandwidth if a neighbor peer departs the network without any
transient congestions, we have two channel capacity adjust-                     bad quality feedback.
ment mechanisms to handle longer-term bandwidth fluctua-                            When the repair pointer lags behind the input pointer by R
tions. The first mechanism allows a forwarding peer to adapt                     packet slots, in addition to initiating a retransmission request,
the number of sub-streams it will forward given its current                     a peer also computes a loss rate over the R packets. If the
available bandwidth, while the second allows the receiving                      loss rate is above a threshold, the peer considers the neighbor
peer to switch provider at the sub-stream level.                                slow and attempts to reconfigure its PDM. In reconfiguring
   Peers on the Zattoo network can redistribute a highly                        its PDM, a peer attempts to shift half of the sub-streams
variable number of sub-streams, reflecting the high variability                  currently forwarded by the slow neighbor to other existing
in uplink bandwidth of different access network technologies.                   neighbors. At the same time, it searches for new peer(s) to
For a full-stream consisting of sixteen constant-bit rate sub-                  forward these sub-streams. If new peer(s) are found, the load
streams, our prior study shows that, based on realistic peer                    will be shifted from existing neighbors to the new peer(s).
characteristics measured from the Zattoo network, half of the                   If sub-streams from the slow neighbor continue to suffer
peers can support less than half of a stream, 82% of peers can                  after the reconfiguration of the PDM, the peer will drop the
support less than a full-stream, and the remainder can support                  neighbor completely and initiate another reconfiguration of the
up to ten full streams (peers that can redistribute more than                   PDM. When a peer loses a neighbor due to reduced available
a full stream are conventionally known as supernodes in the                     uplink bandwidth at the neighbor or due to neighbor departure,
literature) [5]. With variable-bit rate streams, the bandwidth                  it also initiates a PDM reconfiguration. A peer may also
carried by each sub-stream is also variable. To increase peer                   initiate a PDM reconfiguration to switch to a topologically
bandwidth usage, without undue degradation of service, we                       closer peer. Similar to the PDM establishment process, PDM
instituted measurement-based admission control at each peer.                    reconfiguration is accomplished by peers exchanging sub-
In addition to controlling resource commitment, another goal                    stream bitmasks in a request/response handshake, with each
of the measurement-based admission control module is to                         bit of the bitmask representing a sub-stream. During and after
continually estimate the amount of available uplink bandwidth                   a PDM reconfiguration, slow neighbor detection is disabled
at a peer.                                                                      for a short period of time to allow for the system to stabilize.
   The amount of available uplink bandwidth at a peer is
initially estimated by the peer sending a pair of probe packets                         III. GLOBAL BANDWIDTH SUBSIDY SYSTEM
to Zattoo’s Bandwidth Estimation Server. Once a peer starts                        Each peer on the Zattoo network is assumed to serve a
forwarding sub-streams to other peers, it will receive from                     user through a media player, which means that each peer
those peers quality-of-service feedback that informs its update                 must receive, and can potentially forward, all n sub-streams
of available uplink bandwidth estimate. A peer sends quality-                   of the TV channel the user is watching. The limited redistri-
of-service feedback only if the quality of a sub-stream drops                   bution capacity of peers on the Zattoo network means that
below a certain threshold.2 Upon receiving quality feedback                     a typical client can contribute only a fraction of the sub-
from multiple peers, a peer first determines if the identified                    streams that make up a channel. This shortage of bandwidth
sub-streams are arriving in low quality. If so, the low quality of              leads to a global bandwidth deficit in the peer-to-peer net-
service may not be caused by limit on its own available uplink                  work. Whereas bittorrent-like delay-tolerant file downloads or
bandwidth; in which case, it ignores the low quality feedback.                  the delay-sensitive progressive download of video-on-demand
Otherwise, the peer decrements its estimate of available uplink                 applications can mitigate such global bandwidth shortage by
bandwidth. If the new estimate is below the bandwidth needed                    increasing download time, a live streaming system such as
to support the existing number of virtual circuits, the peer closes             Zattoo’s must subsidize the bandwidth shortfall to provide
                                                                                real-time delivery guarantee.
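The uplink bandwidth estimate adaptation of Section II-C (doubling below a threshold, linear increase above it, a decrement on valid bad-quality feedback, and closing the virtual circuit with the fewest sub-streams first) might be sketched as follows. All function names, the fixed step and decrement sizes, and the repetition of circuit closing until the load fits are assumptions made for illustration; the paper does not specify them.

```python
def increase_estimate(estimate_kbps, threshold_kbps, step_kbps):
    """Raise the estimate: double below the threshold, linear above it."""
    if estimate_kbps < threshold_kbps:
        return estimate_kbps * 2
    return estimate_kbps + step_kbps  # step size is an assumed parameter


def on_quality_feedback(estimate_kbps, own_input_degraded, decrement_kbps):
    """React to bad-quality feedback from downstream peers.

    If the identified sub-streams already arrive in low quality at this
    peer, the problem is upstream and the feedback is ignored.
    """
    if own_input_degraded:
        return estimate_kbps
    # Decrement size is an assumed parameter.
    return max(0, estimate_kbps - decrement_kbps)


def circuits_to_close(circuits, estimate_kbps, substream_kbps):
    """Pick virtual circuits to close, fewest sub-streams first.

    `circuits` maps a circuit name to its number of sub-streams.
    Assumption: circuits are closed one at a time until the remaining
    load fits within the estimate.
    """
    remaining = dict(circuits)
    closed = []
    while remaining and sum(remaining.values()) * substream_kbps > estimate_kbps:
        victim = min(remaining, key=remaining.get)
        closed.append(victim)
        del remaining[victim]
    return closed
```

For example, with circuits carrying 4, 1, and 2 sub-streams of 40 kbps each and an estimate of 200 kbps, the sketch closes the 1-sub-stream circuit and then the 2-sub-stream circuit.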
   2 Depending on a peer’s NAT and/or firewall configuration, Zattoo uses
                                                                                   Zattoo’s Global Bandwidth Subsidy System (or simply, the
either UDP or TCP as the underlying transport protocol. The quality of a sub-
stream is measured differently for UDP and TCP. A packet is considered lost     Subsidy System) consists of a global bandwidth monitoring
under UDP if it doesn’t arrive within a fixed threshold. The quality measure     subsystem, a global bandwidth forecasting and provisioning
for UDP is computed as a function of both the packet loss rate and the burst    subsystem, and a pool of Repeater nodes. The monitoring
error rate (number of contiguous packet losses). The quality measure for TCP
is defined to be how far behind a peer is, relative to other peers, in serving   subsystem continuously monitors the global bandwidth re-
its sub-streams.                                                                quirement of a channel. The forecasting and provisioning
subsystem projects global bandwidth requirement based on             •   Stable: the ratio U has remained within [S−, S+] for
measured history and allocates Repeater nodes to the chan-               the past Ts reporting periods.
nel as needed. The monitoring and provisioning of global              • Exploding: the ratio U increased by at least E between
bandwidth is complicated by two highly varying parameters                Te (e.g., Te = 2) reporting periods.
over time, client population size and peak streaming rate, and        • Increasing: the ratio U has steadily increased by I (I ≪
one varying parameter over space, available uplink bandwidth,            E) over the past Ti reporting periods.
which is network-service provider dependent. Forecasting of           • Falling: the ratio U has decreased by F over the past Tf
bandwidth requirement is a vast subject in itself. Zattoo                reporting periods.
adopted a very simple mechanism, described in Section III-B,          Orthogonal to the capacity-trend based classification above,
which has performed adequately in provisioning the network         each channel is further categorized in terms of its capacity
for both daily demand fluctuations and flash crowd scenarios         utilization ratio as follows.
(see Section IV-C).                                                   • Under-utilized: the ratio U is below the low threshold
   When a bandwidth shortage is projected for a channel, the             L, e.g., U ≤ 0.5.
Subsidy System assigns one or more Repeater nodes to the              • Near Capacity: the ratio U is almost 1.0, e.g., U ≥ 0.9.
channel. Repeater nodes function as bandwidth multipliers, to
                                                                      If the capacity trend of a channel has been “Exploding,” one
amplify the amount of available bandwidth in the network.
                                                                   or more Repeater nodes will be assigned to it immediately. If
Each Repeater node serves at most one channel at a time;
                                                                   the channel’s capacity trend has been “Increasing,” it will be
it joins and leaves a given channel at the behest of the
                                                                   assigned a Repeater node with a smaller capacity. The goal
Subsidy System. Repeater nodes receive and serve all n
                                                                   of the Subsidy System is to keep a channel below “Near
sub-streams of the channel they join, run the same PDM
                                                                   Capacity.” If a channel’s capacity trend is “Stable” and the
protocol, and are treated by actual peers like any other peers
                                                                   channel is “Under-utilized,” the Subsidy System attempts to
on the network; however, as bandwidth amplifiers, they are
                                                                   free Repeater nodes (if any) assigned to the channel. If a
usually provisioned to contribute more uplink bandwidth than
                                                                   channel’s capacity utilization is “Falling,” the Subsidy System
the download bandwidth they consume. The use of Repeater
                                                                   waits for the utilization to stabilize before reassigning any
nodes makes the Zattoo network a hybrid P2P and content
                                                                   Repeater nodes.
distribution network.
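The capacity-trend classification and the resulting Repeater decisions can be sketched as below. The paper states the categories informally, so the precise tests used here are assumptions: "Exploding" and "Falling" compare the latest ratio U with the one Te or Tf reports earlier, "Increasing" requires a non-decreasing run over Ti periods, and "Stable" checks an absolute band. The default parameter values and the action labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrendParams:
    # All default values are illustrative assumptions.
    s_lo: float = 0.25   # S-: lower bound of the stable band
    s_hi: float = 0.5    # S+: upper bound of the stable band
    t_s: int = 3         # Ts
    e: float = 0.25      # E
    t_e: int = 2         # Te
    i: float = 0.125     # I
    t_i: int = 3         # Ti
    f: float = 0.125     # F
    t_f: int = 2         # Tf


def classify_trend(history, p):
    """Classify the capacity trend from a list of U reports, oldest first."""
    u = history
    if len(u) > p.t_e and u[-1] - u[-1 - p.t_e] >= p.e:
        return "Exploding"
    if len(u) > p.t_i:
        window = u[-1 - p.t_i:]
        steady = all(a <= b for a, b in zip(window, window[1:]))
        if steady and window[-1] - window[0] >= p.i:
            return "Increasing"
    if len(u) > p.t_f and u[-1 - p.t_f] - u[-1] >= p.f:
        return "Falling"
    if len(u) >= p.t_s and all(p.s_lo <= x <= p.s_hi for x in u[-p.t_s:]):
        return "Stable"
    return None


def classify_utilization(u, low=0.5, near=0.9):
    """Classify U using the example thresholds given in the text."""
    if u <= low:
        return "Under-utilized"
    if u >= near:
        return "Near Capacity"
    return None


def repeater_action(trend, utilization):
    """Map a (trend, utilization) pair to a Subsidy System decision."""
    if trend == "Exploding":
        return "assign_repeaters_now"
    if trend == "Increasing":
        return "assign_smaller_repeater"
    if trend == "Falling":
        return "wait"
    if trend == "Stable" and utilization == "Under-utilized":
        return "free_repeaters"
    return "no_change"
```

Checking "Exploding" before "Increasing" is a design choice of this sketch, so that a sharp rise is never handled by the slower path.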
                                                                      Each Repeater node periodically sends a keep-alive message
   We next describe the bandwidth monitoring subsystem of
                                                                   to the Subsidy System. The keep-alive message tells the Subsidy
the Subsidy System, followed by the design of the simple bandwidth
                                                                   System which channel the Repeater node is serving, plus
projection and Repeater node assignment subsystem.
                                                                   its CPU and capacity utilization ratio. This allows the Subsidy
                                                                   System to monitor the health of Repeater nodes and to increase
A. Global Bandwidth Measurement                                    the stability of the overlay during Repeater reassignment.
                                                                   When reassigning Repeater nodes from a channel, the Subsidy
  The capacity metric of a channel is the tuple (D, C),
                                                                   System will start from the Repeater node with the lowest
where D is the aggregate download rate required by all
                                                                   utilization ratio. It will notify the selected Repeater node to
users on the channel, and C is the aggregate upload capacity
                                                                   stop accepting new peers and then to leave the channel after
of those users. Usually C < D and the difference between
                                                                   a specified time.
the two is the global bandwidth deficit of the channel. Since
                                                                      In addition to Repeater nodes, the Subsidy System may
channel viewership changes over time as users join or leave
                                                                   recognize extra capacity from idle peers whose owners are
the channel, we need a scalable means to measure and update
                                                                   not actively watching a channel. However, our previous study
the capacity metric. We rely on aggregating the capacity
                                                                   shows that a large number of idle peers are required to make
metric up the peer division multiplexing tree. Each peer in the
                                                                   any discernible impact on the global bandwidth deficit of a
overlay periodically aggregates the capacity metric reported
                                                                   channel [5]. Our current Subsidy System therefore does not
by all its downstream receiver peers, adds its own capacity
                                                                   solicit bandwidth contribution from idle peers.
measure (D, C) to the aggregate, and forwards the resulting
capacity metric upstream to its forwarding peers. By the time
the capacity metric percolates up to the Encoding Server, it                   IV. SERVER-SIDE MEASUREMENTS
contains the total download and upload rate aggregates of the         In the Zattoo system, two separate centralized collector
whole streaming overlay. The Encoding Server then simply           servers collect usage statistics and error reports, which we
forwards the obtained (D, C) to the Subsidy Server.                call the “stats” server and the “user-feedback” server re-
                                                                   spectively. The “stats” server periodically collects aggregated
                                                                   player statistics from individual peers, from which full session
B. Global Bandwidth Projection and Provisioning                    logs are constructed and entered into a session database.
   For each channel, the Subsidy Server keeps a history of the     The session database gives a complete picture of all past
capacity metric (D, C) reports received from the channel’s         and present sessions served by the Zattoo system. A given
Encoding Server. The channel utilization ratio (U ) is the ratio   database entry contains statistics about a particular session,
D over C. Based on recent movements of the ratio U , we            which includes join time, leave time, uplink bytes, download
classify the capacity trend of each channel into the following     bytes, and channel name associated with the session. We
four categories.                                                   study the sessions generated on three major TV channels from
                            TABLE I
             SESSION DATABASE (6/1/2008–6/30/2008).
               Channel   # sessions   # distinct users
                ARD      2,102,638        298,601
               Cuatro    1,845,843        268,522
                SF2      1,425,285        157,639

                           TABLE II
              FEEDBACK LOGS (6/20/2008–6/29/2008).
               Channel   # feedback logs   # sessions
                ARD            871           1,253
               Cuatro         2,922          4,568
                SF2            656           1,140

                           TABLE III
                    AVERAGE SHARING RATIO.
               Channel   Off-peak sharing ratio   Peak sharing ratio
                ARD             0.335                   0.313
               Cuatro           0.242                   0.224
                SF2             0.277                   0.222

                                                                     sharing ratio is defined as total users’ uplink rate divided
                                                                     by their download rate on the channel. A sharing ratio of
                                                                     one means users contribute to other peers in the network as
                                                                     much traffic as they download at the time. We calculate the
                                                                     average sharing ratio from the total download/uplink bytes
                                                                     of the collected sessions. We first obtain all the sessions
three different countries (Germany, Spain, and Switzerland),         which are active across time i. We call a set of such
from June 1st to June 30th, 2008. Throughout the paper, we           sessions Si . Then assuming uplink/download bytes of each
label those channels from Germany, Spain, and Switzerland as         session are spread uniformly throughout the entire session
ARD, Cuatro, and SF2, respectively. Euro 2008 games were             duration, we approximate the average sharing ratio at time
held during this period, and those three channels broadcast a        i as
majority of the Euro 2008 games including the final match.
                                                                     $$\frac{\sum_{i \in S_i} \text{uplink bytes}(i)/\text{duration}(i)}{\sum_{i \in S_i} \text{download bytes}(i)/\text{duration}(i)}.$$
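The average sharing ratio computation can be sketched directly from the session records. This is a minimal illustration under stated assumptions: the dictionary field names (`join`, `leave`, `uplink_bytes`, `download_bytes`) are hypothetical stand-ins for the session database columns, and a session is treated as active at time t when join ≤ t < leave.

```python
def concurrent_sessions(sessions, t):
    """Return the sessions active at time t (the set S_i in the text)."""
    return [s for s in sessions if s["join"] <= t < s["leave"]]


def average_sharing_ratio(sessions, t):
    """Approximate the sharing ratio at time t.

    Each session's bytes are assumed to be spread uniformly over its
    duration, so its rate is bytes / duration.
    """
    active = concurrent_sessions(sessions, t)
    uplink_rate = sum(
        s["uplink_bytes"] / (s["leave"] - s["join"]) for s in active
    )
    download_rate = sum(
        s["download_bytes"] / (s["leave"] - s["join"]) for s in active
    )
    if download_rate == 0:
        return None  # no active downloads at time t
    return uplink_rate / download_rate


# Example: two overlapping sessions (times in seconds, sizes in bytes).
example_sessions = [
    {"join": 0, "leave": 100, "uplink_bytes": 1000, "download_bytes": 4000},
    {"join": 50, "leave": 150, "uplink_bytes": 3000, "download_bytes": 2000},
]
```

The overlay size at time t is then simply the length of the list returned by `concurrent_sessions`.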

See Table I for information about the collected session data            Fig. 5 shows the overlay size (i.e., number of concurrent
sets.                                                                users) and average sharing ratio super-imposed across the
   The “user-feedback” server, on the other hand, collects           month of June, 2008. According to the figure, the overlay
users’ error logs submitted asynchronously by users. The “user       size grew to more than 20,000 (e.g., 20,059 on ARD on 6/18
feedback” data here is different from peer’s quality feedback        and 22,152 on Cuatro on 6/9). As opposed to the overlay
used in PDM reconfiguration described in Section II-C. Zat-           size, the average sharing ratio tends to stay flatter throughout
too player maintains an encrypted log file which contains,            the month. Occasional spikes in the sharing ratio all occurred
for debugging purposes, detailed behavior of client-side P2P         during 2AM to 6AM (GMT) when the channel usage is very
engine, as well as history of all the streaming sessions initiated   low, and therefore may be considered statistically insignificant.
by a user since the player startup. When users encounter                By segmenting a 24-hour day into two time periods, e.g.,
any error while using the player, such as log-in error, join         off-peak hours (0AM-6PM) and peak hours (6PM-0AM),
failure, bad quality streaming etc., they can choose to report       Table III shows the average sharing ratio in the two time
the error by clicking a “Submit Feedback” button on the              periods separately. Zattoo’s usage during peak hours typically
player, which causes the Zattoo player to send the generated         accounts for about 50% to 70% of the total usage of the
log file to the user-feedback server. Since a given feedback          day. According to the table, the average sharing ratio during
log not only reports on a particular error, but also describes       peak hours is slightly lower than, but not very different
“normal” sessions generated prior to the occurrence of the           from, that during off-peak hours. The Cuatro channel in Spain exhibits
error, we can study user’s viewing experience (e.g., channel         a relatively lower sharing ratio than the two other channels. One
join delay) from the feedback logs. Table II describes the           bandwidth test site [6] reports that average uplink bandwidth in
feedback logs collected from June 20th to June 29th. A given         Spain is about 205 kbps, which is much lower than in Germany
feedback log can contain multiple sessions (for the same or          (582 kbps) and Switzerland (787 kbps). The lower sharing
different channels), depending on user’s viewing behavior.           ratio on the Spanish channel may reflect regional difference
The second column in the table represents the number of              in residential access network provisioning. The balance of the
feedback logs which contain at least one session generated           required bandwidth is provided by Zattoo’s Encoding Server
on the channel listed in the corresponding entry in the first         and Repeater nodes.
column. The numbers in the third column indicate how many
distinct sessions generated on said channel are present in the
                                                                     B. Channel Switching Delay
feedback logs.
                                                                        When a user clicks on a new channel button, it takes some
                                                                     time (a.k.a. channel switching delay) for the user to be
A. Overlay Size and Sharing Ratio                                    able to start watching streamed video on Zattoo player. The
   We first study how many concurrent users are supported by          channel switching delay has two components. First, Zattoo
the Zattoo system, and how much bandwidth is contributed by          player needs to contact other available peers and retrieve all
them. For this purpose, we use the session database presented        required sub-streams from them. We call the delay incurred
in Table I. By using the join/leave timestamps of the collected      during this stage “join delay.” Once all necessary sub-streams
sessions, we calculate the number of concurrent users on a           have been negotiated successfully, the player then needs to
given channel at time i. Then we calculate the average sharing       wait and buffer the minimum amount of streams (e.g., 3
ratio of the given channel at the same time. The average             seconds) before starting to show the video to the user. We call
 Fig. 5. Overlay size and sharing ratio. Panels: (a) ARD, (b) Cuatro, (c) SF2. [Plots omitted: x-axis, day in 2008/6 (0 to 30); left y-axis, overlay size (0 to 25000); right y-axis, sharing ratio (0 to 2).]


 Fig. 6. CDF of user arrival time. Panels: (a) ARD, (b) Cuatro, (c) SF2. [Plots omitted: x-axis, arrival hour (0 to 24); y-axis, CDF (0 to 1); curves for feedback logs and session database.]

the resulting wait time “buffering delay.” The total channel switching
delay experienced by users is thus the sum of join delay and buffering
delay. PPLive reports channel switching delay of around 20 to 30 seconds,
which can be as high as 2 minutes, of which join delay typically accounts
for 10 to 15 seconds [7].

We measure the join delay experienced by Zattoo users from the feedback
logs described in Table II. Debugging information contained in the
feedback logs tells us when a user clicked on a particular channel, and
when the player successfully joined the P2P overlay and started to buffer
content. One concern in relying on user-submitted feedback logs to infer
join delay is the potential sampling bias associated with them. Users
typically submit feedback logs when they encounter some kind of error,
which raises the question of whether the captured sessions are
representative samples to study. We attempt to address this concern by
comparing the data from feedback logs against those from the session
database. The latter captures the complete picture of users’ channel
watching behavior, and can therefore serve as a reference. In our
analysis, we compare the user arrival time distributions obtained from the
two data sets. For fair comparison, we used a subset of the session
database which was generated during the same period when the feedback logs
were collected (i.e., from June 20th to 29th).

Fig. 6 plots the CDF distribution of user arrivals per hour obtained from
feedback logs and session database separately. The steep slope of the
distributions during hours 18-20 (6-8PM) indicates the high frequency of
user arrivals during those hours. On ARD and Cuatro, the user arrival
distributions inferred from feedback logs are almost identical to those
from the session database. On the other hand, on SF2, the distribution
obtained from feedback logs tends to grow slowly during early hours, which
indicates that the feedback submission rate during off-peak hours on SF2
was lower than normal. Later during peak hours, however, the feedback
submission rate picks up as expected, closely matching the actual user
arrival rate. Based on this observation, we argue that feedback logs can
serve as representative samples of daily user activities.

Fig. 7 shows the CDF distributions of channel join delay for ARD, Cuatro
and SF2 channels. We show the distributions for off-peak hours (0AM-6PM)
and peak hours (6PM-0AM) separately. Median channel join delay is also
presented in a similar fashion in Table IV. According to the CDF
distributions, 80% of users experience less than 4 to 8 seconds of join
delay, and 50% of users even less than 2 seconds of join delay. Also,
Table IV shows that even a 10-fold increase in the number of concurrent
users during peak hours does not unduly lengthen the channel join delay
(up to 22% increase in median join delay).

                              TABLE IV
                    MEDIAN CHANNEL JOIN DELAY.

             Median join delay         Maximum overlay size
 Channel     Off-peak    Peak          Off-peak    Peak
 ARD         2.29 sec    1.96 sec      2,313       19,223
 Cuatro      3.67 sec    4.48 sec      2,357       8,073
 SF2         2.49 sec    2.67 sec      1,126       11,360

C. Repeater Node Assignment

As illustrated in Fig. 5, the live coverage of Euro Cup games brought huge
flash crowds to the Zattoo system. With typical users contributing only
about 25–30% of the average streaming bitrate, Zattoo must subsidize the
balance. As described in Section III, Zattoo’s Subsidy System assigns
Repeater




Fig. 7. CDF of channel join delay. Panels: (a) ARD, (b) Cuatro, (c) SF2; x-axis: channel join delay (sec), 0 to 30; y-axis: CDF; curves: off-peak hours and peak hours.
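The peak versus off-peak comparison summarized in Fig. 7 and Table IV can be sketched in a few lines. This is only an illustration: the function names and the `(clicked, joined)` timestamp pairs are hypothetical stand-ins for what the feedback logs record, not Zattoo's actual log format. The medians are copied from Table IV, and the peak window is the 6PM-0AM split used in the text.

```python
from statistics import median

PEAK_START_HOUR = 18  # text splits the day into off-peak (0AM-6PM) and peak (6PM-0AM)


def is_peak(hour):
    """True if an hour of day (0-23) falls in the 6PM-0AM peak window."""
    return PEAK_START_HOUR <= hour < 24


def median_join_delay(sessions):
    """Median join delay in seconds.

    `sessions` is a hypothetical list of (clicked, joined) timestamps in
    seconds: when the user clicked a channel and when the player joined
    the overlay and began buffering.
    """
    return median(joined - clicked for clicked, joined in sessions)


def relative_increase(off_peak, peak):
    """Fractional change of the peak median relative to the off-peak median."""
    return (peak - off_peak) / off_peak


# Median join delays in seconds (off-peak, peak), copied from Table IV.
TABLE_IV_MEDIANS = {
    "ARD": (2.29, 1.96),
    "Cuatro": (3.67, 4.48),
    "SF2": (2.49, 2.67),
}

INCREASE = {ch: relative_increase(*m) for ch, m in TABLE_IV_MEDIANS.items()}
WORST_CHANNEL = max(INCREASE, key=INCREASE.get)
```

With the Table IV values, the largest relative increase is on Cuatro, at roughly 22%, while the ARD median is actually lower during peak hours.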


Fig. 8. Overlay size and channel provisioning. Panels: (a) ARD, (b) Cuatro, (c) SF2; x-axis: hour of day; left y-axis: overlay size (0 to 25000); right y-axis: number of Repeaters assigned (0 to 200).


nodes to channels that require more bandwidth than their own globally
available aggregate upload bandwidth.

Fig. 8 shows the Subsidy System assigning more Repeater nodes to a channel
as flash crowds arrived during each Euro Cup game and then gradually
reclaiming them as the flash crowd departed. For each channel reported, we
choose the date with the biggest flash crowd on that channel. The dip in
overlay sizes on ARD and Cuatro channels occurred during the half-time
break of the games. The Subsidy Server was less aggressive in assigning
Repeater nodes to the ARD channel, as compared to the other two channels,
because the Repeater nodes in the vicinity of the ARD server have higher
capacity than those near the other two.

                    V. CLIENT-SIDE MEASUREMENTS

To further study the P2P overlay beyond details obtainable from aggregated
session-level statistics, we run several modified Zattoo clients which
periodically retrieve the internal states of other participating peers in
the network by exchanging SEARCH/JOIN messages with them. After a given
probe session is over, the monitoring client archives a log file from
which we can analyze the control/data traffic exchanged and detailed
protocol behavior. We ran the experiment during Zattoo’s live coverage of
Euro 2008 (June 7th to 29th). The monitoring clients tuned to game
channels from one of Zattoo’s data centers located in Switzerland while
the games were broadcast live. The data sets presented in this paper were
collected during the coverage of the championship final on two separate
channels: ARD in Germany and Cuatro in Spain. Soccer teams from Germany
and Spain participated in the championship final.

As described in Section II-A, Zattoo’s peer discovery is guided by peers’
topology information. To minimize potential sampling bias caused by our
use of a single vantage point for monitoring, we assigned an “empty” AS
number and country code to our monitoring clients, so that their probing
is not geared towards peers located in the same AS and country.

A. Sub-Stream Synchrony

To ensure good viewing quality, a peer should not only obtain all
necessary sub-streams (discounting redundant sub-streams), but also have
those sub-streams delivered temporally synchronized with each other for
proper online decoding. Receiving out-of-sync sub-streams typically
results in a pixelated screen on the player. As described in Sections 1
and II-C, Zattoo’s protocol favors sub-streams that are relatively in-sync
when constructing the PDM, and continually monitors the sub-streams’
progression over time, replacing those sub-streams that have fallen behind
and reconfiguring the PDM when necessary. In this section we measure the
effectiveness of Zattoo’s Adaptive PDM in selecting sub-streams that are
largely in-sync.

To quantify such inter-sub-stream synchrony, we measure the difference in
the latest (i.e., maximum) packet sequence numbers belonging to different
incoming sub-streams. When a remote peer responds to a SEARCH query
message, it includes in its SEARCH reply the latest sequence numbers that
it has received for all sub-streams. If some sub-streams happen to be
lossy or stalled at that time, the peer marks such sub-streams in its
SEARCH replies. Thus, we can inspect SEARCH replies from existing peers to
study their inter-sub-stream synchrony.
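The measurement just described can be sketched as follows. The reply layout used here (a dict with a `latest_seq` list and a `marked` set) is a hypothetical stand-in chosen for illustration; the actual SEARCH reply format is not given in the text. The spread is taken as the largest gap between the latest sequence numbers of the sub-streams the peer has not marked as lossy or stalled.

```python
def count_bad(reply):
    """Number of sub-streams the peer marked as lossy or stalled."""
    return len(reply["marked"])


def substream_synchrony(reply):
    """Largest gap, in packets, between the latest sequence numbers of
    the unmarked sub-streams in one (hypothetical) SEARCH reply."""
    good = [
        seq
        for idx, seq in enumerate(reply["latest_seq"])
        if idx not in reply["marked"]
    ]
    if len(good) < 2:
        return 0  # nothing to compare against
    return max(good) - min(good)


# Illustrative reply with four sub-streams; sub-stream 3 is marked as bad.
EXAMPLE_REPLY = {"latest_seq": [1000, 1100, 1040, 700], "marked": {3}}
```

For `EXAMPLE_REPLY`, the stalled sub-stream is excluded, so the synchrony is the 100-packet gap between sub-streams 1 and 0 rather than the 400-packet gap to sub-stream 3.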




Fig. 9. Sub-stream synchrony. Panels: (a) CDF for number of bad sub-streams (x-axis: number of bad sub-streams, 0 to 16); (b) CDF for sub-stream synchrony (x-axis: sub-stream synchrony in # of packets, 0 to 1000); y-axis: CDF; curves: Euro 2008 final on ARD and Euro 2008 final on Cuatro.

   In our experiment, we collected SEARCH replies from                          1
                                                                                     Euro 2008 final on Cuatro
4,420 and 6,530 distinct peers from ARD and Cuatro channels                   0.9     Euro 2008 final on ARD
respectively, during the 2-hour coverage of the final game.                    0.8
From the collected SEARCH replies, we check how many                          0.7
sub-streams (out of 16) are “bad” (e.g., lossy or missing) for                0.6
each peer. Fig. 9(a) shows the CDF distribution of the number
                                                                       CDF
                                                                              0.5
of bad sub-streams. According to the figure, about 99% (ARD)
                                                                              0.4
and 96% (Cuatro) of peers have 3 or less bad sub-streams.
                                                                              0.3
                                                                              0.2
  Current Zattoo deployment dedicates k = 3 sub-streams                       0.1
(out of n = 16) for loss recovery purposes. That is, given a                    0
segment of 16 consecutive packets, if peer has received at least                -350 -300 -250 -200 -150 -100 -50            0    50   100
13 packets, it can reconstruct the remaining 3 packets from                                Relative Peer Synchrony (# of Packets)
the RS error correcting code (see Section 1). Thus if peer can     Fig. 10.   Peer synchrony.
receive any 13 sub-streams out of 16 reliably, it can decode
the full stream properly. The result in Fig. 9 (a) suggests that   B. Peer Synchrony
the number of bad sub-streams is low enough as to not cause           While sub-stream synchrony tells us stream quality different
quality issues in the Zattoo network.                              peers may experience, “peer synchrony” tells us how varied in
                                                                   time peers’ viewing points are. With small scale P2P networks,
   After discounting “bad” sub-streams, we then look at the        all participating peers are likely to watch live streaming
synchrony of the remaining “good” sub-streams in each peer.        roughly synchronized in time. However, as the size of the
Fig. 9(b) shows the CDF distribution of the sub-stream syn-        P2P overlay grows, the viewing point of edge nodes may be
chrony in the two channels. Sub-stream synchrony of a given        delayed significantly compared to those closer to the Encoding
peer is defined as the difference between maximum and minimum
packet sequence numbers among all sub-streams, which is obtained
from the peer's SEARCH reply. For example, if some peer has
sub-stream synchrony measured at 100, it means that the peer has
one sub-stream that is ahead of another sub-stream by 100 packets.
If all the packets are received in order, the sub-stream synchrony
of a peer measures at most n − 1. If we received multiple SEARCH
replies from the same peer, we average the sub-stream synchrony
across all the replies. Given the 500 kbps average channel data
rate, 60 consecutive packets roughly correspond to 1 second's worth
of streaming. Thus, the figure shows that on the Cuatro channel, 20%
of peers have their sub-streams completely in sync, while more than
90% have their sub-streams lagging each other by at most 5 seconds;
on the ARD channel, 30% are in sync, and more than 90% are within
1.5 seconds. The buffer space of the Zattoo player has been
dimensioned sufficiently to accommodate such a low degree of
out-of-sync sub-streams.

Server. In the experiment, we define the viewing point of a peer as
the median of the latest sequence numbers across its sub-streams.
Then we choose one peer (e.g., a Repeater node directly connected to
the Encoding Server) as a reference point, and compare other peers'
viewing points against the reference viewing point.
   Fig. 10 shows the CDFs of relative peer synchrony. The relative
peer synchrony of peer X is obtained by subtracting the reference
viewing point from the viewing point of X. So peer synchrony at -60
means that a given peer's viewing point is delayed by 60 packets
(roughly 1 second for a 500 kbps stream) compared to the reference
viewing point. A positive peer synchrony means that a given peer's
stream gets ahead of the reference point, which could happen for
peers that receive streams directly from the Encoding Server. The
figure shows that about 1% of peers on ARD and 4% of peers on Cuatro
experienced more than 3 seconds (i.e., 180 packets) of delay in
streaming compared to the reference viewing point.
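The two synchrony measures defined above can be illustrated with a short sketch. It is not the code used in the measurement study; it assumes only that each SEARCH reply yields a list of the latest sequence numbers, one per sub-stream. All function names and the constant PACKETS_PER_SECOND are hypothetical, the latter taken from the 60 packets per second figure quoted in the text for a 500 kbps stream.

```python
from statistics import median

# Assumption from the text: about 60 packets per second at 500 kbps.
PACKETS_PER_SECOND = 60


def substream_synchrony(latest_seq):
    """Difference between the maximum and minimum latest sequence
    numbers across the sub-streams reported in one SEARCH reply."""
    return max(latest_seq) - min(latest_seq)


def mean_substream_synchrony(replies):
    """Average the sub-stream synchrony over several SEARCH replies
    received from the same peer."""
    return sum(substream_synchrony(r) for r in replies) / len(replies)


def viewing_point(latest_seq):
    """Median of the latest sequence numbers across sub-streams."""
    return median(latest_seq)


def relative_peer_synchrony(peer_seq, reference_seq):
    """Viewing point of the peer minus the reference viewing point.
    Negative values mean the peer lags the reference peer."""
    return viewing_point(peer_seq) - viewing_point(reference_seq)


def packets_to_seconds(packets):
    """Convert a packet count to approximate seconds of streaming."""
    return packets / PACKETS_PER_SECOND
```

With these definitions, a peer whose sub-streams report latest sequence numbers 1000, 1100, and 1050 has a sub-stream synchrony of 100, and a peer whose viewing point is 60 packets behind the reference has a relative peer synchrony of -60, or roughly 1 second of lag.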
To achieve the best quality recording to the beginning of a sub-buffer of the IOB, all the packet possible, a recording peer always waits for retransmission of slots of the requested sub-streams are marked NEEDed and lost packets that cannot be recovered by error correction. all the slots of the sub-streams not requested by the peer are In addition to achieving lossless recording, we use re- marked SKIP. When a NEEDed packet arrives and is stored transmission to let a peer recover from transient network in the IOB, its state in the packet map is changed to READY. congestion. A peer sends out a retransmission request when As the peer pointer moves along its associated packet map, the distance between the repair pointer and the input pointer READY packets are forwarded to the peer and their states has reached a threshold of R packet slots, usually spanning changed to SENT. A slot marked NEEDed but not READY, multiple segments. A retransmission request consists of an R- such as slot n + 4 in Fig. 3, indicates that the packet is lost bit packet mask, with each bit representing a packet, and the or will arrive out-of-order and is bypassed. When an out-of- sequence number of the packet corresponding to the first bit. order packet arrives, its slot is changed to READY and the Marked bits in the packet mask indicate that the corresponding peer pointer is reset to point to this slot. Once the out-of-order packets need to be retransmitted. When a packet loss is packet has been sent to the peer, the peer pointer will move detected, it could be caused by congestion on the virtual forward, bypassing all SKIP, NEED, and SENT slots until it circuits forming the current PDM or congestion on the path reaches the next READY slot, where it can resume sending. beyond the neighboring peers. In either case, current neighbor
peers will not be good sources of retransmitted packets. Hence we send our retransmission requests to r random peers that are not neighbor peers. A peer receiving a retransmission request will honor the request only if the requested packets are still in its IOB and it has sufficient left-over capacity, after serving its current peers, to transmit all the requested packets. Once a retransmission request is accepted, the peer will retransmit all the requested packets to completion.

C. Adaptive PDM

While we rely on packet retransmission to recover from transient congestion, we have two channel capacity adjustment mechanisms to handle longer-term bandwidth fluctuations. The first mechanism allows a forwarding peer to adapt the number of sub-streams it will forward given its current available bandwidth, while the second allows the receiving peer to switch provider at the sub-stream level.

Peers on the Zattoo network can redistribute a highly variable number of sub-streams, reflecting the high variability in uplink bandwidth of different access network technologies. For a full stream consisting of sixteen constant-bit-rate sub-streams, our prior study shows that, based on realistic peer characteristics measured from the Zattoo network, half of the peers can support less than half of a stream, 82% of peers can support less than a full stream, and the remainder can support up to ten full streams (peers that can redistribute more than a full stream are conventionally known as supernodes in the literature) [5]. With variable-bit-rate streams, the bandwidth carried by each sub-stream is also variable. To increase peer bandwidth usage without undue degradation of service, we instituted measurement-based admission control at each peer. In addition to controlling resource commitment, another goal of the measurement-based admission control module is to continually estimate the amount of available uplink bandwidth at a peer.

The amount of available uplink bandwidth at a peer is initially estimated by the peer sending a pair of probe packets to Zattoo’s Bandwidth Estimation Server. Once a peer starts forwarding sub-streams to other peers, it will receive from those peers quality-of-service feedback that informs its update of the available uplink bandwidth estimate. A peer sends quality-of-service feedback only if the quality of a sub-stream drops below a certain threshold.² Upon receiving quality feedback from multiple peers, a peer first determines whether the identified sub-streams are themselves arriving at the peer in low quality. If so, the low quality of service may not be caused by a limit on its own available uplink bandwidth, in which case it ignores the low quality feedback. Otherwise, the peer decrements its estimate of available uplink bandwidth. If the new estimate is below the bandwidth needed to support the existing number of virtual circuits, the peer closes a virtual circuit. To reduce the instability introduced into the network, a peer first closes the virtual circuit carrying the smallest number of sub-streams.

A peer periodically attempts to increase its estimate of available uplink bandwidth if it has fully utilized its current estimate without triggering any bad quality feedback from neighboring peers. A peer doubles the estimated available uplink bandwidth if the current estimate is below a threshold, switching to linear increase above the threshold, similar to how TCP maintains its congestion window size. A peer also increases its estimate of available uplink bandwidth if a neighbor peer departs the network without any bad quality feedback.

When the repair pointer lags behind the input pointer by R packet slots, in addition to initiating a retransmission request, a peer also computes a loss rate over the R packets. If the loss rate is above a threshold, the peer considers the neighbor slow and attempts to reconfigure its PDM. In reconfiguring its PDM, a peer attempts to shift half of the sub-streams currently forwarded by the slow neighbor to other existing neighbors. At the same time, it searches for new peer(s) to forward these sub-streams. If new peer(s) are found, the load will be shifted from existing neighbors to the new peer(s). If sub-streams from the slow neighbor continue to suffer after the reconfiguration of the PDM, the peer will drop the neighbor completely and initiate another reconfiguration of the PDM. When a peer loses a neighbor due to reduced available uplink bandwidth at the neighbor or due to neighbor departure, it also initiates a PDM reconfiguration. A peer may also initiate a PDM reconfiguration to switch to a topologically closer peer. Similar to the PDM establishment process, PDM reconfiguration is accomplished by peers exchanging sub-stream bitmasks in a request/response handshake, with each bit of the bitmask representing a sub-stream. During and after a PDM reconfiguration, slow neighbor detection is disabled for a short period of time to allow the system to stabilize.

III. GLOBAL BANDWIDTH SUBSIDY SYSTEM

Each peer on the Zattoo network is assumed to serve a user through a media player, which means that each peer must receive, and can potentially forward, all n sub-streams of the TV channel the user is watching. The limited redistribution capacity of peers on the Zattoo network means that a typical client can contribute only a fraction of the sub-streams that make up a channel. This shortage of bandwidth leads to a global bandwidth deficit in the peer-to-peer network. Whereas BitTorrent-like delay-tolerant file downloads or the delay-sensitive progressive download of video-on-demand applications can mitigate such global bandwidth shortage by increasing download time, a live streaming system such as Zattoo’s must subsidize the bandwidth shortfall to provide a real-time delivery guarantee.
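Returning to the uplink bandwidth estimation of Section II-C, the update rules described there can be sketched as follows. This is only an illustration under stated assumptions, not Zattoo's implementation: the class name `UplinkEstimator`, the threshold, the linear step, the decrement size, and the per-sub-stream rate of 31.25 kbps (a 500 kbps stream split evenly over 16 sub-streams) are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class UplinkEstimator:
    """Hypothetical sketch of the Section II-C uplink estimate updates."""
    estimate_kbps: float
    threshold_kbps: float = 500.0     # assumed switch point from doubling to linear
    linear_step_kbps: float = 50.0    # assumed linear increment
    decrement_kbps: float = 100.0     # assumed decrement on bad feedback
    substream_kbps: float = 31.25     # assumed rate of one sub-stream
    # virtual circuits: neighbor name -> number of sub-streams forwarded to it
    circuits: Dict[str, int] = field(default_factory=dict)

    def committed_kbps(self) -> float:
        """Bandwidth needed to support all existing virtual circuits."""
        return sum(self.circuits.values()) * self.substream_kbps

    def on_period(self, fully_utilized: bool, bad_feedback: bool) -> None:
        """Periodic increase: only if the estimate was fully used without complaints."""
        if not fully_utilized or bad_feedback:
            return
        if self.estimate_kbps < self.threshold_kbps:
            self.estimate_kbps *= 2                      # doubling below the threshold
        else:
            self.estimate_kbps += self.linear_step_kbps  # linear increase above it

    def on_bad_feedback(self, own_input_degraded: bool) -> Optional[str]:
        """Handle bad quality feedback; return the closed circuit's neighbor, if any."""
        if own_input_degraded:
            return None  # the problem is upstream of this peer: ignore the feedback
        self.estimate_kbps = max(0.0, self.estimate_kbps - self.decrement_kbps)
        if self.circuits and self.estimate_kbps < self.committed_kbps():
            # close the circuit carrying the smallest number of sub-streams first
            victim = min(self.circuits, key=self.circuits.get)
            del self.circuits[victim]
            return victim
        return None
```

For example, an estimator starting at 200 kbps with the defaults above doubles to 400 and then 800 kbps, and grows by 50 kbps per period after that. The sketch closes at most one virtual circuit per feedback event, which is one reading of the text; a real implementation might need to close several.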
² Depending on a peer’s NAT and/or firewall configuration, Zattoo uses either UDP or TCP as the underlying transport protocol. The quality of a sub-stream is measured differently for UDP and TCP. A packet is considered lost under UDP if it doesn’t arrive within a fixed threshold. The quality measure for UDP is computed as a function of both the packet loss rate and the burst error rate (number of contiguous packet losses). The quality measure for TCP is defined to be how far behind a peer is, relative to other peers, in serving its sub-streams.

Zattoo’s Global Bandwidth Subsidy System (or simply, the Subsidy System) consists of a global bandwidth monitoring subsystem, a global bandwidth forecasting and provisioning subsystem, and a pool of Repeater nodes. The monitoring subsystem continuously monitors the global bandwidth requirement of a channel. The forecasting and provisioning subsystem projects global bandwidth requirement based on measured history and allocates Repeater nodes to the channel as needed. The monitoring and provisioning of global bandwidth is complicated by two parameters that vary highly over time, client population size and peak streaming rate, and one parameter that varies over space, available uplink bandwidth, which is network-service provider dependent. Forecasting of bandwidth requirement is a vast subject in itself. Zattoo adopted a very simple mechanism, described in Section III-B, which has performed adequately in provisioning the network for both daily demand fluctuations and flash crowd scenarios (see Section IV-C).

When a bandwidth shortage is projected for a channel, the Subsidy System assigns one or more Repeater nodes to the channel. Repeater nodes function as bandwidth multipliers, to amplify the amount of available bandwidth in the network. Each Repeater node serves at most one channel at a time; it joins and leaves a given channel at the behest of the Subsidy System. Repeater nodes receive and serve all n sub-streams of the channel they join, run the same PDM protocol, and are treated by actual peers like any other peers on the network; however, as bandwidth amplifiers, they are usually provisioned to contribute more uplink bandwidth than the download bandwidth they consume. The use of Repeater nodes makes the Zattoo network a hybrid P2P and content distribution network.

We next describe the bandwidth monitoring subsystem of the Subsidy System, followed by the design of the simple bandwidth projection and Repeater node assignment subsystem.

A. Global Bandwidth Measurement

The capacity metric of a channel is the tuple (D, C), where D is the aggregate download rate required by all users on the channel, and C is the aggregate upload capacity of those users. Usually C < D and the difference between the two is the global bandwidth deficit of the channel. Since channel viewership changes over time as users join or leave the channel, we need a scalable means to measure and update the capacity metric. We rely on aggregating the capacity metric up the peer-division multiplexing tree. Each peer in the overlay periodically aggregates the capacity metric reported by all its downstream receiver peers, adds its own capacity measure (D, C) to the aggregate, and forwards the resulting capacity metric upstream to its forwarding peers. By the time the capacity metric percolates up to the Encoding Server, it contains the total download and upload rate aggregates of the whole streaming overlay. The Encoding Server then simply forwards the obtained (D, C) to the Subsidy Server.

B. Global Bandwidth Projection and Provisioning

For each channel, the Subsidy Server keeps a history of the capacity metric (D, C) reports received from the channel’s Encoding Server. The channel utilization ratio (U) is the ratio D over C. Based on recent movements of the ratio U, we classify the capacity trend of each channel into the following four categories.

• Stable: the ratio U has remained within [S-, S+] for the past T_s reporting periods.
• Exploding: the ratio U increased by at least E between T_e (e.g., T_e = 2) reporting periods.
• Increasing: the ratio U has steadily increased by I (I < E) over the past T_i reporting periods.
• Falling: the ratio U has decreased by F over the past T_f reporting periods.

Orthogonal to the capacity-trend based classification above, each channel is further categorized in terms of its capacity utilization ratio as follows.

• Under-utilized: the ratio U is below the low threshold L, e.g., U ≤ 0.5.
• Near Capacity: the ratio U is almost 1.0, e.g., U ≥ 0.9.

If the capacity trend of a channel has been “Exploding,” one or more Repeater nodes will be assigned to it immediately. If the channel’s capacity trend has been “Increasing,” it will be assigned a Repeater node with a smaller capacity. The goal of the Subsidy System is to keep a channel below “Near Capacity.” If a channel’s capacity trend is “Stable” and the channel is “Under-utilized,” the Subsidy System attempts to free Repeater nodes (if any) assigned to the channel. If a channel’s capacity utilization is “Falling,” the Subsidy System waits for the utilization to stabilize before reassigning any Repeater nodes.

Each Repeater node periodically sends a keep-alive message to the Subsidy System. The keep-alive message tells the Subsidy System which channel the Repeater node is serving, plus its CPU and capacity utilization ratio. This allows the Subsidy System to monitor the health of Repeater nodes and to increase the stability of the overlay during Repeater reassignment. When reassigning Repeater nodes from a channel, the Subsidy System will start from the Repeater node with the lowest utilization ratio. It will notify the selected Repeater node to stop accepting new peers and then to leave the channel after a specified time.

In addition to Repeater nodes, the Subsidy System may recognize extra capacity from idle peers whose owners are not actively watching a channel. However, our previous study shows that a large number of idle peers are required to make any discernible impact on the global bandwidth deficit of a channel [5]. Our current Subsidy System therefore does not solicit bandwidth contribution from idle peers.

IV. SERVER-SIDE MEASUREMENTS

In the Zattoo system, two separate centralized collector servers collect usage statistics and error reports, which we call the “stats” server and the “user-feedback” server respectively. The “stats” server periodically collects aggregated player statistics from individual peers, from which full session logs are constructed and entered into a session database. The session database gives a complete picture of all past and present sessions served by the Zattoo system. A given database entry contains statistics about a particular session, which include join time, leave time, uplink bytes, download bytes, and the channel name associated with the session. We study the sessions generated on three major TV channels from
three different countries (Germany, Spain, and Switzerland), from June 1st to June 30th, 2008. Throughout the paper, we label those channels from Germany, Spain, and Switzerland as ARD, Cuatro, and SF2, respectively. Euro 2008 games were held during this period, and those three channels broadcast a majority of the Euro 2008 games including the final match. See Table I for information about the collected session data sets.

TABLE I
SESSION DATABASE (6/1/2008–6/30/2008).
Channel    # sessions    # distinct users
ARD        2,102,638     298,601
Cuatro     1,845,843     268,522
SF2        1,425,285     157,639

The “user-feedback” server, on the other hand, collects error logs submitted asynchronously by users. The “user feedback” data here is different from the peers’ quality feedback used in PDM reconfiguration described in Section II-C. The Zattoo player maintains an encrypted log file which contains, for debugging purposes, the detailed behavior of the client-side P2P engine, as well as the history of all the streaming sessions initiated by a user since the player startup. When users encounter any error while using the player, such as log-in error, join failure, or bad quality streaming, they can choose to report the error by clicking a “Submit Feedback” button on the player, which causes the Zattoo player to send the generated log file to the user-feedback server. Since a given feedback log not only reports on a particular error, but also describes “normal” sessions generated prior to the occurrence of the error, we can study users’ viewing experience (e.g., channel join delay) from the feedback logs. Table II describes the feedback logs collected from June 20th to June 29th. A given feedback log can contain multiple sessions (for the same or different channels), depending on the user’s viewing behavior. The second column in the table represents the number of feedback logs which contain at least one session generated on the channel listed in the corresponding entry in the first column. The numbers in the third column indicate how many distinct sessions generated on said channel are present in the feedback logs.

TABLE II
FEEDBACK LOGS (6/20/2008–6/29/2008).
Channel    # feedback logs    # sessions
ARD        871                1,253
Cuatro     2,922              4,568
SF2        656                1,140

A. Overlay Size and Sharing Ratio

We first study how many concurrent users are supported by the Zattoo system, and how much bandwidth is contributed by them. For this purpose, we use the session database presented in Table I. By using the join/leave timestamps of the collected sessions, we calculate the number of concurrent users on a given channel at time i. Then we calculate the average sharing ratio of the given channel at the same time. The average sharing ratio is defined as the total uplink rate of users divided by their download rate on the channel. A sharing ratio of one means users contribute to other peers in the network as much traffic as they download at the time. We calculate the average sharing ratio from the total download/uplink bytes of the collected sessions. We first obtain all the sessions which are active at time i. We call the set of such sessions S_i. Then, assuming the uplink/download bytes of each session are spread uniformly throughout the entire session duration, we approximate the average sharing ratio at time i as

$$\frac{\sum_{s \in S_i} \text{uplink bytes}(s) / \text{duration}(s)}{\sum_{s \in S_i} \text{download bytes}(s) / \text{duration}(s)}.$$

Fig. 5 shows the overlay size (i.e., number of concurrent users) and average sharing ratio superimposed across the month of June 2008. According to the figure, the overlay size grew to more than 20,000 (e.g., 20,059 on ARD on 6/18 and 22,152 on Cuatro on 6/9). As opposed to the overlay size, the average sharing ratio tends to stay flatter throughout the month. Occasional spikes in the sharing ratio all occurred between 2AM and 6AM (GMT), when the channel usage is very low, and therefore may be considered statistically insignificant.

Segmenting a 24-hour day into two time periods, off-peak hours (0AM-6PM) and peak hours (6PM-0AM), Table III shows the average sharing ratio in the two time periods separately. Zattoo’s usage during peak hours typically accounts for about 50% to 70% of the total usage of the day. According to the table, the average sharing ratio during peak hours is slightly lower than, but not very different from, that during off-peak hours. The Cuatro channel in Spain exhibits a relatively lower sharing ratio than the two other channels. One bandwidth test site [6] reports that the average uplink bandwidth in Spain is about 205 kbps, which is much lower than in Germany (582 kbps) and Switzerland (787 kbps). The lower sharing ratio on the Spanish channel may reflect regional differences in residential access network provisioning. The balance of the required bandwidth is provided by Zattoo’s Encoding Server and Repeater nodes.

TABLE III
AVERAGE SHARING RATIO.
Channel    Off-peak    Peak
ARD        0.335       0.313
Cuatro     0.242       0.224
SF2        0.277       0.222

B. Channel Switching Delay

When a user clicks on a new channel button, it takes some time (a.k.a. channel switching delay) for the user to be able to start watching streamed video on the Zattoo player. The channel switching delay has two components. First, the Zattoo player needs to contact other available peers and retrieve all required sub-streams from them. We call the delay incurred during this stage “join delay.” Once all necessary sub-streams have been negotiated successfully, the player then needs to wait and buffer the minimum amount of stream (e.g., 3 seconds) before starting to show the video to the user. We call
the resulting wait time “buffering delay.” The total channel switching delay experienced by users is thus the sum of join delay and buffering delay. Channel switching delay on PPLive is reported to be around 20 to 30 seconds, and can be as high as 2 minutes, of which join delay typically accounts for 10 to 15 seconds [7].

Fig. 5. Overlay size and sharing ratio versus day in June 2008: (a) ARD, (b) Cuatro, (c) SF2.

Fig. 6. CDF of user arrival time (arrival hour), from feedback logs and from the session database: (a) ARD, (b) Cuatro, (c) SF2.

We measure the join delay experienced by Zattoo users from the feedback logs described in Table II. Debugging information contained in the feedback logs tells us when the user clicked on a particular channel, and when the player successfully joined the P2P overlay and started to buffer content. One concern in relying on user-submitted feedback logs to infer join delay is the potential sampling bias associated with them. Users typically submit feedback logs when they encounter some kind of error, and that brings up the question of whether the captured sessions are representative samples to study. We attempt to address this concern by comparing the data from feedback logs against those from the session database. The latter captures the complete picture of users’ channel watching behavior, and therefore can serve as a reference. In our analysis, we compare the user arrival time distribution obtained from the two data sets. For fair comparison, we used a subset of the session database which was generated during the same period when the feedback logs were collected (i.e., from June 20th to 29th).

Fig. 6 plots the CDF of user arrivals per hour obtained from the feedback logs and the session database separately. The steep slope of the distributions during hours 18-20 (6-8PM) indicates the high frequency of user arrivals during those hours. On ARD and Cuatro, the user arrival distributions inferred from feedback logs are almost identical to those from the session database. On the other hand, on SF2, the distribution obtained from feedback logs tends to grow slowly during early hours, which indicates that the feedback submission rate during off-peak hours on SF2 was relatively lower than normal. Later during peak hours, however, the feedback submission rate picks up as expected, closely matching the actual user arrival rate. Based on this observation, we argue that feedback logs can serve as representative samples of daily user activities.

Fig. 7 shows the CDFs of channel join delay for the ARD, Cuatro and SF2 channels. We show the distributions for off-peak hours (0AM-6PM) and peak hours (6PM-0AM) separately. Median channel join delay is also presented in a similar fashion in Table IV. According to the CDFs, 80% of users experience less than 4 to 8 seconds of join delay, and 50% of users even less than 2 seconds of join delay. Also, Table IV shows that even a 10-fold increase in the number of concurrent users during peak hours does not unduly lengthen the channel join delay (up to 22% increase in median join delay).

TABLE IV
MEDIAN CHANNEL JOIN DELAY.
           Median join delay        Maximum overlay size
Channel    Off-peak    Peak         Off-peak    Peak
ARD        2.29 sec    1.96 sec     2,313       19,223
Cuatro     3.67 sec    4.48 sec     2,357       8,073
SF2        2.49 sec    2.67 sec     1,126       11,360

C. Repeater Node Assignment

As illustrated in Fig. 5, the live coverage of Euro Cup games brought huge flash crowds to the Zattoo system. With typical users contributing only about 25–30% of the average streaming bitrate, Zattoo must subsidize the balance. As described in Section III, Zattoo’s Subsidy System assigns Repeater
nodes to channels that require more bandwidth than their own globally available aggregate upload bandwidth.

Fig. 7. CDF of channel join delay (seconds), off-peak and peak hours: (a) ARD, (b) Cuatro, (c) SF2.

Fig. 8. Overlay size and channel provisioning (number of Repeaters assigned) versus hour of day: (a) ARD, (b) Cuatro, (c) SF2.

Fig. 8 shows the Subsidy System assigning more Repeater nodes to a channel as flash crowds arrived during each Euro Cup game and then gradually reclaiming them as the flash crowd departed. For each of the channels reported, we chose the date with the biggest flash crowd on the channel. The dip in overlay sizes on the ARD and Cuatro channels occurred during the half-time break of the games. The Subsidy Server was less aggressive in assigning Repeater nodes to the ARD channel, as compared to the other two channels, because the Repeater nodes in the vicinity of the ARD server have higher capacity than those near the other two.

V. CLIENT-SIDE MEASUREMENTS

To further study the P2P overlay beyond details obtainable from aggregated session-level statistics, we run several modified Zattoo clients which periodically retrieve the internal states of other participating peers in the network by exchanging SEARCH/JOIN messages with them. After a given probe session is over, the monitoring client archives a log file from which we can analyze the control/data traffic exchanged and the detailed protocol behavior. We ran the experiment during Zattoo’s live coverage of Euro 2008 (June 7th to 29th). The monitoring clients tuned to game channels from one of Zattoo’s data centers located in Switzerland while the games were broadcast live. The data sets presented in this paper were collected during the coverage of the championship final on two separate channels: ARD in Germany and Cuatro in Spain. Soccer teams from Germany and Spain participated in the championship final.

As described in Section II-A, Zattoo’s peer discovery is guided by peers’ topology information. To minimize potential sampling bias caused by our use of a single vantage point for monitoring, we assigned an “empty” AS number and country code to our monitoring clients, so that their probing is not geared towards those peers located in the same AS and country.

A. Sub-Stream Synchrony

To ensure good viewing quality, a peer should not only obtain all necessary sub-streams (discounting redundant sub-streams), but also have those sub-streams delivered temporally synchronized with each other for proper online decoding. Receiving out-of-sync sub-streams typically results in a pixelated screen on the player. As described in Sections 1 and II-C, Zattoo’s protocol favors sub-streams that are relatively in-sync when constructing the PDM, and continually monitors the sub-streams’ progression over time, replacing those sub-streams that have fallen behind and reconfiguring the PDM when necessary. In this section we measure the effectiveness of Zattoo’s Adaptive PDM in selecting sub-streams that are largely in-sync.

To quantify such inter-sub-stream synchrony, we measure the difference in the latest (i.e., maximum) packet sequence numbers belonging to different incoming sub-streams. When a remote peer responds to a SEARCH query message, it includes in its SEARCH reply the latest sequence numbers that it has received for all sub-streams. If some sub-streams happen to be lossy or stalled at that time, the peer marks such sub-streams in its SEARCH replies. Thus, we can inspect SEARCH replies from existing peers to study their inter-sub-stream synchrony.
[Fig. 9. Sub-stream synchrony. (a) CDF for number of bad sub-streams (x-axis: Number of Bad Sub-streams, 0 to 16). (b) CDF for sub-stream synchrony (x-axis: Sub-stream Synchrony, # of Packets, 0 to 1000). Curves in both panels: Euro 2008 final on ARD; Euro 2008 final on Cuatro.]

In our experiment, we collected SEARCH replies from 4,420 and 6,530 distinct peers on the ARD and Cuatro channels respectively, during the 2-hour coverage of the final game. From the collected SEARCH replies, we check how many sub-streams (out of 16) are “bad” (e.g., lossy or missing) for each peer. Fig. 9(a) shows the CDF of the number of bad sub-streams. According to the figure, about 99% (ARD) and 96% (Cuatro) of peers have 3 or fewer bad sub-streams. The current Zattoo deployment dedicates k = 3 sub-streams (out of n = 16) to loss recovery. That is, given a segment of 16 consecutive packets, if a peer has received at least 13 packets, it can reconstruct the remaining 3 packets from the RS error correcting code (see Section 1). Thus, if a peer can receive any 13 sub-streams out of 16 reliably, it can decode the full stream properly. The result in Fig. 9(a) suggests that the number of bad sub-streams is low enough not to cause quality issues in the Zattoo network.

After discounting “bad” sub-streams, we then look at the synchrony of the remaining “good” sub-streams in each peer. Fig. 9(b) shows the CDF of the sub-stream synchrony in the two channels. Sub-stream synchrony of a given peer is defined as the difference between the maximum and minimum packet sequence numbers among all sub-streams, which is obtained from the peer’s SEARCH reply. For example, if some peer has sub-stream synchrony measured at 100, it means that the peer has one sub-stream that is ahead of another sub-stream by 100 packets. If all the packets are received in order, the sub-stream synchrony of a peer measures at most n − 1. If we received multiple SEARCH replies from the same peer, we average the sub-stream synchrony across all the replies. Given the 500 kbps average channel data rate, 60 consecutive packets roughly correspond to 1 second's worth of streaming. Thus, the figure shows that on the Cuatro channel, 20% of peers have their sub-streams completely in sync, while more than 90% have their sub-streams lagging each other by at most 5 seconds; on the ARD channel, 30% are in sync, and more than 90% are within 1.5 seconds. The buffer space of the Zattoo player has been dimensioned sufficiently to accommodate such a low degree of out-of-sync sub-streams.

[Fig. 10. Peer synchrony. CDF of relative peer synchrony (x-axis: Relative Peer Synchrony, # of Packets, -350 to 100). Curves: Euro 2008 final on Cuatro; Euro 2008 final on ARD.]

B. Peer Synchrony

While sub-stream synchrony tells us the stream quality different peers may experience, “peer synchrony” tells us how varied in time peers’ viewing points are. With small-scale P2P networks, all participating peers are likely to watch live streaming roughly synchronized in time. However, as the size of the P2P overlay grows, the viewing point of edge nodes may be delayed significantly compared to those closer to the Encoding Server. In the experiment, we define the viewing point of a peer as the median of the latest sequence numbers across its sub-streams. Then we choose one peer (e.g., a Repeater node directly connected to the Encoding Server) as a reference point, and compare other peers’ viewing points against the reference viewing point.

Fig. 10 shows the CDFs of relative peer synchrony. The relative peer synchrony of peer X is obtained by subtracting the reference viewing point from the viewing point of X. So a peer synchrony of -60 means that a given peer’s viewing point is delayed by 60 packets (roughly 1 second for a 500 kbps stream) compared to the reference viewing point. A positive value means that a given peer’s stream is ahead of the reference point, which could happen for peers that receive streams directly from the Encoding Server. The figure shows that about 1% of peers on ARD and 4% of peers on Cuatro experienced more than 3 seconds (i.e., 180 packets) of delay in streaming compared to the reference viewing point.
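
The measurements described in this section (bad sub-stream count, sub-stream synchrony, and relative peer synchrony) can be illustrated with a short Python sketch. It is not Zattoo's implementation: the function names, the use of `None` to represent a sub-stream marked bad in a SEARCH reply, the choice to compute the viewing point over good sub-streams only, and the sample sequence numbers are all assumptions made for illustration. Only n = 16, k = 3, and the conversion of roughly 60 packets per second at 500 kbps come from the text.

```python
from statistics import median

N_SUBSTREAMS = 16        # n: sub-streams per channel (from the text)
K_RECOVERY = 3           # k: sub-streams dedicated to loss recovery (from the text)
PACKETS_PER_SECOND = 60  # ~60 packets per second at 500 kbps (from the text)


def count_bad(latest_seqs):
    """Number of sub-streams marked bad (represented here as None)."""
    return sum(1 for s in latest_seqs if s is None)


def is_decodable(latest_seqs):
    """The full stream is decodable if at most k sub-streams are bad."""
    return count_bad(latest_seqs) <= K_RECOVERY


def substream_synchrony(latest_seqs):
    """Max minus min latest sequence number over the good sub-streams."""
    good = [s for s in latest_seqs if s is not None]
    return max(good) - min(good)


def viewing_point(latest_seqs):
    """Median latest sequence number (assumption: good sub-streams only)."""
    return median(s for s in latest_seqs if s is not None)


def relative_peer_synchrony(peer_seqs, reference_seqs):
    """Peer viewing point minus reference viewing point, in packets."""
    return viewing_point(peer_seqs) - viewing_point(reference_seqs)


def packets_to_seconds(packets):
    """Convert a packet offset to seconds at the assumed packet rate."""
    return packets / PACKETS_PER_SECOND


# Hypothetical SEARCH replies: one latest sequence number per sub-stream.
reference = [1000 + i for i in range(N_SUBSTREAMS)]
peer = [940 + i for i in range(N_SUBSTREAMS)]
peer[3] = None  # two sub-streams marked bad by the peer
peer[9] = None

lag = relative_peer_synchrony(peer, reference)
print(count_bad(peer), is_decodable(peer), substream_synchrony(peer))
print(lag, packets_to_seconds(lag))
```

In this made-up example the peer has 2 bad sub-streams, which is within the k = 3 recovery budget, and its good sub-streams span 15 packets, the n − 1 bound for in-order delivery. Its viewing point trails the reference by 60 packets, or about 1 second.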