

  1. Question(s): 17
Meeting, date: Luleå, March 2010
Study Group: 12
Working Party: 3
Intended type of document (R-C-TD): TD
Source: France Telecom
Title: Play-out buffer events as performance parameters
Contact: Jean-Raymond Louvion, Orange Labs, France. Tel: +33 2 9605 3707. Fax: +33 2 9605 1252. Email: jeanraymond.louvion@orange-
Contact: Pierre Boyer, Orange Labs, France. Tel: +33 2 9605 2239. Fax: +33 2 9605 1252. Email: pierre.boyer@orange-

ABSTRACT
This contribution addresses the impact of jitter-inducing networks (ATM, IP or Ethernet) on the behaviour of real-time applications (VoIP, IPTV, VoD) that implement de-jittering (playout) buffers on the receiving side. It proposes to complement the delay-related performance parameters already defined in Y.1540 and by the IETF with new ones based on events related to playout buffer behaviour.

1. Introduction
ITU-T Recommendation Y.1540 defines an end-to-end 2-point IP packet delay variation (PDV). It is based on the observation of corresponding IP packet arrivals at the ingress and egress MPs (e.g., MPDST, MPSRC). These observations characterize the variability in the pattern of IP packet arrival events at the egress MP and the pattern of corresponding events at the ingress MP with respect to a reference delay.

The delay variation of an individual packet is naturally defined as the difference between the actual delay experienced by that packet and a nominal (or reference) delay. The preferred reference (used in the Y.1541 IPDV objectives) is the minimum delay of the population of interest. This ensures that all variations are reported as positive values, and it simplifies reporting the range of variation (the maximum value of variation is equal to the range).
The preferred method (used in the Y.1541 objectives) for summarising the delay variation of a population of interest is to select upper and lower quantiles of the delay variation distribution and then measure the distance between those quantiles. For example, select the 1 − 10^-3 quantile and the 0 quantile (the minimum), make measurements, and observe the difference between the delay variation values at these two quantiles. This example would help application designers determine the de-jitter buffer size for no more than 0.1% total buffer overflow. This parameter is referred to as the PDV (Packet Delay Variation) and is a 2-point metric.

In addition to this definition, the IETF provides another one: Inter-Packet Delay Variation (IPDV), where the reference is the previous packet in the stream (according to the sending sequence), so that the reference changes for each packet in the stream. This form was called Instantaneous Packet Delay Variation in early IETF contributions, and is similar to the packet spacing difference metric used for interarrival jitter calculations in [RFC3550].
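The two definitions above can be illustrated with a minimal sketch. It assumes that one-way delays are already available as a list in sending order; the function names and the sample values are hypothetical and are not taken from Y.1540 or RFC 5481.

```python
import math

def pdv(delays):
    """Delay variation of each packet relative to the minimum delay."""
    d_min = min(delays)
    return [d - d_min for d in delays]

def quantile(values, q):
    """Empirical quantile: smallest sample v such that a fraction q of samples is <= v."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def ipdv(delays):
    """Delay variation of each packet relative to the previous packet."""
    return [cur - prev for prev, cur in zip(delays, delays[1:])]

# Hypothetical one-way delays in milliseconds, in sending order.
delays_ms = [20.0, 22.5, 21.0, 20.0, 27.0, 20.5]

pdv_ms = pdv(delays_ms)
# Distance between the 1 - 10^-3 quantile and the 0 quantile (minimum).
pdv_range_ms = quantile(pdv_ms, 1 - 1e-3) - quantile(pdv_ms, 0.0)
ipdv_ms = ipdv(delays_ms)

print(pdv_ms, pdv_range_ms, ipdv_ms)
```

Note that PDV values are never negative because the minimum is the reference, whereas IPDV values take both signs.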
  2. Although it relies on the observation of the arrival times of consecutive packets, IPDV is a 2-point metric since it is derived from delays between a source and a destination.

RFC 5481 gives a thorough comparison of these two definitions with respect to their suitability for:
• Inferring queue occupation on a path,
• Determining de-jitter buffer size,
• Composing values obtained on different sub-paths in order to derive values for the entire path,
• Designing application-layer FEC.

As noted above, both IPDV and PDV are 2-point metrics, i.e. they are obtained by comparing delays between two measurement points (e.g. a source and a destination). The clocks of these measurement points must be accurately synchronized. Several options are available (GPS, CDMA or NTP), which provide a relative accuracy of the order of 1 ms. Clock synchronization may be inconvenient or subject to appreciable errors. Round-trip measurements may give a cumulative indication of the delay variation present in both directions of the path. However, this solution is not satisfactory because delay distributions are rarely symmetrical, so it is difficult to infer much about the one-way delay variation from round-trip measurements.

This contribution proposes and describes new metrics based on events related to playout buffer behaviour (dry-out and overflow). These metrics are 1-point metrics, which avoids the clock synchronisation problem described above. In addition, these metrics are relevant to the needs of real-time applications (VoIP, IPTV, VoD), since they mimic the behaviour of the playout buffers used in these applications.

2. Dimensioning Playout Buffers (PoBs)
Real-time sources generate streams of audio signal or video images that are transmitted over packet-based networks as IP packets or Ethernet frames.
In order to smooth out the natural effect of asynchronous networks (the so-called delay variation or jitter), the arriving packets are temporarily stored in a PoB, i.e. a device designed to counter the jitter introduced by the network, until the moment the audio signal or the image needs to be delivered to the decoding scheme. In order to ensure a continuous playout of streaming audio or video, it is important to tune the PoB parameters such that, at the moment an audio sample or an image is to be played out, all of its packets reside in the buffer.

The implementation of a playout scheme for streaming media over a packetized network (IP or Ethernet) usually involves delaying the first packet of the stream at the PoB in the subscriber's STB for a sufficiently long period of time, the build-up time, so that the majority of the packet delays incurred in the network can be absorbed. Adaptive playout strategies, such as adapting the playout rate when the buffer is almost empty in an effort to avoid PoB starvation, are often included as well.

Packet loss at the PoB originates from two different types of events. On the one hand there may be overflow, when the PoB is full upon arrival of a new packet. On the other hand there may be underflow, when a packet arrives in the PoB after its designated playout time and is therefore lost. B. Steyaert et al. have studied and modelled the probabilities of these events, which should not exceed some target levels. Dimensioning the PoB amounts to expressing the required values of the build-up time and the buffer size in terms of these target levels.
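The two loss events described above can be reproduced with a small simulation. This is only a sketch under simplifying assumptions that are not stated in the source: packets are played out at a fixed period, they arrive in sending order, and none is lost in the network. It is not the analytical model of Steyaert et al., and the function and parameter names are hypothetical.

```python
def simulate_pob(arrivals, period, build_up, capacity):
    """Count (underflow, overflow) losses for a fixed-rate playout buffer.

    arrivals : arrival times of packets 0, 1, 2, ... (same unit as period)
    period   : playout period (one packet is played out per period)
    build_up : delay applied to the first packet before playout starts
    capacity : maximum number of packets the buffer can hold
    """
    playout_start = arrivals[0] + build_up
    buffered = []            # playout deadlines of the packets currently stored
    underflow = overflow = 0
    for seq, t in enumerate(arrivals):
        deadline = playout_start + seq * period
        # Packets whose deadline has passed have already left the buffer.
        buffered = [d for d in buffered if d > t]
        if t > deadline:
            underflow += 1   # arrived after its designated playout time
        elif len(buffered) >= capacity:
            overflow += 1    # buffer full upon arrival
        else:
            buffered.append(deadline)
    return underflow, overflow

# Hypothetical arrival times in milliseconds: one late packet, then a burst.
arrivals_ms = [0, 10, 38, 39, 40, 41, 60]
losses = simulate_pob(arrivals_ms, period=10, build_up=15, capacity=2)
print(losses)
```

Increasing `build_up` reduces underflow but keeps more packets in the buffer, which is why the build-up time and the buffer size have to be dimensioned together.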
  3. In the ATM context, similar work was done based on the 2-pt CDV (Cell Delay Variation), which was used to dimension the build-up time. It was shown that the build-up time could be estimated by a given quantile of the 2-pt CDV distribution. 1-pt delay variation may also be used to dimension the buffer size for applications that send a periodic flow (which is the case for most real-time applications). Even for variable bitrate applications, the output rate of the buffer should be adapted to the quantity of data arriving in the buffer (2 thresholds, EWMA,…). With PDV, it is difficult to analyse the effects of delay variation on PoB behaviour and to take into account network impairments (loss, reordering, path changes,…).

3. New metrics: PoB events
In terms of performance characterization, a Playout Buffer may be represented by 3 parameters: its buffer size, its initial delay and its service rate. The initial delay δ, also called build-up time, is the delay introduced by the buffer before playing out the first packet, in order to minimize the risk that the buffer dries out. The service rate T may be known by the destination application but it is usually not known inside the network. Therefore these metrics may be applied to packet flows containing a service clock. Well-known examples are RTP flows and MPEG video flows.

3.1. IP packets containing a clock
In RTP flows, packets contain a 4-byte timestamp which enables the receiver to play back the received samples at appropriate intervals. This feature also makes the protocol usable for variable bitrate flows.

MPEG2-TS flows are organized in streams corresponding to the different components of the audio-visual application (video, audio,…). Each stream is packetized into PES (Packetized Elementary Stream) packets, which are carried in 188-byte transport stream (TS) packets. These TS packets are then assembled in groups of 7 into 1316-byte IP packet payloads (1316 = 7 * 188).
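As an illustration of the two packet formats mentioned above, the following sketch reads the 32-bit timestamp from the fixed RTP header (RFC 3550) and splits a 1316-byte payload into seven 188-byte transport packets. The function names and the test data are hypothetical; RTP header extensions and CSRC lists are ignored.

```python
import struct

TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47

def rtp_timestamp(rtp_packet: bytes) -> int:
    """Return the 32-bit timestamp stored in bytes 4..7 of the RTP fixed header."""
    if len(rtp_packet) < 12:
        raise ValueError("an RTP fixed header is 12 bytes long")
    return struct.unpack_from("!I", rtp_packet, 4)[0]

def split_ts(payload: bytes) -> list:
    """Split a payload into 188-byte transport packets, checking the sync byte."""
    if len(payload) % TS_PACKET_SIZE:
        raise ValueError("payload is not a whole number of transport packets")
    packets = [payload[i:i + TS_PACKET_SIZE]
               for i in range(0, len(payload), TS_PACKET_SIZE)]
    if any(p[0] != TS_SYNC_BYTE for p in packets):
        raise ValueError("missing 0x47 sync byte")
    return packets

# Hypothetical RTP header: version 2, payload type 33, sequence 1,
# timestamp 90000, arbitrary SSRC.
header = struct.pack("!BBHII", 0x80, 33, 1, 90000, 0x1234)
payload = (bytes([TS_SYNC_BYTE]) + bytes(TS_PACKET_SIZE - 1)) * 7

print(rtp_timestamp(header + payload), len(payload), len(split_ts(payload)))
```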
Transport stream (TS) packets may contain, in their adaptation field, a Program Clock Reference (PCR) enabling the decoder to present synchronized content, such as audio tracks matching the associated video. Usually the PCR is embedded in the TS packets carrying the video stream.

3.2. PoB events
H.222.0 (Information technology – Generic coding of moving pictures and associated audio information: Systems) contains a lot of material which can be useful in defining the PoB behaviour.

In particular, Annex D/H.222.0 (Systems timing model and application implications of this Recommendation) gives some suggestions for implementing decoder systems to suit some typical applications. It makes use of the clock reference timestamps, which are samples of the system time clock and are applicable both to a decoder and to an encoder. They have a resolution of one part in 27 000 000 per second. As such, they can be used to implement clock reconstruction control loops in decoders with sufficient accuracy for all identified applications. In practice, a decoder's free-running system clock frequency will not match the encoder's system clock frequency, which is sampled and indicated in the PCR values. The decoder's system time clock can be slaved to the encoder's using the received PCRs. The prototypical method of slaving the decoder's clock to the received data stream is a phase-locked loop (PLL). This may be used to derive a generic PoB behaviour.
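The PCR on which such a clock recovery loop relies can be read from a transport packet as sketched below. The field layout follows the H.222.0 adaptation field (33-bit base counted at 90 kHz, 6 reserved bits, 9-bit extension counted at 27 MHz); the function name and the test packet are hypothetical, and the sketch does no further validation of the packet.

```python
SYSTEM_CLOCK_HZ = 27_000_000

def parse_pcr(ts_packet: bytes):
    """Return the PCR in 27 MHz ticks, or None if the packet carries no PCR."""
    if len(ts_packet) != 188 or ts_packet[0] != 0x47:
        raise ValueError("not a 188-byte transport packet")
    has_adaptation_field = ts_packet[3] & 0x20
    if not has_adaptation_field:
        return None
    adaptation_length = ts_packet[4]
    pcr_flag = adaptation_length >= 7 and (ts_packet[5] & 0x10)
    if not pcr_flag:
        return None
    raw = int.from_bytes(ts_packet[6:12], "big")   # 48 bits
    base = raw >> 15                               # upper 33 bits, 90 kHz units
    extension = raw & 0x1FF                        # lower 9 bits, 27 MHz units
    return base * 300 + extension

def make_test_packet(base: int, extension: int) -> bytes:
    """Build a hypothetical packet (PID 0x100) carrying only a PCR."""
    raw = (base << 15) | (0x3F << 9) | extension
    head = bytes([0x47, 0x01, 0x00, 0x30, 7, 0x10])
    return head + raw.to_bytes(6, "big") + bytes(188 - 12)

pcr_ticks = parse_pcr(make_test_packet(base=90_000, extension=0))
print(pcr_ticks, pcr_ticks / SYSTEM_CLOCK_HZ)
```

The difference between two successive PCR values, divided by 27 000 000, gives the interval in seconds that the encoder intended between the corresponding packets.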
  4. Annex J/H.222.0 (Interfacing jitter-inducing networks to MPEG-2 decoders) provides guidance and insight to entities concerned with sending system streams over jitter-inducing networks. Annex Q/H.222.0 (T-STD and P-STD buffer models for ISO/IEC 13818-7 ADTS) defines a buffer model for audio streams which may be used for modelling PoBs.

The PoB may be modelled in order to determine the occurrence of a degraded situation. Two undesirable situations may happen in PoB behaviour: the FIFO may dry out or the FIFO may overflow. Both situations lead to QoS degradations. In such a model, for each IP packet, the clock reference is extracted either from the PCR of one of the transport packets contained in the IP packet (in the case of MPEG2-TS) or from the RTP packet embedded in the IP packet.

4. Operational use of PoB events

4.1. Where should these measurements be performed?
These measurements could be performed at the source, i.e. at the output of a network head end, just before entering the IP network, before and/or after FEC capabilities (if any). Indeed, an operator should be in a position to check flows coming from service providers.

They could also be performed at network outputs (at DSLAMs or at routers connecting DSLAMs). Indeed, an operator should be in a position to check flows coming out of its own network and verify that they conform to QoS expectations.

They could also be performed at INIs (Inter Network Interfaces), i.e. at interfaces where different network operators interconnect. Indeed, each operator should be in a position to verify that a flow coming from another operator conforms to QoS expectations.

4.2. Statistics
Different statistics may be associated with PoB events, for example:
• The number of PCR packets finding a saturated FIFO
• The time when the first FIFO saturation is observed
• The number of PCR packets finding an empty FIFO
• The time when the first FIFO dry-out is observed
Other statistics are being defined in the Broadband Forum.

4.3. Examples
The following measurements have been obtained with a probe, named Amelie, designed by Orange Labs in Lannion in 2005. Amelie is a 2/4-port passive probe measuring the traffic passing on a 10/100/1000 Base-TX Ethernet copper or optical interface, and it complies with the 802.1Q (a.k.a. "VLAN tagging") standard. Amelie can be connected directly to a network equipment port mirroring an ad hoc selection of frames to be analyzed; alternatively, Amelie can receive the whole traffic carried by a network link via an optical splitter, the latter option being preferred for traffic rate and jitter measurements.

Amelie is hosted by a 64-bit workstation supporting dual processors. The operating system is based on a v2.6 Linux kernel. Amelie basically runs an "on-the-fly" time-stamping process on every incoming Ethernet frame. Amelie emulates the behaviour of the decoder in the Set Top Box (STB) when it smooths jitter out
  5. of the TV digital flow. Figure 1 below shows the empirical distribution of the waiting time of the MPEG2/4-TS packets in the end FIFO. Furthermore, Amelie studies the occurrence of FIFO underflow and overflow by processing the successive program clock references (PCR) embedded in the video elementary stream. It gives the time of first passage into the underflow and overflow states and the subsequent inter-arrival times of such unexpected events. End FIFO underflow and overflow induce image freezing and artefacts experienced by the client and must therefore be counted against the network performance.

The end FIFO behaviour assumes that values have been set for a build-up time and an MPEG packet delay variation tolerance. Default values are 3 ms for both. The digital TV flow bit rate is derived from the program clock references; thus, it can be either constant or variable and does not need to be specified.

Program clock references are computed by the encoder in a recursion loop involving successively every preceding packet. Amelie has no reliable information on which packets have been lost in the network, so it cannot alter the program clock references to take loss into account. Therefore, the queuing process in the end FIFO is only due to MPEG packet propagation delay variations throughout the network.

Figure 1 shows the empirical distribution of the end FIFO waiting time for an SD TV digital flow. It can be seen that this distribution spreads over [0, 6 ms], which is the full range of possible values set by the operator.
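The end FIFO emulation described above can be sketched as follows. This is not Amelie's actual algorithm, which the text does not detail. It assumes that the playout instant of each PCR packet is the arrival time of the first packet, plus the build-up time, plus the PCR interval elapsed since the first packet; that a negative waiting time is a dry-out; and that a waiting time above build-up time plus tolerance is a saturation. All names and sample values are hypothetical, and times are in microseconds.

```python
def pob_events(samples, build_up_us=3000, tolerance_us=3000):
    """Emulate an end FIFO from (arrival_time_us, pcr_us) pairs of PCR packets.

    Returns the waiting time of each packet and the PoB event statistics.
    """
    first_arrival, first_pcr = samples[0]
    stats = {"saturated": 0, "first_saturation": None,
             "empty": 0, "first_dry_out": None}
    waits = []
    for arrival, pcr in samples:
        playout = first_arrival + build_up_us + (pcr - first_pcr)
        wait = playout - arrival
        waits.append(wait)
        elapsed = arrival - first_arrival
        if wait < 0:
            # The packet arrives after its playout instant: FIFO dry-out.
            stats["empty"] += 1
            if stats["first_dry_out"] is None:
                stats["first_dry_out"] = elapsed
        elif wait > build_up_us + tolerance_us:
            # The packet arrives too early to be stored: FIFO saturation.
            stats["saturated"] += 1
            if stats["first_saturation"] is None:
                stats["first_saturation"] = elapsed
    return waits, stats

# Hypothetical PCR packets sent every 40 ms; the third is 4 ms late,
# the fourth is 4 ms early.
samples = [(0, 0), (40_000, 40_000), (84_000, 80_000),
           (116_000, 120_000), (160_500, 160_000)]
waits, stats = pob_events(samples)
print(waits, stats)
```

With the default values of 3 ms for both parameters, waiting times that raise no event lie in [0, 6 ms], which matches the range quoted above.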
  6. Figure 1 - End FIFO waiting time empirical distribution for an SD TV digital flow

In this measurement, 2 728 731 PCR packets have been observed. Less than 1% (0.78%) of PCR packets experience a waiting time in the FIFO lower than 537.9 µs and less than 1% (0.78%) of PCR packets experience a waiting time in the FIFO greater than 5 828 µs. No program clock discontinuity has been observed.

The number of PCR packets finding a saturated FIFO is equal to 961. The time when the first FIFO saturation is observed is equal to 17 442 s. The number of PCR packets finding an empty FIFO is equal to 2. The time when the first FIFO dry-out is observed is equal to 55 981 s.

In addition to these measurements, it may be interesting to compare the respective densities of the Delay Factor and of the end FIFO waiting time. Indeed, the Delay Factor does not take into account the behaviour of the specific algorithm (PLL: Phase-Locked Loop) which is implemented in the decoder to slave its timing to the encoder. By contrast, the emulation of the end FIFO addresses the case where the operational limits of this algorithm are reached in the Set Top Box due to clock inaccuracy or a network shortage of bandwidth.

5. References
ITU-T Recommendation I.356 (2000), B-ISDN ATM layer cell transfer performance.
ITU-T Recommendation Y.1540 (2007), Internet protocol data communication service – IP packet transfer and availability performance parameters.
ITU-T Recommendation H.222.0 (2006), Information technology – Generic coding of moving pictures and associated audio information: Systems.
IETF RFC 5481 (March 2009), A. Morton, B. Claise, Packet delay variation applicability statement.
B. Steyaert, K. Laevens, D. De Vleeschauwer, H. Bruneel, Analysis and design of a playout buffer for VBR streaming video, Ann Oper Res (2008) n° 162.