VoIP in 3G Networks: An End-to-End Quality of Service Analysis

Renaud Cuny (1), Ari Lakaniemi (2)
(1) Nokia Networks, P.O.Box 301, 00045 Nokia Group, Finland, renaud.cuny@nokia.com
(2) Nokia Research Center, P.O.Box 407, 00045 Nokia Group, Finland, ari.lakaniemi@nokia.com

Abstract-- This paper presents the results of a Quality of Service (QoS) study for VoIP service over 3G WCDMA networks. An end-to-end simulation platform has been used for this purpose. The simulations have been run using the Adaptive Multi-Rate (AMR) speech codec at 12.2 kbit/s with a combination of the RTP, UDP and IPv6 protocols. The simulated transmission path includes two radio links (uplink and downlink), connected with a packet switched core network and UTRAN Radio Access Networks with several different radio transmission conditions. Furthermore, RObust Header Compression is applied in both radio links. The results include buffering statistics, end-to-end delay estimates, and packet loss statistics.

I. INTRODUCTION

During the last few years, voice over data network services have gained increased popularity. The quick growth of Internet Protocol (IP) based networks, especially the Internet, has directed a lot of interest towards Voice over IP (VoIP). VoIP technology has been used in some cases to replace traditional long-distance telephone technology, reducing costs for the end user. Naturally, to make VoIP infrastructure and services commercially viable, the Quality of Service (QoS) needs to be at least close to the one provided by the Public Switched Telephone Network (PSTN). On the other hand, VoIP associated technology will bring to the end user value added services that are currently not available in the PSTN.

On the other front, current developments in cellular radio network technologies are paving the way towards IP capable radio networks. The so-called Third Generation (3G) cellular networks, developed and standardized by the Third Generation Partnership Project (3GPP), will provide IP over wireless services, thereby also enabling VoIP. In current cellular systems, e.g. in GSM, the telephony service is based on a circuit switched approach. This service is currently highly optimized for transmission of voice, thereby providing good speech quality and good spectral efficiency. However, carrying VoIP will also be possible in 3G WCDMA networks, e.g. 3GPP release 5, and may be of special interest for mobile network operators for multiple reasons. Firstly, as the bandwidth for individual flows in the packet switched domain is not reserved in advance, the multiplexing effects should bring significant capacity savings. Secondly, VoIP service will be supported by the Session Initiation Protocol (SIP), which is a text-based protocol, similar to HTTP and SMTP, for initiating interactive communication sessions between users [1]. Such sessions can include voice, but also e.g. video, chat, interactive games, and virtual reality. Finally, the convergence towards packet switched and IP technology may convince mobile operators to go for solutions that are truly all-IP in order to simplify network interconnection and network management.

Naturally, for wide end user acceptance and deployment, the VoIP service is required to provide perceived voice quality similar to that provided by current highly optimized GSM networks. The challenges for achieving this include typical VoIP related QoS problems, such as packet loss, delay, and delay variation (i.e. jitter), as well as additional overhead brought by the VoIP protocol stack. Therefore the end-to-end VoIP QoS should be studied and evaluated carefully. As an example, it is likely that the packet switched technology, although managed by e.g. Differentiated Services [2], will generate more delay and jitter than the circuit switched technology. Further delay and jitter may be caused by the packet segmentation in the radio interface. Since the end-to-end delay is likely to be close to the maximum delay still providing acceptable conversational quality (around 250-300 ms [3]), extra attention needs to be paid to jitter: too much jitter for a voice stream may be problematic since basic jitter compensation methods may not apply very well or may have limited effects. So one important issue to investigate is whether the jitter in 3G networks will have a negative impact on the end-user perceived voice quality.

This paper is organized as follows. Section II presents the end-to-end VoIP simulator used for this study and describes each component of the tool in detail. Section III presents the simulation results, focusing on packet loss ratio and end-to-end delays. Finally, the conclusion in section IV summarizes the main findings of this study and points out the areas that could be investigated further.

II. END-TO-END VOIP SIMULATION

Protocols used by VoIP over 3G can be roughly divided into two categories: signalling related protocols and media related protocols. Although the signalling protocols, such as SIP, are a very important part of a VoIP system, in this study we concentrate only on media related protocols and transmission of media data.
To run the end-to-end simulations we developed a VoIP speech simulator application for modelling the telephony application and protocol layers from application down to IP and PDCP. The lower layers required for radio link and core network modelling were simulated using external simulation tools, and the resulting network conditions were applied in the VoIP speech simulator using error pattern files. The different components of the simulation chain are described in detail in the following subsections.

A. Speech application

On application level we assumed usage of the Adaptive Multi-Rate (AMR) speech codec, which is a mandatory codec for conversational speech services within 3G systems. For all simulation runs we selected the AMR 12.2 kbit/s mode with DTX functionality enabled, and employed the bandwidth efficient mode of the AMR RTP payload format. This implies that during talk spurts the source generates a 32-byte speech payload at 20 ms intervals, while due to DTX during silence periods we will have a 7-byte payload carrying a Silence Descriptor (SID) frame at 160 ms intervals.

We further assumed the typical VoIP protocol stack employing the Real-Time Transport Protocol (RTP) encapsulated in the User Datagram Protocol (UDP), which is further carried by IP. The combination of these protocols introduces a total of 40 bytes of header data when using IP version 4 (IPv4), and 60 bytes when using IP version 6 (IPv6). We selected IPv6, which has two implications: the size of an IP packet carrying one AMR frame will be either 92 bytes (speech) or 67 bytes (SID), and we need to enable the UDP checksum because the IPv6 header does not include a checksum of its own; instead, the most critical fields of the header are covered as part of the UDP pseudo header.

Protocol layers below IP follow the 3GPP release 5 specifications, as illustrated in Figure 1.

[Figure 1: 3GPP protocol stack across MS, UTRAN, 3G-SGSN and 3G-GGSN, connected over the Uu, Iu-PS, Gn and Gi interfaces. The MS runs Application, IP/PPP, PDCP, RLC, MAC and L1; UTRAN relays PDCP/RLC/MAC/L1 to GTP-U/UDP/IP/L2/L1; the 3G-SGSN relays GTP-U over UDP/IP/L2/L1; the 3G-GGSN terminates GTP-U below IP/PPP.]

B. Robust Header Compression (ROHC)

When operating in bandwidth limited 3G networks it is important to use the radio band as effectively as possible, and header overhead of up to 60 bytes can seriously degrade the spectral efficiency of a VoIP service over such a link. The RObust Header Compression (ROHC) protocol [4] has been developed to tackle this problem. ROHC provides link-based compression of IP/UDP/RTP headers, in the best case down to 1 byte. The effective compression makes use of the fact that the majority of the fields in the combined IP/UDP/RTP header either remain constant or change by a constant amount throughout a session. However, the maximum compression mentioned above can only be reached when imposing some limitations; a more typical compressed header size would be three or four bytes. The ROHC operation is based on synchronized compression (at the sender site) and decompression (at the receiver site) contexts. The decompression context is initialised by transmitting full IP/UDP/RTP headers in the beginning of the session. Irregularities in the transmitted stream, e.g. due to DTX operation or a lost packet, can also introduce compressed headers slightly larger than in the optimal state. In error prone transmission conditions a feedback mechanism is an important part of robust compression operation, enabling recovery in case the synchronization between compressor and decompressor is lost. The ROHC protocol was implemented in our simulator.

ROHC in R-mode is assumed on both radio links, providing a feedback mechanism to enable safe convergence to the optimal compression state. We also assume that the ROHC Context Identifier is transmitted as a part of the compressed packet. These settings imply that the minimum size of a compressed IP/UDP/RTP header is four bytes.

C. Radio network modeling

The model for the radio network included the actual radio link, processing in layers below PDCP and access transport in UTRAN. The radio link error patterns were prepared using a separate WCDMA system simulator. Three different radio conditions were investigated, introducing frame error rates (FER) of 1%, 3% and 5%. Additionally we included an error-free case in the set of simulation conditions. Different error patterns were prepared for uplink (UL) and downlink (DL), and the error patterns were obtained from a traced terminal that was moving along a predefined route.

For the UL radio network we assumed a processing and transport delay of 36 ms, and for the DL radio network the corresponding delay is 49 ms. Note that 36+49=85 ms is the lower limit for the time before the ROHC compressor can receive a feedback message from the decompressor regarding a specific packet. This delay is significant because in the beginning of a stream the ROHC decompressor context needs to be initialized by sending full headers, which will be sent until a feedback message indicating successful decompressor context initialization is received. A similar situation can also occur if the decompression context gets corrupted for some reason, e.g. a hard handover or an excessive amount of transmission errors. However, for this work we assumed that no ROHC decompressor context re-initialization is required during a session.
D. RLC, MAC and PDCP layers

The WCDMA unacknowledged radio mode is the natural choice for transmitting the VoIP packets over the radio link. This mode provides the possibility for segmentation and padding of IP packets into radio Transmission Time Intervals (TTIs) to make the best possible use of allocated radio resources. The radio bearer was configured for 16 kbit/s; with a TTI length of 20 ms this enables transmission of 40 bytes of user data at 20 ms intervals.

E. Packet switched domain modelling

In the packet switched domain we considered the following delay components: delay in the IP backbone, delay in the gateway elements (SGSN and GGSN) and delay in the IuPS interface. Typically, the backbone elements (IP routers) and gateways may introduce some jitter to VoIP traffic, depending on the load in the network. However, appropriate traffic prioritisation (e.g. based on Differentiated Services) can limit the queuing delay (and thus potential jitter) to specific values defined by the operator.

We modelled this kind of PS domain structure to generate a delay distribution file for a stream of 30 000 packets transmitted at 20 ms intervals. The resulting delay distribution is illustrated in Figure 2, and it introduces 19 ms average delay with 1.0 ms standard deviation. The minimum and maximum values for the delay are 12.4 ms and 23.7 ms, respectively.

Figure 2: Delay distribution in PS domain.

F. Buffering

Typically an audio playout device in the receiving terminal is synchronized to a local clock signal to make sure that there is always signal available for playback. In practice this implies that a new frame is required regularly at intervals determined by the frame rate. On the other hand, due to jitter the packets can arrive at the receiver at an irregular rate that is not synchronous to the playout. Therefore, buffering of speech packets is needed to ensure continuous data flow between asynchronous input and synchronous output. In VoIP this kind of jitter buffering plays an important part in the overall speech quality. The basic approach to jitter buffering is to wait for a predetermined time after the reception of the first packet before playing out the frame carried by this packet. The purpose of the playout delay is to allow some variation in the arrival times of subsequent packets. Frames arriving after their scheduled playout time are discarded, and from the speech decoder point of view they are lost frames. Naturally in this approach the predetermined buffering delay is the most important factor of the buffering performance: too short a buffering delay will risk buffer underflows when packets do not arrive in time due to jitter, while too long a buffering time introduces unnecessarily long delay and can also introduce buffer overflows.

However, for this study we configured the jitter buffer in the receiving terminal in such a way that no frames were discarded, neither due to late arrival nor due to buffer overflow. The main reason for this choice was the aim to concentrate on the QoS issues that are dependent on the network.

When considering VoIP traffic over a wireless 3G network, it is not sufficient to buffer only in the receiving terminal. Actually in this environment the most critical link between asynchronous input and synchronous output is between the PS core network and the DL radio network. At this point of the data path the units we are buffering are IP packets received from the packet switched core network, which will be forwarded to the radio path. Here we assume a slightly different buffering strategy from the one described above for jitter buffering in the receiving terminal: instead of relying on a long enough buffering delay we use a FIFO buffer with limited size (as number of packets in the buffer) and specify a maximum time a packet can be stored in the buffer. That is, if a predetermined number of packets are already stored in the buffer, a new incoming packet will be dropped. And if a packet has been waiting in the buffer for longer than specified by the discard timer, it will be dropped to avoid accumulating delay for subsequent packets. However, to make sure that the large packets required for ROHC initialization will get through without an unfeasibly large value for the discard timer, we made the assumption that a (tail of a) packet that has already been partially transmitted due to segmentation is never dropped even if the timer has elapsed.

In general it might not seem sensible to perform buffering in the transmitting terminal for a VoIP application. However, because the strictly limited radio bandwidth is allocated according to optimally compressed headers while ROHC initialisation requires transmission of full IP/UDP/RTP headers, we need to consider buffering prior to UL transmission as well. We apply a buffering mechanism similar to the one described for DL, i.e. we specify a fixed size FIFO buffer with a discard timer to make sure that this bottleneck does not cause unfeasibly long delay.
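The size-limited FIFO with a discard timer described in the buffering subsection can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the class name, method names and the buffer size and timer values in the usage lines are hypothetical (the paper does not state them), and the exemption for packets that have already been partially transmitted is not modelled.

```python
from collections import deque


class DiscardTimerFifo:
    """FIFO packet buffer with a packet-count limit and a per-packet discard timer."""

    def __init__(self, max_packets, discard_after_ms):
        self.max_packets = max_packets
        self.discard_after_ms = discard_after_ms
        self._queue = deque()        # entries are (arrival_ms, packet)
        self.dropped_overflow = 0    # packets rejected because the buffer was full
        self.dropped_expired = 0     # packets removed because they waited too long

    def _purge_expired(self, now_ms):
        # Arrival times are non-decreasing, so expired packets sit at the head.
        while self._queue and now_ms - self._queue[0][0] > self.discard_after_ms:
            self._queue.popleft()
            self.dropped_expired += 1

    def push(self, now_ms, packet):
        """Store an incoming packet; return False if it is dropped (buffer full)."""
        self._purge_expired(now_ms)
        if len(self._queue) >= self.max_packets:
            self.dropped_overflow += 1
            return False
        self._queue.append((now_ms, packet))
        return True

    def pop(self, now_ms):
        """Return the oldest packet that has not expired, or None if there is none."""
        self._purge_expired(now_ms)
        if not self._queue:
            return None
        return self._queue.popleft()[1]


if __name__ == "__main__":
    # Hypothetical settings: room for 2 packets, 50 ms discard timer.
    buf = DiscardTimerFifo(max_packets=2, discard_after_ms=50)
    buf.push(0, "pkt-0")
    buf.push(20, "pkt-1")
    buf.push(40, "pkt-2")   # buffer already holds 2 packets, so this one is dropped
    print(buf.pop(40), buf.dropped_overflow)
    print(buf.pop(100), buf.dropped_expired)  # pkt-1 waited 80 ms and is discarded
```

Dropping at the head on timer expiry keeps old packets from delaying the ones behind them, which is the stated purpose of the discard timer; the size limit bounds the worst-case queueing delay independently of the timer.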
G. Additional simulation settings

We used the same speech input sequence for all simulation runs. This speech sequence has a duration of approximately 6 minutes 30 seconds and is an excerpt of a real discussion, and therefore introduces a realistic structure of alternating talk spurts and silence periods. The speech is in Finnish and was recorded in a low-noise office environment. The observed speech activity is approximately 50%.

We also repeated all simulation scenarios ten times with different randomly selected starting points in the radio link error pattern files and in the PS domain delay distribution file to make sure that the results are not affected by some local anomaly in the simulated network conditions.

III. SIMULATION RESULTS

Since a fixed-delay jitter buffering scheme was assumed in the receiving terminal, the total end-to-end delay is fixed throughout the session. However, because of the TTI structure, packet segmentation at the RLC level and ROHC behaviour, the network delay on packet level is not fixed throughout the connection. Packets too large to be carried by a single TTI need to be segmented over several TTIs, thus introducing longer transmission delay over a radio link, in most cases in both uplink and downlink. Furthermore, some of the subsequent packets following the large packets are also segmented over two TTIs, although in principle they could fit into a single TTI, because they are not aligned with the TTI structure. The reason for this is that when these packets are obtained from the buffer, there is still some room in the tail of the current radio frame, and as much data as possible from the beginning of the next packet, if available, is carried there.

For these scenarios there are two causes for frame losses (packet losses): a packet can be lost on the radio path due to transmission errors, or a packet can be dropped due to buffering, either in the transmitting terminal, in the DL RNC or in the receiving terminal. We would like to point out one observation regarding frame losses: the observed packet loss rate in the radio link seems to be slightly higher than the nominal frame error rate specified for the error patterns over all radio FER conditions. Because of the segmentation, the loss of a single radio frame can cause the loss of two packets: when a radio frame carrying data from two separate packets is lost, both these packets will be unusable and will be dropped by the receiver.

The simulation results are summarized in Table 1. The results include packet loss rate (PLR) and buffering time statistics, as well as end-to-end delays in different scenarios. There is also a further breakdown of packet loss statistics into losses due to DL buffering and losses due to transmission errors on the radio path. Since we carry one AMR frame per packet, the FER at the speech decoder input equals the PLR. Note that losses in the UL terminal buffering are not presented in the table, but they are included in the total packet loss rate. Note also that the average network delay includes the jitter buffering time in the receiving terminal.

Table 1: Simulation results.

Radio link FER | PLR in DL buffer | PLR on radio | Total PLR | Avg. DL buffer delay | Avg. network delay
0%             | 0.02%            | 0%           | 0.06%     | 9.79 ms              | 221.96 ms
1%             | 0.02%            | 2.05%        | 2.08%     | 9.79 ms              | 221.96 ms
3%             | 0.02%            | 6.02%        | 6.08%     | 9.79 ms              | 221.96 ms
5%             | 0.02%            | 10.26%       | 10.31%    | 9.79 ms              | 221.96 ms

The overall frame error rate (FER) can be used as a rough objective speech quality estimate. Typically, with the AMR codec the speech quality can still be considered good when the FER is around 1-2%, but it should be noted that the distribution of frame losses also has an effect on the subjective speech quality.

IV. CONCLUSIONS

Our end-to-end Quality of Service analysis shows that 3GPP networks will be able to offer an adequate level of quality for Voice over IP (VoIP) services. The difference in QoS compared with current voice services technology (CS voice) is very small: the additional packet loss ratio introduced by packet switched characteristics is less than 1%, whereas the end-to-end network average delay is expected to be around 220 ms. The enabling features for the obtained quality level are summarized below:

  • WCDMA unacknowledged mode in radio.
  • ROHC at the PDCP layer, which allows usage of a limited bandwidth radio bearer (16 kbit/s).
  • Relevant buffering limits and discarding rules in the PDCP buffer (DL) and in the transmitting terminal to avoid potential cumulative delay and jitter.
  • Differentiated Services support in the core network and backbone to ensure minimal buffering delay in the packet switched domain.

Nevertheless there are a few other important aspects that require further investigation in order to determine if VoIP services will be quickly deployed in 3G networks.

1. The User Equipment (UE) may contribute to the mouth-to-ear delay: the processing time needed to compress the VoIP headers may not be negligible. Also, because the first few packets during the ROHC initialisation phase are transmitted with full headers, the UE may require a special buffering mechanism in order to minimize the delay.

2. The radio capacity needed to transfer VoIP flows is slightly higher than the capacity needed for sending circuit switched voice frames, even with header compression. A detailed analysis that takes pricing into account would be useful to determine if offering VoIP services in 3G networks is efficient and interesting from an operator business perspective.

ACKNOWLEDGEMENTS

The authors wish to thank Zhi-chun Honkasalo and Mattias Wahlqvist for their frequent feedback during this study. Mika Kolehmainen and Outi Hiironniemi also contributed to this work by providing support for radio link error and PS domain delay modelling.

REFERENCES

[1] IETF Session Initiation Protocol (SIP) Working Group, http://www.ietf.org/html.charters/sip-charter.html
[2] IETF Differentiated Services (DiffServ) Working Group, http://www.ietf.org/html.charters/diffserv-charter.html
[3] ITU-T Recommendation G.114, "One-way transmission time", 05/2000
[4] RFC 3095, "RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed", July 2001