Bit Allocation for Scalable Video Streaming over
                    Mobile Wireless Internet
description of WMSTFP and its preliminary simulation results           II. ARCHITECTURE FOR END-TO-END SCALABLE
  first ap...
status using Reed-Solomon codes. This is because different              feeds back the measured parameters (RTT, loss rati...
with all (Ts, Tr) values, we find the computing result of all the          IV. LOSS DIFFERENTIATED R-D BASED BIT
  Ts and ...
the available network status (e.g., bandwidth, different loss                 a video clip, Foreman sequence, with the fix...
where Dl is the distortion when the bitstream is truncated at the                               this problem using optimiz...
where t = (n-k)/2. Pcongestion and Pwireless are respectively the                   throughput of WMSTFP connections as ...
results show that WMSTFP can compete fairly with TCP under             Fair comparison of PSNR among the above three schem...
Table I depicts the comparison results of the average PSNR               Figure 9 shows the reconstructed frames of Forema...
module in the end-to-end streaming architecture. Since                         [10] J. Liu, I. Matta, and M. Crovella, “En...
Upcoming SlideShare
Loading in …5

Bit Allocation for Scalable Video Streaming over Mobile ...


Published on

  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Bit Allocation for Scalable Video Streaming over Mobile ...

  1. 1. Bit Allocation for Scalable Video Streaming over Mobile Wireless Internet Fan Yang1*, Qian Zhang2, Wenwu Zhu2, Ya-Qin Zhang2 1 Dept. of Computer Science, Nanjing University,No. 22, Hankou Road, Nanjing, 210093, China 2 Microsoft Research Asia, No. 49, Zhichun Road, Haidian District, Beijing, 100080, China email:, {qianz,wwzhu, yzhang} Abstract—With the convergence of wired line Internet and channel errors. This will further increase the dramatic variation mobile wireless networks as well as tremendous demand on video of bandwidth and delay in wireless network. In mobile wireless application in mobile wireless Internet, it’s essential to design an Internet, all the above characteristics bring great challenges for effective video streaming protocol and an efficient resource video streaming. We argue that video streaming applications allocation scheme for video delivery over wireless Internet. In should be aware and adaptive to the change of network this paper, we employ WMSTFP, an end-to-end TCP-friendly conditions of the wireless Internet, i.e., quality of service (QoS) multimedia streaming protocol, to detect the status of the wired fluctuations, so as to get good video quality. This adaptation and wireless part of the wireless Internet, where only the last hop consists of network adaptation and media adaptation. is wireless link. By accurately distinguishing the packet losses due to transmission errors from the congestive losses and smoothing Network adaptation refers to how many network resources out the pathologic round-trip-time values caused by the highly (e.g., bandwidth) a video streaming application should utilize dynamic wireless environment, in WMSTFP higher throughput for video content, resulting in designing an adaptive streaming in wireless Internet can be achieved and rate can be adjusted in a protocol for video transmission. Usually a media streaming smooth and TCP-friendly manner. Based upon WMSTFP, we protocol chooses its transmission speed in accordance with the propose a novel loss pattern differentiated bit allocation scheme measured packet loss ratio and round trip time (RTT). while applying unequal loss protection (ULP) for scalable video Therefore, the correct network adaptation relies on the accurate streaming over wireless Internet. Specifically, a Rate-Distortion measurement of packet loss ratio and round-trip-time. However, (R-D) based bit allocation scheme which considers both wired it’s not easy to accurately measure the two parameters in and wireless network status is proposed to minimize the expected end-to-end distortion. The optimal solution of global optimization wireless Internet as it is in wired-line networks. Firstly, in for the bit allocation scheme is obtained by a local search wireless Internet scenario, the packet loss, which is considered algorithm taking the characteristics of Progressive Fine to be a signal of network congestion, can be caused by either Granularity Scalable (PFGS) video into account. Analytical and network congestion occurred in wired networks or transmission simulation results demonstrate the effectiveness of our proposed errors occurred in wireless links. Although we can install an schemes. agent at the edge of wired and wireless network to discriminate congestive losses from erroneous losses [3-7], the additional intermediate agents introduce the deployment issue for network I. INTRODUCTION operators. We could also adopt some heuristic scheme such as Streaming video over Internet has been very active for the inter-arrival time and packet pair to differentiate packet loss past several years [1] and has been commercialized. Recent type from end hosts [8-11]. But the accuracy of the loss type advances in wireless communication technology and the differentiation is doubtful. Secondly, in order to minimize the increasing computing capability of mobile devices make reverse traffic, many streaming protocols acknowledge only streaming video over wireless of great interest [2]. As different one packet to measure the RTT within a pre-determined period types of wireless networks are converged into all IP networks of time. However in wireless environment, the aforementioned and into the Internet, it’s important to study video streaming bandwidth and delay fluctuation cause a dramatic variation of over the converged wireless Internet. RTT value. That is to say, the rate estimation based upon RTTs may not be able to reflect the correct status of the wireless It is well known that bit error rate (BER) of wireless networks and could suffer from great jitters. Another important network is much higher than that in wired-line links. Moreover, issue is that the designed streaming protocol should be friendly the varying wireless environment also results in dramatic to TCP protocol that is widely used in today’s Internet [12, 13]. fluctuation of network bandwidth and delay. Specifically, the bandwidth and delay may vary with the changing distance We employ WMSTFP, the end-to-end TCP-friendly between the base station (or access point) and mobile host. multimedia streaming protocol for a network where only the Furthermore, link layer error control scheme like automatic last hop is wireless link, to address the above issues. WMSTFP repeat request (ARQ) is widely used to overcome the wireless can differentiate the loss type in an end-to-end manner based upon the link layer information and can characterize the packet * Work performed when Fan Yang was a visiting student at Microsoft loss behaviors taking the loss burstiness into account. Research Asia. Moreover, it can filter out the abnormal RTT values caused by the highly varying wireless environment. Note that a 0-7803-8356-7/04/$20.00 (C) 2004 IEEE IEEE INFOCOM 2004
  2. 2. description of WMSTFP and its preliminary simulation results II. ARCHITECTURE FOR END-TO-END SCALABLE first appeared in [14], we regard it in this paper as a building VIDEO STREAMING OVER WIRELESS INTERNET block in the architecture for scalable video streaming over Figure 1 depicts a general scenario for video streaming over wireless Internet and practically evaluate its effectiveness using wireless Internet. In this scenario, streaming server resides in scalable video codec. the Internet and streaming client accesses to the Internet Media adaptation is about how to control bit-rate of a video through the last mile wireless connection. Since wireless access stream and adjust error control behaviors according to the and Internet backbone have their individual network condition, varying wireless Internet conditions. Forward Error Correction usually it is difficult to achieve the optimal end-to-end (FEC) and ARQ are two well-studied error control techniques. perceived media quality. In this work, streaming protocol and Of these two error control mechanisms, FEC has been resource allocation are studied in the sender and receiver to commonly suggested for real-time applications due to strict achieve such a goal. time constraints and the error-tolerance nature of media streams. Studying how to add FEC to scalable video coding is of great Wireless interest recently because of scalable video’s scalability in Access bitstream and therein friendliness to network [15]. A scalable encoder usually encodes a raw video sequence into base layer, Internet Base Station Streaming (Access Point) which has more important data, and multiple enhancement Server Streaming layers. Because different part of the compressed video Client bitstream has different importance, naturally for scalable video, we can use FEC codes to protect them with different strength, i.e., unequal error protection. However, to decide how many Figure 1. General illustration of video streaming over wireless Internet. FEC codes we should allocate to different part of video bitstream is a challenging task in wireless Internet environment. Figure 2 depicts the detailed diagram of our system The recent work on unequal error protection and bit-allocation architecture for end-to-end scalable video streaming over for video delivery over wireless network and wired network wireless Internet. The key components in this architecture can be found in [16-18] and [19-21], respectively. But in their consist of WMSTFP Congestion Control and WMSTFP bit allocation schemes, they treat the erroneous losses and the Network Monitor performed by our designed media streaming congestive losses the same and only consider one type of protocol WMSTFP, Network-adaptive ULP Channel Encoder, packet loss. As stated above, in wireless Internet environment and Loss Differentiated R-D Based Bit Allocation. Here we the packet losses consist of both congestive losses and adopt PFGS as an example of layered scalable video. erroneous losses, which in turn have different loss patterns in wireless and wired network parts. As different loss patterns PFGS Source Encoder Network-adaptive ULP Channel Encoder lead to different perceived QoS at application level [22], the Internet aforementioned bit allocation schemes can not achieve good Loss Differentiated video quality since the joint effects of their loss patterns have R-D Based Bit Allocation WMSTFP Congestion Control Base Station not be accurately reflected. Data Flow To address those issues in media adaptation, we propose a Control Flow WMSTFP WMSTFP Network Mointor Channel Decoder PFGS Source Decoder loss differentiated rate-distortion (R-D) based bit allocation scheme for Progressive Fine Granularity Scalable (PFGS) video [23] streaming over wireless Internet while employing a Figure 2. System architecture for scalable video streaming over wireless Internet. network adaptive unequal loss protection (ULP) scheme. More specifically, we notice that the packet losses due to network WMSTFP Congestion Control and WMSTFP Network congestion and those caused by wireless errors result in Monitor provide network adaptation at end hosts, which mainly different perceived QoS in video streaming. Consequently, our deal with probing and estimating the dynamic network proposed bit allocation scheme distributes the available bit conditions using WMSTFP. Specifically, the WMSTFP resource between source bits and channel protection bits by Congestion Control module adjusts sending rate on the sender taking the loss pattern difference between congestive packet side based on the feedback information, and the WMSTFP losses and erroneous packet losses, differentiated by WMSFTP, Network Monitor module on the receiver side analyzes the into account so as to improve the end-to-end video quality. erroneous loss and congestive loss rate caused in wireless The rest of this paper is organized as follows. In Section II, connection and wired connection and estimates the end-to-end we propose an end-to-end architecture for scalable video available network bandwidth. The control data consisting of the streaming over wireless Internet. We present a TCP-friendly estimated network bandwidth and other related network status streaming protocol for video over wireless Internet in Section parameters, such as congestive packet loss rate and erroneous III. In Section IV, network adaptive unequal packet loss packet loss rate, and smoothed packet transmission time, are protection and rate-distortion based bit allocation are discussed. fed back to the sender. Section V gives simulation results and Section V concludes this Network-adaptive ULP Channel Encoder module protects paper. different layers of PFGS against congestive packet losses and erroneous losses according to their importance and network 0-7803-8356-7/04/$20.00 (C) 2004 IEEE IEEE INFOCOM 2004
  3. 3. status using Reed-Solomon codes. This is because different feeds back the measured parameters (RTT, loss ratio, etc.) to parts of PFGS video bitstream have different quality impact, the sender every round trip time. Based upon the feedbacks thus it’s desirable to add different protection level according to from the receiver, the sender adjusts its transmission rate TCP- the importance of the scalable video bitstream. friendly. The whole process of the protocol consists of the following steps: (1) measuring loss rate (congestive and Loss Differentiated R-D Based Bit Allocation module erroneous); (2) measuring round trip time (RTT) and performs media adaptation control to make the total sending retransmission time out (RTO); (3) computing available rate adapt to the estimated network conditions. Specifically, network bandwidth and adjusting the sending rate. based on the feedback information from the receiver, this module on the sender side distributes the total sending rate At beginning of the connection, the sender adopts a slow between video bit rate and error protection rate according to the start algorithm to probe available bandwidth. The slow start available bandwidth and different packet loss conditions in stops right after a congestive packet loss happens. After slow wired and wireless connections. start, the sender adjusts its sending rate based on the congestive packet loss ratio, RTT and RTO. The estimated packet error III. MULTIMEDIA STREAMING TCP-FRIENDLY ratio is reported to multimedia application for QoS adaptation. PROTOCOL FOR WIRELESS INTERNET During the whole procedure, the receiver measures the incoming packets and estimates the wireless Internet status. For video streaming application, it is desirable to adjust its Specifically, based on the sequence number of the successfully transmission rate according to the perceived congestion level in received packets and the wireless channel status information, the network to maintain a suitable loss level and fairly share the receiver calculates the congestive packet loss ratio and bandwidth with other connections. Furthermore, it’s favorable wireless packet error rate. It also gets the timestamp for the streaming applications to be aware of the transmission information from every incoming packet and helps the sender quality of wireless channel to obtain good streaming quality by to estimate RTT more accurately. The estimated packet loss appropriate error protection. With the above considerations and ratio and the timestamp information are sent back to the sender taking the characteristics of wireless Internet environment into every round trip time. account, here we employ the TCP-friendly multimedia streaming protocol for mobile wireless Internet, WMSTFP, B. RTT and RTO estimation which consists of the features as follows. In traditional wired-line Internet, the round trip time will • Accurate loss differentiation. WMSTFP can accurately not fluctuate sharply within a relatively small time scale (e.g., detect packet losses caused by errors in wireless during a round). However, this statement cannot hold in channel using the information acquired at the link layer. wireless Internet due to the link layer ARQ mechanism adopted By jointly using the status information at link layer and in wireless links. In most cases, bursty variation and occasional the sequence number of incoming packets, we can deep fades in wireless channels result in spikes in the observed effectively differentiate different types of packet losses RTT values. in wireless Internet. In order to suppress the spikes in the measured RTTs • Forward loss ratio estimation. We observed that without causing extra traffic and power consumption in the packets had different loss patterns, i.e., different loss reverse path, we need to collect as many RTT observations as burstiness length in different types of networks. In this possible. We store Ts, the time a packet is being sent which is work, we use two Gilbert models to describe the piggybacked in the header of a data packet, and Tr, the time the burstiness of these two types of packet losses, packet is arriving, in an array at the receiver. The receiver respectively. Consequently, we can forwardly estimate sends a feedback every RTT. By calculating the difference packet loss ratio and packet error ratio. between Tr-Ts of the last packet and Tr-Ts of other packets in this round, the sender can not only measure the RTT value of • Smoothed RTT measurement. Since the noisy wireless the last packet triggering the feedback, but also approximate channel introduces large delay variations, the packet the RTTs of other packets during the round. This method also round trip time will fluctuate sharply. Therefore, cancels the clock offset between the sender and receiver so that choosing a single packet as an observation sample can clock synchronizing between them is unneeded. not accurately reflect this variation. In WMSTFP, we propose a method to measure the "average" round trip Note that the array of (Ts, Tr) is a considerable overhead if time during a period of time. As a result, the rate it’s piggybacked in the feedback packet. To minimize the adjustment performs more smoothly while achieving traffic in the reverse path, we find out that, to get the final comparable throughput. estimated RTT value, all the RTTs in a round are iteratively fed into an EWMA (Exponentially Weighted Moving Average) In the following sub-sections, we will describe the basic filter as the traditional TCP protocol does. Mathematically, functionalities and processes of WMSTFP in detail. A. Sender and receiver functionality RTTn = α RTT n −1 + (1 − α ) RTTn , (1) WMSTFP consists of a sender part and a receiver part. The sender located in the wired network delivers data packets at a where α is set to 0.75 considering the need of real-time certain rate. The receiver monitors the incoming packets and applications, and RTT is the estimated RTT. Expanding (1) 0-7803-8356-7/04/$20.00 (C) 2004 IEEE IEEE INFOCOM 2004
  4. 4. with all (Ts, Tr) values, we find the computing result of all the IV. LOSS DIFFERENTIATED R-D BASED BIT Ts and Tr values can be obtained on the receiver side [14], ALLOCATION FOR SCALABLE VIDEO STREAMING which can be carried by the feedback packet. Therefore, the OVER WIRELESS INTERNET minimum overhead in reverse path is achieved. With the In this section, we introduce a network-adaptive ULP estimated RTT, we compute the retransmission time out (RTO) scheme for scalable video bitstream according to its importance following the instruction in [24]. The sender halves the of different layers. An R-D based bit allocation scheme is transmission speed if no feedback has arrived for a period of further proposed to decide how to distribute between source RTO time. bits and channel protection bits considering the packet loss In addition to the timestamp information, i.e. Ts, the latest patterns of both wireless erroneous packet loss and congestive estimated RTT value is also contained in the header of data packet loss. A major challenge in bit allocation over wireless packets to inform the receiver of the time to send feedbacks. Internet is that, in contrast to traditional wired line Internet, the perceived video quality is affected not only by the bandwidth C. End-to-end packet loss differentiation and measurement variation and the packet losses due to network congestion, but As stated in Section I, most current end-to-end methods to also the bandwidth fluctuation and the packet losses caused by differentiate packet losses suffer the flaw of inaccurate loss channel errors. type differentiation. In this work, we find out that it’s possible As mentioned earlier, different packet loss patterns cause to exploit the link layer information in wireless channels to different effects on upper applications [22, 26]. By our study, discriminate between the erroneous losses and congestive we found out that the packet losses caused by network losses. For example, in the third generation (3G) wireless congestion are independent of those due to wireless errors. communication system, the information provided by radio link Moreover, the packet loss patterns caused by those two reasons control layer (RLC) [25] can be used to deduce a packet loss are quite different. That is to say, the packet losses due to caused by transmission errors. According to the mechanism in network congestion and those caused by wireless errors result RLC layer, IP packets should be fragmented into several RLC in different perceived QoS in video streaming application. Thus frames before transmission and be reassembled on the receiver we need to treat those different loss patterns differently. We side. During this period, transmission fail of any RLC frame use two Gilbert models to measure the patterns of the two types will result in an assembly failure of the whole IP packet and the of packet loss. Based upon the two models, we can effectively consequent drop of this IP packet. Because the assembly failure perform bit allocation taking both the network congestion and notification can be obtained on the receiver side, we can use the wireless channel error into account. The details will be this information to infer whether a packet is dropped due to discussed subsequently. corruption. In conjunction with the sequence number of the adjacent correctly received packets on the receiver side, we can A. Network-adaptive ULP and loss pattern effects on the differentiate erroneous packets from the congestive lost perceived video quality packets. PFGS is a multilayer scalable video codec. It stores the In addition to the loss type differentiation, it's believed that most important information such as motion vector, etc. in the the loss pattern is also an essential factor that determines the base layer. Other information such as texture is bit-plane coded quality observed by users for real-time multimedia applications and stored in several enhancement layers. Different layers of [22, 26]. One of the most important metrics that characterize the same frame are correlated in PFGS. To be specific, the loss pattern is burst length of packet loss [26]. To capture the higher layer information relies on the corresponding ones in the length as well as the frequency of packet losses, we use two lower layers. The layer-dependency characteristic of PFGS Gilbert models to measure the two types of losses [27]. codec is well suited for prioritized transmission. Since the current Internet only provides best-effort service, prioritized When multiple packet losses occur where both erroneous transmission can be achieved by applying ULP scheme to packet and congestive lost packet co-exist, we can’t know different layers. In our work, ULP is achieved by protecting exactly which packets are lost due to congestion and which are different layers with different FEC codes. More specifically, lost caused by wireless errors merely based on sequence strong channel-coding protection is applied to the base layer number and link layer information. Here we use the arrival data stream while weaker channel-coding protection is applied time of the erroneous packets to derive the distribution of lost to the enhancement layer parts. packets among the erroneous packets between two back-to- back correctly received packets. The adjacent erroneous We use Reed-Solomon (RS) codes for FEC channel coding packets with larger time interval may contain more congestive across video packets. An RS code denoted as RS(n, k) lost packets. With this principle, we can identify the represents an n symbol length code which contains k source distribution of congestive lost packets among the erroneous symbols and n-k protection symbols (redundant data). packets [14]. Generally RS(n, k) can correct t = (n-k)/2 symbol errors. That is to say, if at least n-t out of n symbols are correctly received, After the estimation of the congestive packet loss ratio, the underlying source information can be correctly decoded. round trip time (RTT) and retransmit time out (RTO), we Otherwise, none of the lost symbols can be recovered by the compute the available bandwidth as described in [28]. With the receiver. available bandwidth, the sender can adapt its sending rate in an additive increase multiplicative decrease (AIMD) manner [14]. Now the issue remains is how to allocate target bits between source coding and channel protection coding based on 0-7803-8356-7/04/$20.00 (C) 2004 IEEE IEEE INFOCOM 2004
  5. 5. the available network status (e.g., bandwidth, different loss a video clip, Foreman sequence, with the fixed total target bit patterns for wireless and wired-line part). In general, when the rate over the network. The packet loss ratio of this network is network is in good status, i.e., packet loss ratio and error rate 5% and its packet loss burst length is 5. If one does not are low, more bit budget should be assigned for source coding consider this correlation of packet loss, that is to say, one treats and fewer bits should be assigned for channel coding. On the the packet loss randomly distributed, the perceived video contrary, when network condition is bad, it's necessary to quality will be largely degraded compared with the one that allocate more bits for channel coding, thus fewer bits should be takes this loss correlation into account. Figure 4 illustrates the allocated for source coding. This method enables us to add the 55th frame of the Foreman sequence delivered by the two necessary level of redundancy in channel coding while schemes. We can easily see that the quality of the right image providing optimal video quality. obtained by the scheme which doesn’t consider the loss pattern is very poor; while a significant better quality can be achieved by using the scheme with the knowledge of packet loss pattern. 0.9 0.8 B. Loss differentiated R-D based bit allocation Protection ratio 0.7 Taking different loss patterns in wireless and wired network 0.6 into account, in this subsection we present a loss differentiated 0.5 R-D based bit allocation scheme, which minimizes the Packet Loss Ratio 5% expected end-to-end distortion through optimally allocating bits 0.4 Packet Loss Ratio 3% between source data and channel data. The bit allocation 0.3 Packet Loss Ratio 1% problem in here is to find a solution of that how much 0.2 protection should be added to the source data so that the 1 2 3 4 5 decoder can efficiently recover the lost data dropped in the Packet Loss Burst length (pkt) intermediate router and the corrupted data dropped by wireless link during transmission while not wasting much bandwidth. Figure 3. Protection ratios under different burst lengths. In wireless Internet, the end-to-end distortion DT consists of source distortion Ds and channel distortion Dc. The source As mentioned above, different loss patterns have different distortion is caused by source coding such as quantization and impacts on the perceived QoS quality in video streaming. Next, rate control. The channel distortion occurs when a packet loss we experimentally study the effects of the perceptual video due to network congestion or errors in a wireless channel quality with different loss patterns in wireless Internet. Let the happens during transmission. Without losing generality, we can packet loss ratio of this network be fixed (1%, 3%, and 5% in assume that Ds and Dc are uncorrelated, thus the end-to-end our study) while the loss pattern, i.e., burst length of packet distortion can be presented as losses, varies. We add protection data by applying RS code to this video bitstream so that its target residual packet loss ratio DT = D s ( R s ) + D c ( R s , R c ) , (2) decreases to only 0.01%. Figure 3 shows the protection ratio comparison for different loss patterns of a video bitstream over where Ds is a function of source coding rate Rs, and Dc is a a network. Specifically, when packet loss ratio is 5%, if the function of Rs and channel coding rate Rc. In this work, we burst length of packet losses is 1, 38.9% protection data are address how to optimally distribute bits between source coding needed for video delivery. However, if the burst length of and RS coding based upon the above R-D function. packet losses is 5, 83.0% protection data are needed. This Specifically, the solution of bit allocation is periodically simple study shows that it is quite essential that we consider the performed according to the estimated network bandwidth, RT, different loss pattern effect on perceived video quality when we obtained by WMSTFP. Then, this problem becomes to allocate allocate bits over different types of networks, wireless and the available bit rate such that the optimal Rs and Rc are Internet, for video over wireless Internet. obtained by minimizing end-to-end distortion under the constraint Rs+Rc≤RT. That is, min DT = D s ( R s ) + Dc ( R s , Rc ) { Rs , Rc } (3) subject to R s + Rc ≤ RT . As PFGS is a layered scalable video codec, it can be truncated anywhere in enhancement layers. Assume the video bitstream can be cut by the source rate Rs and can be stopped at layer l, then the source distortion Ds can be expressed as Figure 4. Comparisons of the 55th reconstructed frame under different bit allocation schemes. ( Rl − R s ) Dl −1 + ( Rs − Rl −1 ) Dl Ds ( Rs ) = , (4) To further investigate the impact of accurately loss pattern ( Rl − Rl −1 ) estimation on the perceived video quality for PFGS, we deliver 0-7803-8356-7/04/$20.00 (C) 2004 IEEE IEEE INFOCOM 2004
  6. 6. where Dl is the distortion when the bitstream is truncated at the this problem using optimization methods such as Lagrange end of layer l with rate Rl. Because of the insertion of the multiplier or penalty function method. In this work, we search resynchronization markers and the similar statistical distortion the global optimized solution by using the policy that takes the property of the same enhancement layer [23], the distortion following characteristics of PFGS video into account: reduction at the same enhancement layer can be linearly calculated approximately. • According to the dependency between different layers, we allocate bits to higher layer if and only if all data in Note that we consider a wireless Internet connection where the lower layer are allocated. only the last hop is wireless. In this case, the channel distortion is caused by two parts: one is caused during the transmission • Since the lower layer video bitstream has higher over wired-line part of the connection, Dc,wired, and the other is importance, the channel protection level of higher layer caused during the transmission over the wireless channel, data can not exceed that of lower layer. Dc,wireless. Because the independence of these two distortions, The optimization algorithm is listed as follows. we represent the channel distortion as Step 1: Set the current minimum end-to-end distortion DT_min to Dc ( Rs , Rc ) = Dc , wired ( Rs , Rc ) + Dc , wireless ( Rs , Rc ) . (5) a very large value (e.g., 65025 in our work) and allocate all available bit budgets to source data. PFGS codec adopts bit-plane coding to encode the Step 2: Find the optimal solution based upon the current quantized DCT coefficients. Because the dependency among distribution of source and channel budget with the constraints the bit planes, any corrupted macro-block in the lower layers in that higher layer protection level can't exceed that of lower will cause the corresponding macro-blocks in the higher layers layer. Replace DT_min with the current calculated end-to-end un-decodable. Assume the number of packet needed to be sent distortion DT’ if DT_min > DT’. in the ith layer is ni, Dc,wired and Dc,wireless can be respectively Step 3: Decrease the bit budgets assigned to source bitstream. If represented as the current source distortion Ds is larger than distortion DT_min with the solution Rs_min and Rc_min or the source budget decreases to zero, the search ends and the global optimized l  ni  Dc ,wired ( Rs , Rc ) = ∑ ∑ Pc ,layer (i, j ) × Dc (i, j ) and solutions are found. Otherwise, go to Step 2. i =1  j =1  (6) The last problem we are tackling now is how to get l   ni Pc,packet,layer(i, j) and Pw,packet,layer(i, j). As mentioned before, to Dc ,wireless ( Rs , Rc ) = ∑ ∑ (1 − Pc ,layer (i, j )) × Pw,layer (i, j ) × Dc (i, j ), i =1  j =1 capture the loss patterns of wireless and wired-line networks,  we should consider the burstiness of packet loss when where Dc(i, j) is the distortion increment caused by the loss of calculating the packet loss ratio caused by congestion or the jth packet in the ith layer. Pc,layer(i, j) is the probability that wireless errors after FEC recovery. Frossard et al. [29] the jth packet in the ith layer is lost due to congestion while the proposed a method for calculating the packet failure probability corresponding packets in the lower layers are correctly received. after FEC recovery considering the packet loss ratio and Pw,layer(i, j) is the probability that the jth packet in the ith layer average burst length. For video packets protected by RS code is lost due to wireless errors while the corresponding packets in RS(n, k), similar to [29], we get the lower layers are all correctly received. Considering the dependency among different layers, Pc,layer(i, j) and Pw,layer(i, j) can be further represented as  pcongestion t  n−k k  1− pcongestion  ∑i • H(i, k) ∑H( j +1, n − k +1) + ∑i • H(i, k) + •  k i=0  j =t −i +1 i =t +1   k  (8)   k −1(k −i) • S(i, k)n−t −1Si( j +1, n − k +1) + k−t −(k − i) • S(i, k) − 1 i −1  ∑ ∑ ∑  t< k     P , packetlayer(i, j) =  i=k−t   Pc,layer (i, j) = P , packet,layer(i, j) • ∏  ∏ (1 − Pc, packet,layer(u, v)) and j =0 i =1 c , c  pcongestionk n−k u =1 v∈ωu  (7)  ∑i • H(i, k) ∑H( j +1, n − k +1)  k i=1 j =t −i +1 i −1    1− pcongestion −1k n−t −1−i Pw,layer (i, j) = Pw, packet,layer(i, j) • ∏  ∏ (1 − Pw, packet,layer(u, v)),  +  k ∑ (k −i) • S(i, k) ∑S( j +1, n − k +1) t≥ k  i=1 j =0 u =1 v∈ωu  and respectively, where Pc,packet,layer(i, j) and Pw,packet,layer(i, j) are respectively the probabilities of which the jth packet in the ith layer is lost due to congestion and wireless errors after FEC  pwireless  t n−k k  1 − pwireless (9) decoding. We will describe how to calculate these two  ∑i • H (i, k ) ∑ H ( j +1, n − k +1) + ∑i • H (i, k ) + •  k  i=0 j =t −i +1 i =t +1  k probabilities later. ωu is the set of the jth packet in the ith    (k − i) • S (i, k )  k −1 n−t −1−i k −t −1  i∑t ∑S ( j +1, n − k +1) + ∑(k − i) • S(i, k) t< k layer’s corresponding packets in the uth layer. Pw, packet,layer(i, j) =   =k − j =0 i =1   pwireless k n−k By substituting (4) ~ (7) into (3), the R-D based optimal bit  k ∑i • H (i, k ) ∑ H ( j +1, n − k +1) allocation for scalable video transmission can be obtained  i =1 j =t −i +1  1− pwireless k −1 n−t −1−i under the given total bit budget, RT. Note that (3) is a non-  + k ∑(k − i) • S(i, k) ∑S ( j +1, n − k +1) t ≥ k,  i =1 j =0 linear optimization problem with one constraint. We can solve 0-7803-8356-7/04/$20.00 (C) 2004 IEEE IEEE INFOCOM 2004
  7. 7. where t = (n-k)/2. Pcongestion and Pwireless are respectively the throughput of WMSTFP connections as T1w , T2w , ..., Tkw and w congestive and wireless packet loss ratio that can be obtained those of TCP connections as T1t , T2t , ..., Tkt , respectively. Then by the two Gilbert models described in Section III.C. H(m,n) t and S(m,n) denote the probability that m-1 packet losses occur the average throughputs of WMSTFP and TCP connections are in the next n-1 packets following one packet loss and the probability that m-1 packets are received in the next n-1 kw kt Ti w Tit packets following one received packet, respectively, which can TW = ∑i =1 and TT = ∑i =1 , (10) be calculated similarly as in [29], therein the Gilbert models are kw kt also used. respectively. The friendliness measure can be defined as V. SIMULATION RESULTS In this section, we implement our proposed end-to-end F = TW / TT . (11) video streaming architecture and the relevant algorithms. The purpose of this section is to demonstrate that (1) WMSTFP can achieve good performance in wireless Internet. Specifically, it Note that the closer to 1 the value of F is, the friendlier is friendly to TCP in wired line network, and can achieve WMSTFP is to TCP. higher throughput in varying wireless network; (2) our R-D To study the “rate smoothness” of WMSTFP connections, based bit allocation scheme, which differentiates different let R 1 i , R wi , ..., Rww denote the sending rates at different time 2 S packet loss patterns in wireless and wired connections, can w i adaptively adjust source and channel rates to the available instances 1, 2, ..., Sw of the ith WMSTFP connection and Rt1k , estimated bandwidth with good perceived video quality. Rt2 , ..., RtSt represent the sending rates at time instances 1, 2, ..., k k Senders 1 Mobile Host WMSTFP WMSTFP St of the kth TCP connection, respectively, then the sending Sender 2 Base Station Wireless rate variation of WMSTFP and TCP connections are, link (TCP) Receiver 2 (TCP) respectively, defined as …… Bottleneck Link …… Sender n-1 (TCP) Receiver n-1 Sw St Sender n (TCP) ∆ Wi = ∑ Rw − Rw−1 and ∆ Tk = ∑ Rt j − Rt j −1 . j j (12) (TCP) i i k k Receiver n j =1 j =1 (TCP) Figure 5. Simulation topology. The smoothness measure is then defined as We use the network simulator (NS) version 2 [30] to study the performance of the WMSTFP protocol and the network S = ∆ Wi / ∆ Tk . (13) adaptive bit allocation scheme for PFGS video streaming over wireless Internet. As shown in Figure 5, we deploy a dumbbell network topology with one congested link to simulate the S≤1 means the ith WMSTFP connection is smoother than the Internet traffic. The senders reside on one side of the bottleneck kth TCP connection. link and the receivers are on the other side. All links except for the bottleneck link are sufficiently provisioned to ensure that 2500 TCP network congestion only happens at the bottleneck link. All Throughput (kb) 2000 WMSTFP links are drop-tail links. To simulate the dynamically changing 1500 wireless environment, we apply a selective ARQ protocol to 1000 the wireless link shown in Figure 5. In the wireless channel, all IP packets are first segmented into several RLC frames and 500 then reassembled on the other side. Once the IP reassembly 0 failure on the receiver side occurs, it will be reported to 0 5 10 15 20 25 30 35 40 45 50 55 60 WMSTFP immediately so that it knows an erroneous IP packet Time (S) was dropped at the RLC layer. In the following simulations, the background traffic is composed of several infinite-duration TCP-like connections. Figure 6. Comparisons of throughput for TCP and WMSTFP connections. To evaluate the TCP friendliness of WMSTFP, we replace A. Performance of WMSTFP the wireless link in Figure 5 with a wired-line link and conduct Here we define TCP-friendly in wireless Internet as the a simulation under the wired-line networks. In the simulation, ability of being friendly to TCP in wired line network and we let several WMSTFP connections compete with other TCP achieving better performance than TCP in wireless part [14]. connections. We find out that the “friendliness measure” To get a metrics of the “TCP-friendliness” [24], let kw denote obtained using (11) is 0.98. Figure 6 depicts the simulation the total number of the WMSTFP connections and kt denote the results of the throughput for different connections. These total number of TCP connections. We also denote the 0-7803-8356-7/04/$20.00 (C) 2004 IEEE IEEE INFOCOM 2004
  8. 8. results show that WMSTFP can compete fairly with TCP under Fair comparison of PSNR among the above three schemes congested link. is an important issue. To get a fair protection level for the fixed ULP schemes, we first perform all the simulations of the AULP The smoothness measure calculated using (13) is 0.26 scheme and get the average protection ratio for base layer and between WMSTFP and TCP. This convincingly shows that enhancement layers. Then we apply those calculated average WMSTFP can adjust its sending rate in a smoother manner protection ratios to the fixed ULP schemes. than TCP. To evaluate the performance of WMSTFP in wireless 33 Internet, we compare the throughput of WMSTFP with that of 32 AULP-W TCP Friendly Rate Control (TFRC) [12]. Simulation results FULP-W 31 show that, when RLC frame error rate (FER) are set to 0.1, 0.2 FULP-T PSNR and 0.3, WMSTFP’s throughputs are 6%, 74% and 171% 30 higher than TFRC’s while the packet loss ratios between 29 WMSTFP and TFRC under the same FER only have 1% 28 difference. 27 We also disable the smoothed RTT estimation method of 300 400 500 600 700 800 900 1000 WMSTFP and compare its rate smoothness, throughput and congestive packet loss ratio with those of the original protocol Available Bandwidth (kbps) under different wireless channel conditions. The average rate (a) FER = 0.3 smoothness measure between the smooth-disabled protocol and the original one is 0.91, and the differences of the throughput 35 and the congestive packet loss ratio between them are 2% and 34 AULP-W 0.6%, respectively. These results show that, under varying 33 FULP-W wireless environment, WMSTFP can adapt its transmission rate FULP-T PSNR 32 in a smoother manner while maintaining comparable packet 31 loss ratio and throughput. 30 29 B. Performance of loss differentiated R-D based bit 28 allocation scheme 300 400 500 600 700 800 900 1000 To demonstrate the effectiveness of our proposed loss pattern differentiated R-D based bit allocation scheme over Available Bandwidth (kbps) wireless Internet, we conduct the simulation under wireless (b) FER = 0.2 Internet to test (1) Fixed ULP without loss pattern Figure 7. Comparison results of average PSNR of different tested schemes differentiation over TFRC which is denoted as FULP-T; (2) under different bit rates for Foreman sequence. Fixed ULP without loss pattern differentiation over WMSTFP which is represented as FULP-W; (3) Adaptive ULP over Figure 7 shows comparisons of the average PSNR with WMSTFP denoted as AULP-W which considers the burstiness different available bandwidth under different RLC FER. It can of both wireless and congestive packet losses. be seen that our proposed scheme achieves the best We perform simulations under varying bandwidth from performance under different available bandwidth and different 320kbps to 1Mbps and different frame error rate in the wireless quality of wireless channel. From Figure 7 (a), we observe that channel. The testing video sequence, Foreman, is used in the the higher available network bandwidth, the more efficient the simulations. Foreman is coded in CIF at a temporal resolution AULP-W scheme comparing with the other two schemes. This of 10 fps. The first frame is intra-coded and the remaining is because in our scheme we change the protection level frames are inter-coded. We use PSNR as a metric to measure according to the change of current available network video quality. For an eight-bit image with intensity values bandwidth. Meanwhile, AULP-W considers the loss burstiness between 0 and 255, the PSNR is defined as for congestive packet losses as well as erroneous packet losses. FULP-W outperforms FULP-T because the TFPC can not differentiate these two kinds of packet losses so that it PSNR = 20 log10 (255 / RMSE ) , (14) unnecessarily reduces its sending rate when the packet loss is due to the errors in wireless channel. However, the gain of where RMSE stands for root mean squared error. Given an AULP-W over FULP-W is not so significant when the FER of original N×M image f and the processed image f’, the RMSE the wireless channel is 0.2. In this case, the ARQ protocol in can be calculated as the wireless channel can effectively correct most wireless channel errors and the streaming protocol maintains a reasonable sending rate so that the maximum bandwidth utility 1 N −1 M −1 is obtained while the congestive packet loss ratio is minimized. RMSE = N ×M ∑∑[ f ( x, y) − f ' ( x, y)] 2 . (15) We can see that in this case the overall packet loss ratio is less x =0 y =0 than 4%. Under such a low packet loss ratio, we can't expect much gain of AULP-W. 0-7803-8356-7/04/$20.00 (C) 2004 IEEE IEEE INFOCOM 2004
  9. 9. Table I depicts the comparison results of the average PSNR Figure 9 shows the reconstructed frames of Foreman for the whole sequence using those three tested schemes. It can sequence using AULP-W, FULP-W and FULP-T. From top to be seen that the gain of the average PSNR of AULP-W over bottom, the left column is the reconstructed 26th frame using FULP-W is about 1.0dB when FER is 0.3 and 0.4dB when AULP-W, FULP-W and FULP-T, respectively, while the right FER is 0.2. The gain of AULP-W over FULP-T is about 2.4dB column is the reconstructed 34th frame using the three schemes. in both cases. It can be seen from Figure 7 to Figure 9 that our proposed network-adaptive loss pattern differentiated bit allocation TABLE I. PSNR COMPARISON RESULTS FOR FOREMAN USING DIFFERENT scheme obtains better results than the other two schemes in PROTECTION SCHEMES UNDER DIFFERENT FER wireless Internet both subjectively and objectively. FER=0.3 320kbps 480kbps 640kbps 800kbps 960kbps AULP-W 28.684163 29.958773 30.874298 31.581641 32.258838 FULP-W 28.58115 29.163827 29.552004 30.293835 30.780962 FULP-T 28.018016 28.131335 28.538054 27.98044 28.30042 FER=0.2 320kbps 480kbps 640kbps 800kbps 960kbps AULP-W 29.896785 31.539206 32.537112 33.223339 33.857063 FULP-W 29.713924 31.214287 32.160342 32.59875 33.322332 FULP-T 28.997481 29.563558 29.88481 30.272214 30.562365 38 36 34 32 30 PSNR 28 26 FULP-W 24 22 FULP-T 20 AULP-W 18 0 20 40 60 80 100 Frame Number (a) FER = 0.3 42 40 FULP-W 38 FULP-T 36 AULP-W PSNR 34 32 30 (a) FER = 0.3 (b) FER = 0.2 28 Figure 9. Comparison of the reconstructed frames under different FERs. The 26 images in the top, middle and bottom row are reconstructed by AULP-W, 24 FULP-W, and FULP-T, respectively. 0 20 40 60 80 100 Frame Number VI. CONCLUSIONS (b) FER = 0.2 In this paper, we address the issue of how to effectively and Figure 8. The PSNR comparisons for Foreman using different tested robustly streaming scalable video over mobile wireless Internet. schemes at 720kbps. We first present the end-to-end streaming protocol for video streaming over wireless Internet from network QoS adaptation Figure 8 shows the comparison results of PSNR for the test point of view to achieve good performance in both wireless and sequence Foreman at 720kbps available bandwidth using Internet connections. Then an R-D based bit allocation scheme different schemes. We can see that, along the whole sequence, is proposed from media adaptation point of view to obtain good the AULP-W scheme outperforms the other schemes in most perceptual video quality. The network adaptation and media cases. Because we protect the lower layers stronger according adaptation together with a scalable video codec are integrated to the packet loss patterns as well as the packet loss ratios, the into a streaming architecture for effective and robust video lower layers are less likely to be corrupted during transmission. streaming over wireless Internet. The main contributions of this While for the other schemes, lower layer may be lost when the paper are summarized as follows. loss burst length is longer under the same packet loss ratio. It may result in very poor quality and sometimes lead to decoder (1) The streaming protocol, WMSTFP, which is friendly to crash. TCP in wired line IP network, and can achieve higher throughput than TCP in wireless network, is an important 0-7803-8356-7/04/$20.00 (C) 2004 IEEE IEEE INFOCOM 2004
  10. 10. module in the end-to-end streaming architecture. Since [10] J. Liu, I. Matta, and M. Crovella, “End-to-end inference of loss nature in WMSTFP can effectively differentiate congestive packet losses a hybrid wired/wireless environment,” Boston Univ., Boston, MA, Comput. Sci. Dept., Tech. Rep., Mar. 2002. from erroneous packet losses in an end-to-end manner and can [11] S.Biaz and N.Vaidya, “Discriminating congestion losses from wireless suppress the spikes in the round-trip-time values due to the losses using inter-arrival times at the receiver,” in Proc. IEEE highly varying wireless environment, it can be efficiently Symposium on Application-Specific Systems and Software Engr. And adopted in wireless video streaming. Techn., Richardson, TX, Mar. 1999, pp. 10-17. [12] S. Floyd, M. Handley, J. Padhye, and J. Widmer. Equation based (2) A loss differentiated R-D based bit allocation scheme is congestion control for unicast applications. [Online]. Available: further proposed by applying the network-adaptive ULP scheme. It can periodically distribute the available bits between [13] J. Widmer, R. Denda, and M. Mauve, “A survey on TCP-Friendly the source bits and the channel protection bits according to congestion control,” IEEE Mag. Network, vol. 15, pp. 28-37, May/June dynamically estimated available network bandwidth and 2001. different packet loss patterns in wireless and wired line [14] F. Yang, Q. Zhang, W. Zhu and Y.-Q. Zhang, “An End-to-End TCP- network. The expected end-to-end distortion can be minimized Friendly Streaming Protocol for Multimedia over Wireless Internet”, in Proc. IEEE International Conference on Multimedia and Expo (ICME), using our proposed loss pattern differentiated bit allocation Baltimore, July 6-9, 2003, pp. II429-II432. scheme. [15] W. Li, “Overview of fine granularity scalability in MPEG-4 video The simulation results demonstrate that our proposed standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, pp. 301- 317, Mar. 2001. scheme can achieve better results than the one without proper [16] T. Zhang and Y. Xu, “Unequal packet loss protection for layered video network adaptation and than the scheme which doesn't consider transmission,” IEEE Trans. Broadcasting, vol. 45, pp. 243-252, June the different packed loss patterns from different types of 1999. networks. [17] J. Goshi, R. E. Ladner, A. E. Mohr, E. A. Riskin, and A. Lippman, “Unequal loss protection for H.263 compressed video,” in Proc. Data ACKNOWLEDGMENT Compression Conference (DCC), Snowbird, Utah, Mar. 2003, pp. 73-82. [18] G. Cheung and A. Zakhor, “Bit allocation for joint-source channel We would like to thank Dr. Feng Wu in Internet Media coding of scalable video,” IEEE Trans. Image Processing, vol. 9, group at Microsoft Research Asia for providing the source code pp.340-356, Mar. 2000. of PFGS for the simulations. [19] U. Horn, B. Girod, and B. Belzer, “Scalable video coding for multimedia applications and robust transmission over wireless channels,” in Proc. 7th Intl. Workshop on Packet Video, Brisbane, Mar. 1996, pp. 43-48. REFERENCES [20] J. Hagenauer, T. Stockhammer, C. Weiss, and A. Donner, “Progressive [1] M. Civanlar, A. Luthra, S. Wenger, and W. Zhu, “Introduction to the source coding combined with regressive channel coding for varying special issue on streaming video,” IEEE Trans. Circuits Syst. Video channels,” in Proc. 3rd ITG Conference on Source and Channel Coding, Technol., vol. 11, pp. 265-268, Mar. 2001. Munich, Jan. 2000, pp. 123-130. [2] C. Chen, R. Lagenduk, A. Reibman, and W. Zhu, “Introduction to the [21] G. Wang, Q. Zhang, W. Zhu, and Y.-Q. Zhang, “Channel-adaptive special issue on wireless communication,” IEEE Trans. Circuits Syst. unequal error protection for scalable video transmission over wireless Video Technol., vol. 12, pp. 357-359, June 2002. channel,” in Proc. SPIE Visual Communications and Image Processing [3] Bakre and B. Badrinath, “I-TCP: indirect TCP for mobile hosts,” in (VCIP), San Jose, CA, Jan. 2001, pp. 648-655. Proc. 15th International Conference on Distributed Computing Systems [22] W. Jiang and H. Schulzrinne, “Modeling of packet loss and delay and (ICDCS), Vancouver, May 30 - June 2 1995, pp. 136-143. their effect on real-time multimedia service quality,” in Proc. 10th Intl. [4] K. Ratnam and I. Matta, “WTCP: an efficient mechanism for improving Workshop on Network and Operating Systems Support for Digital Audio TCP performance over wireless links,” in Proc. 3rd IEEE Symposium on and Video (NOSSDAV), Chapel Hill, North Carolina, June 2000. Computer and Communications (ISCC '98), Athens, Greece, June 1998, [23] F. Wu, S. Li, and Y-Q. Zhang, “A framework for efficient progressive pp. 74-78. fine granularity scalable video coding,” IEEE Trans. Circuits Syst. Video [5] H. Balakrishnan, V. Padmanabhan, S. Seshan, and R. Katz, “A Technol., vol. 11, pp. 332-344, Mar. 2001. comparison of mechanisms for improving TCP performance over [24] Q. Zhang, W. Zhu, and Y.-Q. Zhang, "Resource Allocation for wireless links,” IEEE/ACM Trans. Networking, vol. 6, pp. 756-769, Dec. Multimedia Streaming Over the Internet", IEEE Tran. On Multimedia, 1997. Vol. 3(3), pp. 339-355, Sept. 2001. [6] G. Cheung and T. Yoshimura, “Streaming agent: a network proxy for [25] 3GPP Technical Specification 25.322 v5.2.0, “Radio Link Control media streaming in 3G wireless networks,” in Proc. IEEE Packet Video (RLC) protocol specification,” Sept. 2002. Workshop, Pittsburgh, April 2002. [26] R. Koodli and R. Ravikanth, “One-way Loss Pattern Sample Metrics,” [7] Y. Pei and J. Modestino, “Robust packet video transmission over RFC 3357, Aug. 2002. heterogeneous wired-to-wireless IP networks using ALF together with [27] H. Sanneck and G. Carle, “A framework model for packet loss metrics edge proxies,” in Proc. European Wireless Networks, Florence, Italy, based on loss runlengths,” in Proc. SPIE/ACM Multimedia Computing Feb. 2002. and Networking (MMCN), San Jose, CA, Jan. 2000, pp. 177-187. [8] S. Cen, P. Cosman, and G. Voelker, “End-to-end differentiation of [28] J. Padhye, V. Firoiu, D. Towsley and J. Kurose, “Modeling TCP congestion and wireless loss,” in Proc. ACM Multimedia Computing and throughput: a simple model and its empirical validation,” ACM/IEEE Networking, San Jose, CA, Jan 2002, pp.1-15. Trans. Networking, vol. 8, pp. 133-145, Apr. 2000. [9] D. Barman and I. Matta, “Effectiveness of loss labeling in improving [29] P. Frossard, “FEC Performance in Multimedia Streaming,” IEEE TCP performance in wired/wireless Network,” in Proc. 10 IEEE Intl. Communications Letters, vol. 5, pp. 122-124, Mar. 2001. Conf. Network Protocols (ICNP), Paris, France, Nov. 2002, pp. 2-11. [30] Network simulator 2 (Ns-2) [Online]. Available: 0-7803-8356-7/04/$20.00 (C) 2004 IEEE IEEE INFOCOM 2004