BI-LEVEL/FULL-COLOR VIDEO COMBINATION FOR UBIQUITOUS VIDEO COMMUNICATION

Keman Yu, Jiang Li and Shipeng Li
Microsoft Research, Asia
{i-kmyu, jiangli, spli}

ABSTRACT

Ubiquitous video communication, that is, visually calling anyone, anywhere, at any time and on any device, might be the ultimate goal of modern communication services. As MPEG and H.263 based full-color videos are unable to cover low-bandwidth network conditions, we develop a bi-level/full-color video combination scheme to realize video communication over the whole range of bandwidth conditions. In our scheme, bi-level video works in the bandwidth range below 56 Kbps and full-color video works in the bandwidth range above 33.6 Kbps. The overlapping bandwidth range of 33.6-56 Kbps is used to switch smoothly between bi-level and full-color videos. A bandwidth estimation algorithm is applied in the normal bi-level or full-color video status for rate control. A special bandwidth capability probing algorithm is designed to determine the switch between bi-level and full-color videos. Experiments showed that this scheme worked well on video communication systems that ran on PCs, handheld PCs and palm-size PCs.

1. INTRODUCTION

Communication is one of the most frequently used services in the world. Although phone communication is already very popular, video communication is still not so common. The causes lie in four aspects: first, video communication requires additional devices: at least a video camera at the sending side and a display at the receiving side; second, conventional video compression technologies such as MPEG [1] and H.263 [2][3] are still not satisfactory for producing a video stream in the currently popular bandwidth range, e.g. 33.6 Kbps or below; third, it is still inconvenient to implement video communication on small portable devices because of their computational power, battery and network connection factors; fourth, some psychological factors prevent video communication from being widely applied.

Now the situation is improving. Video cameras are becoming common accessories of a computer, and more and more portable devices such as handheld PCs, palm-size PCs and mobile phones are emerging. Wireless network connections are being deployed rapidly. The problems converge to a core challenge: how to develop a video codec that works over the whole range of network bandwidths on most portable devices?

An initial thought is to use MPEG or H.263. However, in low-bandwidth conditions, the resultant images of DCT-based coding, including MPEG and H.263, usually look like a collection of color blocks, and the motions in scenes are discontinuous. In DCT-based coding, low spatial frequency values that represent the "basic colors" of the blocks possess high priority. However, in some cases, the outline features of scenes may be more important than the basic colors of blocks. For example, in video communication, facial expression represented by the motions of the outlines of a face, eyes, brows and a mouth delivers more information than the basic colors of the face. These outlines can be represented using only two colors. This inspired us to develop a video form in which each pixel is represented by only 1 bit. We call it bi-level video [4][5]. In low-bandwidth conditions, bi-level video possesses clearer shape, smoother motion, shorter initial latency and much cheaper computational cost than DCT-based video coding. However, the visual quality of a bi-level video is not as good as that of a gray-scale or full-color video.

In view of the advantages and limitations of both full-color and bi-level videos, we consider that, in order to cover the whole range of bandwidths, it may be possible to combine the two types of video codec, with each one working in its suitable bandwidth condition. If this is applicable, new questions emerge: how to realize bandwidth adaptation for both bi-level and full-color videos, and how to switch smoothly between bi-level and full-color videos when the bandwidth changes.

In Section 2, we describe a bandwidth estimation algorithm for bi-level video transmission. Section 3 is devoted to a specially designed bandwidth probing algorithm and a bi-level/full-color video switch scheme. Experimental results of bandwidth estimation and bi-level/full-color video switch are shown in Section 4. Conclusions and future directions are given in Section 5.

0-7803-7448-7/02/$17.00 ©2002 IEEE II - 245
2. BANDWIDTH ESTIMATION FOR BI-LEVEL VIDEO TRANSMISSION

In order to deliver the best video quality at a given bandwidth, we must estimate the amount of bandwidth actually available in the network. Considering the real-time characteristic of our application, among various bandwidth estimation and network congestion control approaches [6], we adopt the receiver feedback approach [7][8]. In this approach, reports fed back by the receiver at a specific time interval contain the number of lost packets and timestamps. After obtaining the receiver's reports, the sender estimates the state of the network and makes adjustment decisions accordingly by performing the following steps:

Feedback analysis: compute the statistics of packet loss and round-trip time (RTT).
Network state estimation with loss and delay: determine the actual network state: unloaded, loaded or congested.
Bit rate adjustment: adjust the allowed bit rate of the application according to the network state.

Figure 1: Network state estimation. [Diagram: packet loss (%) is classified as unloaded below Lb, loaded between Lb and Lt, and congested above Lt; RTT delay (ms) is classified in the same way with thresholds Rb and Rt.]

As shown in Fig. 1, the lower threshold Lb of packet loss should be set so that data transmission may suffer from packet loss but is still acceptable, and the upper threshold Lt should be chosen to indicate congestion when the damage to video quality resulting from packet loss is severe. Similarly, the upper threshold Rt should be set to the maximum value at which the delay of video is not evidently perceived. On the other hand, the loaded zone should be large enough, i.e. the lower threshold Rb should be set low enough, to avoid oscillations. Suitable values for our video source are Lt = 4%, Lb = 2%, Rt = 1200 ms and Rb = 700 ms.

After estimating the network state, we simply map the congested, loaded and unloaded states to decrease, hold and increase decisions respectively. In view of the characteristics of real-time communication, the source should reduce its throughput rapidly in the case of congestion, and additive increase should be adopted to probe the available bandwidth in the unloaded case. Therefore, we use a multiplicative factor γ to reduce the allowed bit rate and a value λ to increase the allowed bit rate. In our video communication experiments, γ = 0.8 and λ = 2 Kbps are suitable values.

3. BI-LEVEL/FULL-COLOR VIDEO SWITCH

Full-color video possesses higher quality but costs a high bit rate, while bi-level video costs a lower bit rate but possesses lower quality. An ideal solution is to combine bi-level and full-color videos so that each serves in its suitable bandwidth range. Before we describe the bi-level/full-color switch scheme, we first introduce how the rate control scheme of bi-level video was designed [4]. The rate control scheme of bi-level video is realized using two factors: (1) the threshold of the difference between corresponding pixel regions in two successive frames, called the dissimilarity threshold, and (2) the width of the threshold band. The higher the dissimilarity threshold is, the more pixels are viewed as being similar to the corresponding pixels in the previous frame, and therefore the lower the bit rate of the generated bit stream is. The wider the threshold band is, the more pixels are coded according to the predicted probability, and therefore the lower the bit rate of the generated bit stream is. If adjusting these two factors is not sufficient to control the generated bit rate, frame dropping is finally employed.

There are two major differences between the rate control schemes of bi-level video coding and DCT-based full-color video coding. The first is that in DCT-based coding, the quantization parameter can be calculated according to an encoder rate-distortion function, but in bi-level video coding no such distortion function exists. The only way is to increase or decrease the combination of the above two factors. The second is that in DCT-based coding, both overflow and underflow need to be prevented, but in bi-level video, underflow is inevitable and is therefore allowed. The consequence of this feature is that the generated bit rate of a bi-level video may not be as high as the target bandwidth. This is why we need a specially designed bandwidth capability probing scheme here.

The bandwidth capability probing scheme is developed based on the bandwidth estimation algorithm described in Section 2. Usually, the bit rate of a bi-level video is much smaller than that of a full-color video. There is a gap between the bit rate of a bi-level video and the switch threshold of a full-color video. Since the bandwidth estimation scheme only tells us which state the current network is in, not how much additional bandwidth the network possesses, we have to send redundant data to probe.

The network capability probing is performed periodically. The duration of a single probing process is much shorter than the time interval between two successive probing processes, so that normal video communication is not disturbed. The following two equations are used to calculate the allowed bit rates in the decrease and increase cases respectively.

$$B_a^{i+1} = \max\{(B_o^{i} + B_s^{i}) \times \gamma,\ B_{\min}\} \qquad (1)$$
$$B_a^{i+1} = \min\{B_o^{i} + B_s^{i} + \lambda,\ B_{\max}\} \qquad (2)$$

where $B_a^{i+1}$ is the allowed bit rate the application can use in the next feedback interval, $B_o^{i}$ represents the mean throughput in the time interval just past, and $B_s^{i}$ is the bit rate of redundant data in the current time interval. If the decision is increase, the bit rate of redundant data in the next time interval is computed using Eq. (3),

$$B_s^{i+1} = \min\{B_a^{i} - (B_o^{i} + B_s^{i}),\ S_{\max}\} \qquad (3)$$

otherwise, $B_s^{i+1}$ is set to zero, where $S_{\max}$ is the maximum redundant bit rate.

Figure 2: Switches between bi-level and full-color videos. [Plot of bandwidth against time showing the levels Bmin, Bb, Bt and Bmax, the bi-level and full-color video regions, and the switch point.]

We define a threshold band (Bb, Bt) for the switch between bi-level and full-color videos. As shown in Fig. 2, if the video is initially bi-level and the allowed bit rate Ba increases to the lower end of the threshold band Bb, no switch takes place. If the bandwidth increases further and reaches the higher end of the threshold band Bt, the video is switched to full-color. Conversely, if the bandwidth drops, the switch back does not occur until Ba is lower than Bb. A threshold band is used instead of a single threshold to avoid frequent switching when the available bandwidth is around the switch threshold.

Figure 3: Probing process. [Plot of bandwidth against time showing the allowed bit rates Ba1, Ba2 and Ba3 and the times t1 and t2.]

It should be noted that in the probing process, if a decrease decision is encountered for the first time (see time t1 in Fig. 3), we still use the same allowed bit rate to encode video and send redundant data. If the next decision is increase or hold (see time t2 in Fig. 3), probing goes ahead; otherwise, the probing must be stopped immediately and we cut the allowed bandwidth using Eq. (4).

$$B_a^{i+1} = \min\{B_o^{i},\ (B_o^{i} + B_s^{i}) \times \gamma,\ B_{\min}\} \qquad (4)$$

Using the same bit rate in two successive probing steps prevents the probing process from being disturbed by random changes of network conditions.

Figure 4: Bandwidth estimation for bi-level video. [Plot of bandwidth (Kbps) and RTT (ms) against time (s) from 0 to 180 s; curves: available bandwidth, estimated bandwidth, video bitrate and RTT.]

Figure 5: Bandwidth probing for bi-level/full-color video switch. [Plot of bandwidth (Kbps) and RTT (ms) against time (s) from 0 to 180 s; curves: available bandwidth, video bitrate, actual throughput and RTT; the 33.6 Kbps and 56 Kbps levels are marked.]
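The decision rules of Sections 2 and 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the way loss and RTT are combined (congested if either metric exceeds its upper threshold, unloaded only if both are below their lower thresholds) is an assumption the paper does not spell out, and so is keeping the allowed rate unchanged on a hold decision. The switch band (Bb, Bt) is taken to be 33.6-56 Kbps as stated in the abstract. The redundant-data rules of Eqs. (3) and (4) are left out.

```python
# Thresholds quoted in Section 2 (loss in percent, RTT in milliseconds).
LOSS_LOW, LOSS_HIGH = 2.0, 4.0       # Lb, Lt
RTT_LOW, RTT_HIGH = 700.0, 1200.0    # Rb, Rt
GAMMA = 0.8                          # multiplicative back-off factor
LAMBDA_KBPS = 2.0                    # additive increase step
B_LOW, B_HIGH = 33.6, 56.0           # switch band (Bb, Bt) in Kbps


def classify(loss_pct, rtt_ms):
    """Map one receiver report to a decrease/hold/increase decision.

    Assumption: either metric above its upper threshold means congested;
    both metrics below their lower thresholds means unloaded.
    """
    if loss_pct > LOSS_HIGH or rtt_ms > RTT_HIGH:
        return "decrease"            # congested
    if loss_pct < LOSS_LOW and rtt_ms < RTT_LOW:
        return "increase"            # unloaded
    return "hold"                    # loaded


def next_allowed_rate(decision, b_o, b_s, b_a, b_min, b_max):
    """Allowed bit rate for the next feedback interval (all rates in Kbps).

    b_o: mean throughput of the past interval, b_s: redundant bit rate,
    b_a: current allowed rate, b_min / b_max: bounds on the allowed rate.
    """
    if decision == "decrease":
        return max((b_o + b_s) * GAMMA, b_min)       # Eq. (1)
    if decision == "increase":
        return min(b_o + b_s + LAMBDA_KBPS, b_max)   # Eq. (2)
    return b_a                       # hold (assumed: keep the current rate)


def next_mode(mode, b_a):
    """Hysteresis switch over the threshold band (Bb, Bt)."""
    if mode == "bi-level" and b_a >= B_HIGH:
        return "full-color"          # allowed rate reached the upper end Bt
    if mode == "full-color" and b_a < B_LOW:
        return "bi-level"            # allowed rate fell below the lower end Bb
    return mode                      # inside the band: no switch
```

For example, with a mean throughput of 20 Kbps and 5 Kbps of redundant data, an increase decision raises the allowed rate to 27 Kbps, while an allowed rate of 40 Kbps leaves either video mode unchanged because it lies inside the band.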
4. EXPERIMENTAL RESULTS

To examine the effectiveness of our scheme for bandwidth adaptation and bi-level/full-color video switch, we established a network platform using a network bandwidth emulation tool, Cloud 2.1. The source video is captured in real time in QCIF format at a frame rate of 15 fps.

Figure 4 shows a network environment in which the bandwidth varies from 9.6 Kbps to 24 Kbps. As the available bandwidth increases, the estimated bandwidth and actual throughput increase. Because of the low bit rate characteristic of bi-level video, not all the available bandwidth is consumed. For example, at the 160th second, the available bandwidth is 24 Kbps, but the actual throughput is only around 20 Kbps even though the estimated value is close to the network capacity. If the network capability shrinks, the back-off scheme reduces the throughput rapidly to avoid congestion. The figure shows that for a given bandwidth, our scheme can fit the sending rate to the network capacity.

The next experiment examines the bandwidth probing and switch scheme. As shown in Fig. 5, the available bandwidth is set to 40 Kbps, 64 Kbps and 28.8 Kbps in turn. To illustrate the probing process, we use a shorter probing cycle: 30 seconds. In real scenarios, the probing cycle is usually much longer so that the probing process does not disturb normal communication.

Suppose the system initially works in bi-level video status. The bit rate of bi-level video is approximately equal to the throughput. At the 30th second (marked in Fig. 5), the probing process starts. The total sent bit rate is larger than the actual bit rate of the video since we fill in redundant data. At the next marked point, since the total throughput reaches the network capacity, RTT increases rapidly and exceeds the threshold we specified. The probing stops immediately and no more redundant data is injected into the network. After waiting for a time period, the probing process restarts (see the following marked point). This time the total throughput increases gradually because the available bandwidth is sufficient. The switch takes place at 56 Kbps and the probing process finishes. After running in full-color video for a while, the system suddenly encounters a bandwidth drop to 28.8 Kbps. The system immediately switches to bi-level video because of the large loss rate. Figure 5 shows that our bandwidth capability probing and bi-level/full-color video switch scheme can make our video communication system work well over the whole range of given bandwidths. The PC, Pocket PC and Handheld PC versions of our video communication system can be freely downloaded from

5. CONCLUSIONS

We developed a bi-level/full-color video combination scheme for video communication over the whole range of bandwidth conditions on PCs, handheld PCs and palm-size PCs. In this scheme, bi-level video works in the range below 56 Kbps and full-color video works in the range above 33.6 Kbps. A bandwidth band from 33.6 Kbps to 56 Kbps is used to avoid frequent switching between bi-level and full-color video when the user's available bandwidth is around the switch range. Since the rate control scheme of bi-level video coding is very different from that of DCT-based coding, a special bandwidth estimation algorithm is implemented for the bi-level video status and a bandwidth probing scheme is designed for the bi-level/full-color video switch. Experiments show that in bi-level video status the generated bit rate always fits into the available bandwidth, and as soon as additional bandwidth is available, the video can be automatically switched from bi-level to full-color.

A future direction would be to develop videos that possess higher visual quality than a bi-level video but whose bit rates are still lower than that of a full-color video.

6. REFERENCES

[1] ISO/IEC JTC1/SC29/WG11 N3312, Coding of moving pictures and audio, March 2000/Noordwijkerhout.
[2] ITU-T Recommendation H.261, Video codec for audiovisual services at p x 64 kbit/s, 03/93.
[3] ITU Telecom. Standardization Sector of ITU, "Video coding for low bit rate communication," ITU Recommendation, Version 2, Feb. 1998.
[4] Jiang Li, Gang Chen, Jizheng Xu, Yong Wang, Hanning Zhou, Keman Yu, King To Ng and Heung-Yeung Shum, "Bi-level Video: Video Communication at Very Low Bit Rates," ACM Multimedia Conference 2001, September 30 – October 5, Ottawa, Ontario, Canada, pages 392-400.
[5] Jiang Li, Keman Yu, Gang Chen, Yong Wang, Hanning Zhou, Jizheng Xu, King To Ng, Kaibo Wang, Lijie Wang and Heung-Yeung Shum, "Portrait Video Phone," ACM Multimedia Conference 2001, September 30 – October 5, Ottawa, Ontario, Canada, pages 597-598.
[6] Kostas G. Anagnostakis, "Congestion Control in Packet Switching Internetworks," Dept. of Computer and Information Science, University of Pennsylvania, Apr. 2001.
[7] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A transport protocol for real-time applications," RFC 1889, Jan. 1996.
[8] I. Busse, B. Deffner, and H. Schulzrinne, "Dynamic QoS Control of Multimedia Applications based on RTP," Computer Communications, Jan. 1996.