Flexible Access to Video Streaming
Yongdong Wu and Feng Bao
Institute for Infocomm Research
21, Heng Mui Keng Terrace, Singapore, 119613
0-7803-8521-7/04/$20.00 (C) 2004 IEEE

Abstract— This paper presents a flexible access method for secure multicast streaming. In this scheme, a server segments a video into clips and prepares the protected video by selectively encrypting the video clips off line. Proxies store the protected videos and distribute them as Cable-TV programs. To decrypt a protected video or clip, a client sends the server a request for the decryption keys. The server then generates an enabling block which enables the client to access the video or clips for a given period. Therefore, the present scheme enables the client to access a clip or video for a restricted number of dissemination times, while the server selectively encrypts the video only once and the distributor performs no encryption at all.

I. INTRODUCTION

Media streaming is becoming more and more popular due to the explosive growth of the Internet and of multimedia processing technologies and applications. One hot topic in multimedia dissemination is how to prevent data theft. For example, in a pay-per-view business, which assumes that the subscriber cannot save the decrypted content, the service provider broadcasts or multicasts the video in encrypted form, and no recipient can decrypt the video without a valid decryption key. A naïve solution is for the server or proxy to unicast the encrypted media to each user. However, because network bandwidth and server resources are limited, networks cannot support many simultaneous unicast connections, nor can servers or proxies encrypt an arbitrary number of videos in real time.

To reduce the computational overhead, Shi et al. designed a fast MPEG video encryption algorithm that encrypts the sign bits of a fraction of the DCT coefficients to produce encrypted DCT coefficients. This scheme decreases the compression performance and thus increases the network traffic. Zeng et al. presented a joint encryption and compression framework in which video data are scrambled efficiently in the frequency domain by selectively encrypting the transform coefficients and motion vectors.

Griwodz et al. proposed to protect a fraction of the compressed video data directly, rather than the DCT coefficients, so as to preserve the compression rate. The protected video is stored in the proxies and multicast periodically. After receiving a request from a client, the server encrypts a selected fraction of codes using a personal key of the client and unicasts the encrypted codes to the client. However, the server has to encrypt the selected portion of the video on the fly and stay involved in the streaming all the time. In addition, the server must stay synchronized with the client until the end of the transmission session.

Hartung et al. provided an approach to watermark compressed or raw video and encrypt it. Any user can decrypt the content directed to him. The watermarking is performed in different portions of an image so as to trace the owner efficiently. In their experiments, watermarking an MPEG-2 bitstream typically requires altering 15-30% of the DCT coefficients. However, the approach of Hartung et al. does not allow multiple accesses to the content unless the content is stored at the consumer side.

Perkins et al. proposed a scheme that allows the content to be accessed a predefined number of times. In the scheme, a trusted player must be installed so that the number of licenses can be decreased gradually. To defeat license backup, an on-line license authority or registrar must be deployed. The scheme also incurs additional computational cost and storage.

The present streaming scheme partitions a video into clips based on the video's utilization characteristics, and each clip is further divided into fragments. One fraction of each fragment is encrypted twice with two independent keys, and each cipher-text is merged with the unprotected part of the fragment to produce two protected fragments. All the protected fragments are sent to the proxies and cached there. At the transmission stage, one of the two protected fragments is selected and multicast to the clients. In order to play a video, a client must send a request to the server. The server prepares an enabling block with the client's public key according to the client's request. After recovering the keys for the requested clip(s) from the enabling block, the client receives the protected video. If the received video data is encrypted, the client decrypts the data and assembles the decrypted data with the other original data to form a decodable video stream. The present scheme provides the following properties: short response time, efficient network usage, lightweight computation, and a flexible access mechanism.

The rest of this paper is organized as follows. Section II elaborates the present secure streaming scheme. Section III discusses the performance in terms of computational cost, storage consumption, etc. A conclusion is drawn in Section IV.

II. FLEXIBLE ACCESS TO VIDEO STREAMING

In video distribution applications, providers wish to offer a flexible access mechanism for business profit. In a pay-per-view business, at least two factors relate to the charge rate: the number of clips and the access time. Clip control manages which clip a client is allowed to render, while time control manages when a client is allowed to render a
clip/video. To cater to these requirements, the present scheme enables each client to render either one clip or the whole video within a restricted time period according to a key hash chain. Figure 1 illustrates the building blocks: segmenting the video, generating the protected video, distributing the protected video, generating the enabling block, forwarding the enabling block, recovering the video keys, recovering the video, and rendering the video.

[Figure 1: Server (segmenting video, generating protected video, generating enabling block); Proxy (protected video); Client (recovering video keys, recovering video, rendering video).]
Fig. 1. Fundamental building blocks. In the Content Distribution Network, a proxy may be an edge server.

A. Segmenting Videos at the Server

Generally speaking, a video can be segmented into clips arbitrarily. However, it is effective to segment a video based on its utilization characteristics. With respect to Figure 2, the experiment demonstrates that only 55% of all the requestors play the entire video, and most stoppages occur during the first 5% of the movie playback period. This pattern of short initial viewing suggests distributing the first several minutes frequently. Thus, a video is segmented into three clips: the beginning data, the middle body, and the end data. This segmentation method enables the proxy to distribute the clips at different rates so as to provide a short response time on average.

B. Producing Protected Videos at the Server

In order to protect a video from being played illegally, the video should be encrypted with secret keys. To this end, a server selects a random number $r$ as a master key and generates clip keys to encrypt clips. Specifically, the process for producing protected videos is as follows.
(a) Divide clip $i$ into fragments $F_{ij} = V_{ij} \| S_{ij}$ ($j = 0, 1, \cdots$) based on some rules (e.g., $V_{ij}$ is the frame header and $S_{ij}$ is the media data), where $x \| y$ is the concatenation of strings $x$ and $y$.
(b) For each clip $i$, create the first key $K1_i = H(r \| 1 \| i)$, where $H(\cdot)$ is a one-way function.
(c) Encrypt $V_{ij}$ with key $K1_i$ as $V1_{ij} = E(K1_i, V_{ij})$, where $E(\cdot)$ is an encryption function. The protected fragment is $\tilde{F}_{ij} = V1_{ij} \| S_{ij}$.
(d) All the protected fragments $\tilde{F}_{ij}$ generated in steps (a)-(c) constitute the protected clip $i$, and all the protected clips form a protected video V1.
(e) For each clip $i$, create a second key $K2_i = H(r \| 2 \| i)$.
(f) Encrypt $V_{ij}$ with key $K2_i$ as $V2_{ij} = E(K2_i, V_{ij})$. The protected fragment is $\hat{F}_{ij} = V2_{ij} \| S_{ij}$.
(g) All the protected fragments $\hat{F}_{ij}$ generated in steps (e)-(f) constitute a second protected clip $i$, and all the protected clips form a protected video V2.
(h) $V1_{ij}$, $V2_{ij}$ and $S_{ij}$ ($i, j = 0, 1, \cdots$) are sent to the proxies.

In fact, videos V1 and V2 do not physically exist in the proxies as separate copies because they share the un-encrypted data $S_{ij}$. Figure 3 illustrates the data structure of protected videos stored in a proxy.
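Steps (a)-(h) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for the one-way function H, a hash-based XOR keystream stands in for the encryption function E (it is not a secure cipher, and a real deployment would use a standard cipher with a per-fragment IV), and the byte encoding of r || 1 || i as well as the helper names clip_key and protect_fragment are hypothetical choices, not taken from the paper.

```python
import hashlib


def H(data: bytes) -> bytes:
    """One-way function H; SHA-256 is an assumed stand-in."""
    return hashlib.sha256(data).digest()


def clip_key(r: bytes, which: int, clip: int) -> bytes:
    """Clip key K1_i or K2_i = H(r || which || i); the encoding is assumed."""
    return H(r + bytes([which]) + clip.to_bytes(4, "big"))


def E(key: bytes, data: bytes) -> bytes:
    """Toy XOR keystream cipher standing in for E; NOT secure.

    XOR is its own inverse, so the same call also decrypts.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += H(key + counter.to_bytes(4, "big"))
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def protect_fragment(r: bytes, clip: int, v_ij: bytes, s_ij: bytes):
    """Return (V1_ij, V2_ij, S_ij): V_ij encrypted under both clip keys,
    plus the shared unencrypted part, as stored at the proxy."""
    v1 = E(clip_key(r, 1, clip), v_ij)
    v2 = E(clip_key(r, 2, clip), v_ij)
    return v1, v2, s_ij


# Example: protect one fragment of clip 0, then recover it with K1_0.
r = b"example master key"
v1, v2, s = protect_fragment(r, 0, b"frame header", b"media data")
recovered = E(clip_key(r, 1, 0), v1) + s
```

Because the unencrypted part is returned only once, the sketch mirrors the storage layout described above: the proxy keeps both cipher-texts and a single shared copy of the unencrypted data rather than two full copies of the video.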
Fig. 2. Partial playback. Only 55% of all the requestors play the entire video, and most stoppages occur during the first 5% of the movie playback period.

Fig. 3. Static structure of protected videos. The dark blocks (e.g., $V1_{00}$, $V2_{00}$) indicate the encrypted fraction of a fragment, and the gray block (e.g., $S_{00}$) is the unencrypted fraction shared by two fragments.

C. Distributing Protected Videos at the Proxy

One objective of the present scheme is to provide a flexible access method based on the dissemination time. Hereinafter, time is measured with the video dissemination counter, which represents how many times the video has been broadcast. To this end, the server and the proxies share a key $v$ used to generate the chain $K_t = H^{T-t}(v) = H^{T-t-1}(H(v))$, where $t$ is
the counter representing the dissemination time, and $T$ is the maximum distribution time. In practice, if $t > T$, the video is free. Given the current dissemination time $t$, the random seed $K_t$ is used to produce a random bit sequence. If a random bit is 0, a fragment in video V1 is delivered; otherwise, a fragment in V2 is sent out.

Figure 4 summarizes the whole process of generating transmitted fragments at the server and proxy. In the process, the clip keys ($K1_i$ and $K2_i$) are derived from the master key $r$, then the keys are used to encrypt a portion of a fragment (i.e., $V_{ij}$) to produce two cipher-texts. Subsequently, one of the two cipher-texts is selected by a multiplexer controlled by another key $v$ and the dissemination counter $t$. The selected cipher-text is assembled with the un-encrypted $S_{ij}$ to form a protected fragment to be multicast.

[Figure 4: $r$ yields $K1_i = H(r \| 1 \| i)$ and $K2_i = H(r \| 2 \| i)$; $V_{ij}$ is encrypted to $V1_{ij}$ and $V2_{ij}$; one of them is selected by ($v$, $t$) and combined with $S_{ij}$ into the fragment.]
Fig. 4. Distributing different videos without real-time encryption by selecting one of two encrypted clips.

D. Generating Enabling Blocks at the Server

Because the video stream is encrypted, no one can correctly render the video without the decryption key. If client $j$ sends the server a video request including the clip/video of interest and the dissemination period, the server generates an enabling block with the client's public key $e$ and sends the block to the client. Table I lists the enabling blocks corresponding to the requests, where $v_{t_0} = H^{T-t_0}(v)$ and $t_0$ is a counter which represents the access expiry granted to the client.

TABLE I
THE REQUEST AND CORRESPONDING RESPONSE, WHERE $E(e, x)$ IS AN ENCRYPTION OF $x$.
Request    Enabling block
Clip $i$   $E(e, K1_i)$, $E(e, K2_i)$, $E(e, v_{t_0})$
Video      $E(e, r)$, $E(e, v_{t_0})$

E. Recovering Video Keys at the Client

Upon receiving the enabling block, client $j$ recovers the video keys $K1_i$, $K2_i$ and $K_t$ in one of the following ways, according to the received enabling block.
• For an enabling block including $E(e, K1_i)$, $E(e, K2_i)$ and $E(e, v_{t_0})$, the client decrypts the enabling block to $K1_i$, $K2_i$ and $v_{t_0}$ with his private key. Subsequently, the client computes $K_t = H^{t_0-t}(v_{t_0})$.
• For an enabling block including $E(e, r)$ and $E(e, v_{t_0})$, the client decrypts the enabling block to $r$ and $v_{t_0}$ with his private key. Subsequently, he calculates the keys as follows. First, he calculates the keys $K1_i = H(r \| 1 \| i)$ and $K2_i = H(r \| 2 \| i)$ for the $i$th clip of the protected videos ($i = 0, 1, \cdots$). Meanwhile, the client computes $K_t = H^{t_0-t}(v_{t_0})$.

F. Recovering and Rendering Videos at the Client

After recovering the keys $K1_i$ and $K2_i$ for clip $i$, the client starts to receive the protected fragments. He calculates the bit sequence with key $K_t$ so as to determine which clip key is used to decrypt each fragment. For each fragment of clip $i$, if the sequence bit is 0, key $K1_i$ is selected to decrypt the fragment; otherwise, $K2_i$ is used. After the encrypted fraction of a fragment is decrypted correctly, the clear-text is assembled with the un-encrypted fragment data to form a complete original fragment. The recovered video fragment can be decoded with the dedicated player.

III. EXPERIMENTS AND PERFORMANCE ANALYSIS

A. Experiments

In the simulation, Figure 6 is generated by encrypting Figure 5, while Figure 7 is generated when only picture 2 is granted. Clearly, the video is protected properly.

Fig. 5. Original video including 3 clips. Each clip has only one picture (or fragment).

Fig. 6. The protected video whose clips are encrypted. Thus, the pictures are noise-like.

Fig. 7. Picture 1 remains unchanged; picture 2 and picture 3 are decrypted with the key for picture 2. Accordingly, only picture 2 is recovered correctly.

B. Security

According to the one-wayness of the hash chain, users have no way to generate a valid sequence with their outdated keys. Meanwhile, no matter how many traitors collude, they cannot generate a new sequence with their keys. However, if a client requests the whole video once and then requests
one clip at the last dissemination time, he can always access the whole video because he knows $r$ and $v_T$. In a pay-per-view business, this weakness may not be serious.

Another threat comes from the random assembly of the fragments. On average, if we assume that the time-key sequence is random, an old legal user can always decrypt 50% of the fragments with one old time-key. Given $t$ time-keys, the attacker can correctly decrypt a given fragment with probability $1 - 2^{-t}$. In reality, people will not be content with a degraded-quality movie that they have viewed before. Furthermore, the attacker has to find a way to tell the correct frame from the decrypted frames. Of course, a human can do it easily by checking the frames one by one, but this brute-force method may not meet the real-time requirement in streaming. On the other hand, some frames which are decrypted wrongly may propagate the error due to the correlation between the frames. For example, a mistake in an I-frame may propagate to all the related P-frames and B-frames in MPEG-2 streaming. Thus, the proposed scheme can meet the requirement of content distribution at a low price.

C. Comparison

Table II lists the performance of video streaming schemes in terms of access control dimension, security, storage efficiency, and computational overhead. Only the present solution provides 2-dimensional access control. That is to say, all the other solutions provide the function of enabling/disabling access to a video, but our solution refines the user's requirements into clips and access time. The scheme in  requires unicast channels, which provide fast response at the price of inefficient network usage, while the others reduce the network burden by using multicast channels. Considering storage efficiency, our solution requires additional storage (approx. 1% of the video), and Shi's solution increases the storage space because the compression rate is decreased. In view of the computational cost, only our solution requires little real-time computation, but the solution  requires intensive computational resources. Additionally, the video proxy scheme given in paper  is vulnerable to collusion attack.

TABLE II
PERFORMANCE COMPARISON
            Griwodz     Shi         Yeung     Present
Dimension   1D          1D          1D        2D
Secure      Yes         Yes         No        Yes
Network     partially   multicast   unicast   multicast
Data        No          Yes         No        Yes
Real-time   Fast        Fast        Slow      Fastest

IV. CONCLUSION

In the present secure streaming scheme, a video is partitioned into clips. Each clip is further partitioned into fragments. Next, each fragment is partially encrypted with two keys to produce two protected fragments. All of the protected fragments are cached in the un-trusted proxies for multicasting to the clients. Our solution also provides flexible access control based on the request for video clip and time. Because the proxy does not encrypt the video on the fly, but selects one of two encrypted clips (equivalently, selects one key from $K1_i$ or $K2_i$ based on the access time key $K_t$), the computational cost is small.

REFERENCES

Z. Morley Mao, David Johnson, Oliver Spatscheck, Jacobus E. van der Merwe, and Jia Wang, “Efficient and Robust Streaming Provisioning in VPNs,” WWW2003, pp.118-127 4.
Dimitris Thanos, “COiN-Video: A Model for the Dissemination of Copyrighted Video Streams Over Open Networks,” 4th Information Hiding Workshop, 2001.
Y. Chae, K. Guo, M. Buddhikot, S. Suri, and E. Zegura, “Silo, Rainbow, and Caching Token: Schemes for Scalable Fault Tolerant Stream Caching,” IEEE Journal on Selected Areas in Communications,
Siu F. Yeung, John C. S. Lui, and David K. Y. Yau, “A Case for a Multi-Key Secure Video Proxy: Theory, Design, and Implementation,” 10th ACM Multimedia’02, pp. 392-401, 2002.
Changgui Shi and Bharat Bhargava, “A Fast MPEG Video Encryption Algorithm,” 6th ACM Multimedia, pp. 81-88, 1998.
Changgui Shi, Sheng Yih Wang, and Bharat Bhargava, “MPEG Video Encryption in Real-time Using Secret Key Cryptography,” http://
Wenjun Zeng and Shawmin Lei, “Efficient Frequency Domain Selective Scrambling of Digital Video,” IEEE Transactions on Multimedia, 5(1):118-129, 2003.
G. Perkins and P. Bhattacharya, “An Encryption Scheme for Limited K-time Access to Digital Media,” IEEE Transactions on Consumer Electronics, 49(1):171-176, 2003.
C. Griwodz, O. Merkel, J. Dittmann, and R. Steinmetz, “Protecting VoD the Easier Way,” ACM Multimedia ’98, pp. 21-28, 1998.
Frank Hartung and Bernd Girod, “Watermarking of Uncompressed and Compressed Video,” Signal Processing, 66(3):283-301, 1998.
K. Almeroth and M. Ammar, “Multicast group behavior in the Internet's multicast backbone (MBone),” IEEE Communications Magazine, 35(6):124-129, 1997.
S. Acharya and B. Smith, “An Experiment to Characterize Videos Stored on the Web,” ACM/SPIE Multimedia Computing and Networking
M. Chesire, A. Wolman, G. Voelker, and H. Levy, “Measurement and Analysis of a Streaming Media Workload,” USENIX Symposium on Internet Technologies and Systems (USITS), 2001.