IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 1, JANUARY 2012 413

The rounding proposed above ensures that the difference between $\max(d_x, d_n, d_w, d_{nw})$ and $\min(d_x, d_n, d_w, d_{nw})$ is at most one graylevel. With a simple rounding such as, for instance, $d = \lfloor (p_b/4) + (1/2) \rfloor$, one may have a difference between $d$ and $p_b - 3d$ of up to three graylevels.

The proposed transform is reversible. Let us suppose that the transformed pixels, i.e., $X$, $N$, $W$, and $NW$, are not subject to overflow or underflow, i.e., for 8-bit images, $0 \le X, N, W, NW \le 255$. At detection, by using (1), one gets the following for the transformed context:

$$\hat{X} = N + W - NW = \hat{x} - d_n - d_w - d_{nw}. \tag{6}$$

The prediction error for the transformed context is $P = X - \hat{X} = x - \hat{x} + d_x + d_n + d_w + d_{nw} = 2p + b$. The embedded data bit $b$ follows as the least significant bit (LSB) of $X - \hat{X}$:

$$b = (X - \hat{X}) - 2\left\lfloor \frac{X - \hat{X}}{2} \right\rfloor \tag{7}$$

and $p$ is recovered as

$$p = \frac{X - \hat{X} - b}{2}. \tag{8}$$

Then, $d_x$, $d_n$, $d_w$, and $d_{nw}$ are computed with (4). Finally, (5) is inverted, and the original pixels are recovered, i.e.,

$$x = X - d_x, \quad n = N + d_n, \quad w = W + d_w, \quad nw = NW - d_{nw}. \tag{9}$$

Let us next compare the performance of the transform defined above with the classical transforms used in reversible watermarking. Since the predictor of (1) is also used in the MED predictor, let us first investigate the case of MED-based prediction-error expansion reversible watermarking.

A. MED Transform

Recall that MED is a high-performance predictor used in the JPEG-LS standard. With MED, the estimate $\hat{x}$ of pixel $x$ is

$$\hat{x} = \begin{cases} \min(n, w), & \text{if } nw \ge \max(n, w) \\ \max(n, w), & \text{if } nw \le \min(n, w) \\ n + w - nw, & \text{otherwise.} \end{cases} \tag{10}$$

The predictor tends to select $n$ in cases where a vertical edge exists left of the current location, $w$ in cases of a horizontal edge above $x$, or $n + w - nw$ if no edge is detected. The same predictor has also been regarded as the median of three simple linear predictors, i.e., $n$, $w$, and $n + w - nw$. The original prediction-error-expansion-based reversible watermarking scheme was built around MED. Many other PE schemes use a MED predictor as well.

For a single pixel embedding, without optimization, the insertion by using the predictor of (1) introduces a square error of $p_b^2$. With the optimized scheme of (5), the distortion decreases to $p_b^2/4$. Let us suppose that a MED-based reversible watermarking scheme is used. If the predictor finds no edge ($\min(n, w) < nw < \max(n, w)$), it selects the JPEG4 predictor. In this case, the gain of using the optimized version is very significant. If MED selects one of the two other predictors, one cannot evaluate a priori which scheme gives better results.

Most of the embedded pixels are selected from rather uniform regions. One can suppose that, for such regions, the prediction errors provided by the three linear predictors of MED are not significantly different. From (3), it follows that the distortion introduced by embedding into the current pixel with the classical MED scheme is larger than the one introduced by embedding into the pixel and its context with the JPEG4 optimized scheme, provided that the prediction error given by MED is not less than half of the one given by JPEG4.

B. Tian's DE

The basic DE version performs one level of integer Haar wavelet transform on pairs of pixels of the cover image. As long as overflow or underflow is not generated, some high-frequency (HF) coefficients are selected and shifted to the left to free the LSBs in order to embed 1 bit of data per coefficient. The marked pixels are obtained by inverse transform of the low-frequency (LF) coefficients and of the modified HF ones. The HF coefficients of the integer Haar transform are exactly the differences between pairs of adjacent pixels. The LF coefficients are the truncated semisums of the pixels. The shifting to the left by one position is a simple multiplication by 2. Tian's DE embeds 1 bit into a pixel pair by doubling the difference between adjacent pixels.

In order to compare the distortion introduced by Tian's DE with the one introduced by the proposed transform, a single pair out of a four-pixel block should be transformed. Let us consider the pair $w$, $x$. The corresponding DE transform is

$$W = \left\lfloor \frac{w + x}{2} \right\rfloor + \left\lfloor \frac{2(w - x) + 1 + b}{2} \right\rfloor \tag{11}$$

$$X = \left\lfloor \frac{w + x}{2} \right\rfloor - \left\lfloor \frac{2(w - x) + b}{2} \right\rfloor. \tag{12}$$

Before going any further, let us consider pixel $w$ and let us predict its value as $\hat{w} = x$. This is the simplest linear prediction scheme (used also in JPEG-LS as the first predictor). The prediction error is $p = w - x$, and next, with the insertion of a data bit, $p_b = w - x + b$. The basic prediction-error expansion scheme would transform pixel $w$ as $W' = w + p_b$ and would leave $x$ unchanged. Let us next consider the approach introduced above for the simple JPEG1 predictor. The optimized embedding scheme follows by subtracting a fraction of $p_b$, $d$, from the context pixel $x$. One has $X' = x - d$ and, consequently, $W' = w + p_b - d$. The embedding square error is $E^2 = (p_b - d)^2 + d^2 = 2d^2 - 2 p_b d + p_b^2$. The minimum error $E_{\min}^2 = p_b^2/2$ is obtained for $d = p_b/2$. Let the integer part of $d$ be $\lfloor p_b/2 \rfloor$. By taking into account that $p_b - \lfloor p_b/2 \rfloor = \lfloor (p_b + 1)/2 \rfloor$, the optimized prediction-error expansion transform is

$$W' = w + \left\lfloor \frac{w - x + b + 1}{2} \right\rfloor, \quad X' = x - \left\lfloor \frac{w - x + b}{2} \right\rfloor. \tag{13}$$

The transforms defined by (11)–(13) are almost equivalent. For instance, $W' = w + \lfloor (w - x + b + 1)/2 \rfloor = \lfloor w + (w - x + b + 1)/2 \rfloor = \lfloor (w + x)/2 + (2(w - x) + 1 + b)/2 \rfloor \approx \lfloor (w + x)/2 \rfloor + \lfloor (2(w - x) + 1 + b)/2 \rfloor = W$. Similarly, $X' \approx X$. The error is at most one graylevel. More precisely, the transforms are identical when $x$ and $w$ are both odd or both even. If $x$ and $w$ do not have the same parity, the transformed pixels differ by at most one graylevel.

Therefore, Tian's DE transform can be approached as an optimized prediction-error expansion reversible watermarking scheme. Since a pair of adjacent pixels can be seen as a pixel and its estimate, the interpretation of the basic DE scheme as a prediction-error expansion scheme is immediate. The novelty is the optimization of the prediction-error expansion schemes, which provides almost exactly Tian's DE transform.

Finally, let us analyze the embedding errors introduced by the two schemes. By neglecting the embedded data bit and the rounding, the error introduced by the original DE scheme is approximately

$$E_T^2 = \frac{(w - x)^2}{2}. \tag{14}$$

Similarly, the error introduced by the optimized JPEG4 prediction-error expansion scheme is

$$E_{\min}^2 = \frac{(x - n - w + nw)^2}{4}. \tag{15}$$
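The near-equivalence of (11), (12) and (13) can be checked numerically. The following is a minimal sketch, not the author's implementation: the function names are hypothetical, overflow and underflow are ignored, and Python's floor division `//` is assumed to play the role of $\lfloor \cdot \rfloor$ (it rounds toward minus infinity, also for negative values).

```python
def tian_de(w, x, b):
    """Tian's DE on the pair (w, x), following (11) and (12)."""
    low = (w + x) // 2                      # truncated semisum (LF coefficient)
    W = low + (2 * (w - x) + 1 + b) // 2
    X = low - (2 * (w - x) + b) // 2
    return W, X


def optimized_pe(w, x, b):
    """Optimized prediction-error expansion of (13), with w predicted by x."""
    pb = w - x + b                          # prediction error plus data bit
    W = w + (pb + 1) // 2
    X = x - pb // 2
    return W, X


def invert_optimized_pe(W, X):
    """Recover (w, x, b) from a pair marked with optimized_pe."""
    P = W - X                               # equals 2p + b
    b = P % 2                               # LSB; Python's % is non-negative here
    pb = (P - b) // 2 + b                   # p + b
    return W - (pb + 1) // 2, X + pb // 2, b


def max_gap(limit):
    """Largest graylevel difference between the two transforms (no overflow check)."""
    gap = 0
    for w in range(limit):
        for x in range(limit):
            for b in (0, 1):
                W1, X1 = tian_de(w, x, b)
                W2, X2 = optimized_pe(w, x, b)
                gap = max(gap, abs(W1 - W2), abs(X1 - X2))
    return gap


def same_parity_identical(limit):
    """True if both transforms coincide whenever w and x have the same parity."""
    return all(
        tian_de(w, x, b) == optimized_pe(w, x, b)
        for w in range(limit)
        for x in range(limit)
        for b in (0, 1)
        if (w - x) % 2 == 0
    )


if __name__ == "__main__":
    print(tian_de(10, 6, 1), optimized_pe(10, 6, 1))
    print(max_gap(32), same_parity_identical(32))
```

Both transforms keep the same difference $W - X = 2(w - x) + b$, so in this sketch the two marked pixels either coincide or shift together by one graylevel.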
One has $E_{\min}^2 = [(x - w) + (nw - n)]^2/4 \le (x - w)^2/4 + (n - nw)^2/4$, where the inequality holds whenever the differences $x - w$ and $n - nw$ have the same sign. One can consider that the two pairs in the block have a rather similar behavior, i.e., $(x - w)^2 \approx (n - nw)^2$ and, consequently, $E_{\min}^2 \le E_T^2$. To conclude, the proposed transform is expected to provide a lower distortion than the DE one.

III. TOWARDS REVERSIBLE WATERMARKING SCHEMES

The proposed embedding scheme alters not only the current pixel but also its context. The pixel and its context cover a $2 \times 2$ image block. If the image is partitioned into disjoint $2 \times 2$ blocks and the transform is applied on each block, the upper bound of the provided embedding capacity is 0.25 bpp. Furthermore, location map or histogram shifting schemes can be immediately developed.

In order to increase the embedding capacity, there are two common solutions, i.e., multiple embedding or transforming over a denser partition. If directly applied, both solutions fail, i.e., the lower distortion of the proposed transform is lost. For example, let us apply a second transform over a $2 \times 2$ block. The new estimate, i.e., $\hat{X}$, is given by (6). The new prediction error follows as $P = 2p + b$, i.e., two times greater than in the first embedding stage. A quarter of the prediction error together with the message data bit is then added to/subtracted from the current pixel and its prediction context. A quarter of the new prediction error is half of the initial prediction error. Since the distortion in the first stage is about one quarter of the prediction error, the embedding of two bits per block introduces a distortion of about $3p/4$. The direct double embedding appears to be less interesting than applying Tian's transform on the two pairs of a block. For Tian's DE, at the same capacity, the distortion per pixel is half of the prediction error.

For a dense partition embedding, after the embedding of the first block, $e$ should be predicted by using $X$, $N$, and $ne$ (see Fig. 1). One has $\hat{e} = X + ne - N \approx x + ne - n + p_b/2$. Because of the use of modified pixels in the prediction context, half of the previous block prediction error is added to the new prediction error. Furthermore, half of the already increased prediction error of the estimation of $e$ accumulates to the prediction error for the estimation of $z$ and so on. Let us assume that, without embedding, the prediction errors for a sequence of adjacent pixels take rather similar values. In the worst case (all the prediction errors have the same sign), the prediction error for embedding a sequence of pixels for a dense partition embedding increases up to $p/2 + p/4 + p/8 + \ldots + p$. The greater the prediction error, the greater the distortion introduced by the embedding.

As it appears from the above analysis, the prediction error increases if the full prediction context or a part of it has already been embedded. We shall investigate next a modified transform that, used together with the previously defined one, does not disturb the prediction. The basic idea is to consider a predictor that eliminates the effect of context embedding. One can afford to take even a less efficient predictor, as long as its prediction error is less than the one provided by JPEG4 on an already embedded context. Let us consider a slightly modified version of (1), namely

$$\hat{x} = w + nw - n. \tag{16}$$

As above, the prediction error together with the data bit to be hidden, i.e., $p_b = x - \hat{x} + b = p + b$, is embedded into the current pixel and its prediction context. One has

$$X = x + d_x, \quad N = n + d_n, \quad W = w - d_w, \quad NW = nw - d_{nw} \tag{17}$$

where, in order to minimize the square error introduced by the embedding, $d_x$, $d_n$, $d_w$, and $d_{nw}$ are computed with (4).

Let us first investigate the case of a second watermarking stage for disjoint $2 \times 2$ blocks. With (16), the estimate of $X = x + d_x$ is $\hat{X} = W + NW - N = w + nw - n + d_n + d_{nw} - d_w$, where $W$, $NW$, and $N$ are the first-stage pixels of (5). Since $d_x \approx d_n + d_{nw} - d_w$, the new prediction is not significantly disturbed by the first stage of embedding. From (4), it appears that $d_x = d_n + d_{nw} - d_w$ when $p_b$ is even. If $p_b$ is odd, the difference is one graylevel. Let $p_b'$ be the prediction error plus the bit to be embedded in a $2 \times 2$ block in the second watermarking stage. Let $d_x'$, $d_n'$, $d_w'$, and $d_{nw}'$ be the corresponding quarters. The error after the second stage of watermarking is $E^2 = (d_x + d_x')^2 + (d_n - d_n')^2 + (d_w + d_w')^2 + (d_{nw} - d_{nw}')^2$. If the prediction of (16) is close to the one of (1), and this should be true at least for uniform regions, one has $p_b \approx p_b'$ and, consequently, $d_x \approx d_x'$, $d_n \approx d_n'$, and so on. The error becomes

$$E^2 \approx 4 d_x^2 + 4 d_w^2 = \frac{p_b^2}{2}. \tag{18}$$

Embedding a second bit thus doubles the embedding error. By chaining two watermarking stages, the theoretical embedding capacity is bounded by 0.5 bpp.

A third stage can also be added without disturbing the prediction. In order to cancel the effect of the two previous embedding stages, a new predictor should be used, i.e.,

$$\hat{x} = n + nw - w. \tag{19}$$

By considering, as above, a rather similar prediction error provided by (19), the distortion per block introduced by embedding a third data bit appears to be $E^2 \approx 3 p_b^2/4$.

Let us next investigate the watermarking for nondisjoint blocks. Let $x$ and its context be embedded by using the predictor of (1), and let $e$ be predicted by using (16) (see Fig. 1). One has $\hat{e} = N + X - ne = n - d_n + x + d_x - ne = (x + n - ne) + (d_x - d_n)$. Since $|d_x - d_n| \le 1$, the effect of the embedding into $x$ and its context on the prediction of $e$ with the second predictor is negligible. For $|p_b| > 1$, the effect of the embedding of the previous block is canceled by replacing (4) with the following:

$$d_x = d_n = \left\lfloor \frac{p_b}{4} \right\rfloor, \quad d_w = \left\lfloor \frac{p_b - 2 d_x + 1}{2} \right\rfloor, \quad d_{nw} = \left\lfloor \frac{p_b - 2 d_x}{2} \right\rfloor. \tag{20}$$

Thus, the value estimated by using (16) is exactly the one obtained for the original context.

The optimized embedding into $e$ and its context gives $X' = X - d_x' = x + d_x - d_x'$, $N' = N - d_n' = n - d_n - d_n'$, $NE = ne + d_{ne}'$, and $E = e + d_e'$. By using the predictor of (16) and its corresponding embedding procedure, it appears that $e$ and $ne$ are both modified by the addition of a quarter of the current prediction error. If the scheme continues with the prediction of $z$ by using (1), the effect of the additive embedding into $e$ and $ne$ is eliminated. Furthermore, if the embedding is performed similarly to the one described by (20), namely, by taking equal parts for $e$ and $ne$, the prediction of $z$ follows without any distortion.

The alternate use of the transforms derived from predictors (1) and (16) and their corresponding embedding procedures allows the embedding into successive pixels on rows without worsening the prediction error. Since this scheme allows the embedding of one out of two rows, the provided capacity is bounded by 0.5 bpp.

Let $p_b$ and $p_b'$ be two consecutive prediction errors (together with the message bits). For simplicity, let us consider that the amount to be added to or subtracted from each current pixel and its prediction context is $d = p_b/4$ and $d' = p_b'/4$, respectively. If the first pixel was predicted with (1), the effect of the double embedding on the common pixels is as follows: $d - d'$ is added to one pixel, and $d + d'$ is subtracted from the other one. If the prediction starts with (16), the effect is rather similar, namely, $d + d'$ is added and not subtracted. The error introduced
by double embedding in a pair of pixels is $E^2 = (d + d')^2 + (d - d')^2 = 2d^2 + 2d'^2$. Furthermore, if one considers $|p_b| \approx |p_b'|$, the error is

$$E^2 \approx \frac{p_b^2}{4}. \tag{21}$$

As said above, $|p_b| \approx |p_b'|$ can hold at least in rather uniform areas. The result given by (21) is half of the one of (18). Furthermore, the distortion obtained for dense embedding, evaluated in favorable conditions, appears to be of the same size as the one obtained for single embedding into disjoint blocks.

An example of watermarking for nondisjoint blocks with two alternate predictors is presented in Fig. 2. The 4 bits to be embedded are 1, 0, 1, and 0. The original $2 \times 5$ image is shown in Fig. 2. The left column presents the watermarking, pixel by pixel. The current pixel in each window is displayed in bold. In the first processing window, the prediction result obtained by using (1) is $\hat{x} = 71$. One gets the prediction error $p = -2$ ($x = 69$) and the extended prediction error $p_b = -1$. Since $|p_b| = 1$, (4) is used and only the current pixel changes from 69 to 68 ($d_x = -1$). The second window is computed by using (16). One gets $\hat{e} = 85$, $p = -18$, $p_b = -18$. Since $|p_b| > 1$, by using (20), one gets $d_e = d_{ne} = -5$ and $d_x = d_n = -4$ and, consequently, $E = 62$, $NE = 54$, $X = 73$, and $N = 80$. The third window estimates $z$ by using (1) and so on. The right column presents the embedding into consecutive pixels by using only the predictor of (1). The watermarking with two predictors introduces less distortion than the watermarking with a single predictor. For instance, the square error corresponding to the marking with two predictors is 332, and the one for the scheme with a single predictor is 626.

Fig. 2. (Top) Original, (left) watermarking on consecutive pixels with two alternate predictors, (right) watermarking on consecutive pixels with a single predictor, and (bottom) corresponding absolute embedding errors.

IV. EXPERIMENTAL RESULTS

Next, we present experimental results on the proposed transform on two classical graylevel test images of size $512 \times 512$, i.e., Lena and Mandrill (see Fig. 3). The selected images have different statistics. Thus, Mandrill mainly contains texture, and Lena combines large uniform areas with texture.

Fig. 3. Test images: Lena and Mandrill.

A. Transform Evaluation

The first experiment evaluates the quality of the proposed transform. The experimental results on the two test images are compared with the ones obtained for the MED prediction-error-based transform, Tian's DE transform, and the GAP prediction-error-based transform. Recall that the GAP predictor is more complex than the MED one. The prediction context is extended from 3 to 7 pixels. Not only the existence of a horizontal/vertical edge is detected but also its strength (weak, normal, or sharp). The detection is based on local gradients and experimentally determined thresholds. The results obtained for GAP error expansion reversible watermarking clearly outperform the ones obtained for MED-based schemes.

In order to eliminate the effects of any particular reversible watermarking implementation, we present the distortion introduced by the transforms with respect to the capacity estimated by the number of embedded pixels. No other implementation details are taken into account as, for instance, location map, overflow/underflow maps, and sequences of flag bits. As discussed in Section III, the proposed transform modifies the prediction context. This effect is eliminated by investigating the transform on disjoint blocks of size $2 \times 2$. The other transforms have also been evaluated in the same conditions, i.e., for embedding only a quarter of the image pixels.

The experimental results obtained on the two test images are plotted in Fig. 4. The solid line represents the results obtained for the proposed transform, the dotted line the ones for the GAP-based prediction-error expansion, the dash-dotted line the ones for Tian's DE, and finally, the dashed line the ones for the MED-based prediction-error expansion. The proposed JPEG4 predictor-based transform outperforms the other transforms. The improvement is greater on Mandrill than on Lena.

The test image Mandrill is mainly composed of texture. On such images, the prediction does not provide very good results. In fact, regardless of the complexity of the predictor, the other transforms (the two predictor-based transforms and Tian's DE, which can also be interpreted as a prediction-error expansion transform) give rather similar results on Mandrill. The JPEG4 predictor, without optimized embedding, gives almost the same results as the MED predictor (the two curves are almost indistinguishable). On the other hand, JPEG4 with the proposed optimized embedding provides a significant improvement. As said above, because of the textured content, a rather large prediction error is expected on Mandrill. The optimized embedding drastically reduces the embedding error (to about a quarter).

On Lena, the performance of the predictor does count. The transform based on MED provides the poorest performance. Tian's DE appears to provide results rather similar to the transform based on the GAP predictor. In fact, in real applications, GAP-based prediction-error expansion outperforms Tian's DE (Tian's DE operates on disjoint pairs of pixels, whereas the GAP-based PE does not have such constraints and can insert data into each pixel). As for the case of Mandrill, the proposed transform provides the lowest distortion.

Before going any further, two comments should be made. The results in Fig. 4 do not take into account the size of the overhead. Moreover, the results are obtained for transforming only a quarter of the image pixels. For real implementations, the performance also depends on the embedding density of each transform. Obviously, the results in Fig. 4 are better than the ones obtained in real implementations.
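As a rough single-block illustration of the "about a quarter" reduction discussed above, the following sketch applies the block transform reconstructed from (6) and (9), i.e., $X = x + d_x$, $N = n - d_n$, $W = w - d_w$, $NW = nw + d_{nw}$, to one $2 \times 2$ block. The rounding rule (4) is not reproduced here, so `split_even` is only an assumed stand-in that yields four integer parts summing to $p_b$ and differing by at most one graylevel; all function names are hypothetical, and overflow/underflow handling is omitted.

```python
def split_even(pb):
    """Assumed stand-in for (4): four integer parts of pb, within one graylevel."""
    dx = pb // 4
    dn = (pb - dx) // 3
    dw = (pb - dx - dn) // 2
    dnw = pb - dx - dn - dw
    return dx, dn, dw, dnw


def embed_block(x, n, w, nw, b):
    """Embed bit b into pixel x and its context with the predictor n + w - nw."""
    p = x - (n + w - nw)                    # prediction error of (1)
    dx, dn, dw, dnw = split_even(p + b)     # p_b spread over the four pixels
    return x + dx, n - dn, w - dw, nw + dnw


def extract_block(X, N, W, NW):
    """Recover the original block and the bit, following (6)-(9)."""
    P = X - (N + W - NW)                    # equals 2p + b
    b = P % 2                               # (7): LSB of X - X_hat
    p = (P - b) // 2                        # (8)
    dx, dn, dw, dnw = split_even(p + b)
    return (X - dx, N + dn, W + dw, NW - dnw), b


def square_error(original, marked):
    """Sum of squared pixel differences over the block."""
    return sum((a - c) ** 2 for a, c in zip(original, marked))


if __name__ == "__main__":
    block = (90, 76, 70, 75)                # (x, n, w, nw), predicted value 71
    marked = embed_block(*block, 1)
    print(marked, square_error(block, marked))
    print(extract_block(*marked))
```

For the sample block, $p_b = 20$ and the optimized embedding costs a square error of $100 = p_b^2/4$, whereas adding $p_b$ to the current pixel alone would cost $p_b^2 = 400$.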
Fig. 4. Evaluation of Lena and Mandrill for (solid line) proposed JPEG4 transform (JPEG4O), (dotted line) GAP-based prediction error expansion (GAP-PE), (dash-dotted line) Tian's DE (TIAN-DE), and (dashed line) MED-based prediction error expansion (MED-PE).

B. Reversible Watermarking Schemes

In the sequel, the quality of watermarking schemes based on the proposed transform is evaluated. A basic location map implementation is first considered. The results obtained for the proposed transform, i.e., peak signal-to-noise ratio (PSNR) versus capacity, are compared with the ones obtained for MED-based prediction-error expansion reversible watermarking and Tian's DE scheme. For each scheme, the location map is compressed by using JBIG-I.

The results obtained for basic location map implementations on the same test images are presented in Fig. 5. The results for the proposed transform, i.e., JPEG4O-LM, are plotted with a solid line. The ones for the nondisjoint blocks version with two predictors, i.e., JPEG4O2-LM, are plotted with a dashed line. The results for Tian's DE (TIAN-DE-LM) and MED prediction-error expansion (MED-PE-LM) are plotted with dotted and dash-dotted lines, respectively.

Fig. 5. Experimental results of Lena and Mandrill for location map implementation of the proposed transform (JPEG4O-LM), dense version with two predictors (JPEG4O2-LM), Tian's DE (TIAN-DE-LM), and MED-based prediction error expansion (MED-PE-LM).

In Fig. 5, it appears that the location map implementation of the proposed transform significantly outperforms the MED-based prediction-error expansion scheme. The prediction error histograms for both predictors computed on the two test images are plotted in Fig. 6. For Lena, the prediction error histogram provided by MED is sharper than the one provided by the JPEG4 predictor. On the other hand, the distortion introduced by the classical embedding is larger than the one introduced by the proposed transform. Thus, even if the prediction error is smaller, the overall performance of the classical scheme based on MED is worse than the one of the proposed scheme. Compared with the results obtained for the MED scheme, the improvement obtained by using the proposed scheme is slightly larger on Mandrill. This is due to the fact that on Mandrill, MED and JPEG4 give almost similar results (see Fig. 5), whereas on Lena, MED outperforms JPEG4.

Fig. 6. Prediction error histogram of Lena and Mandrill for JPEG4 and MED predictors.

The results for the location map implementation of the proposed transform outperform the ones obtained for Tian's DE scheme. The high-capacity implementation with the alternate use of (1) and (16) also outperforms Tian's DE on the entire range of capacities. As it is shown in Fig. 6, the high-capacity implementation still outperforms the MED-based location map implementation up to about 0.4 bpp on Lena and up to about 0.3 bpp on Mandrill.

For the histogram shifting implementation, the proposed transform gives very good results only at very low capacity, namely, at less than 0.1 bpp (see Fig. 7). At such bit rates, the standard MED-based histogram shifting implementation fails to provide results. Furthermore, if the embedding is performed only into a part of the disjoint $2 \times 2$ blocks, the PSNR still increases (at the cost of a decrease in capacity). For instance, PSNRs greater than 70 dB are obtained for both images
by marking only 3% of the image blocks. Results rather similar to the standard MED-HS are obtained in the range 0.1–0.15 bpp.

Fig. 7. Experimental results of Lena and Mandrill for histogram shifting implementation of the proposed transform (JPEG4O-HS) and the MED-based prediction error expansion (MED-PE-HS).

The size of the additional information is smaller for the histogram shifting than for the location-map-based scheme. For the same capacity, more pixels should be embedded for a location map implementation than for a histogram shifting one. The pixels that are not embedded are left unchanged for a location map implementation, but they should be modified for a histogram shifting one. Let the embedding capacity be controlled by threshold $T$. In order to be identified at detection, the pixels that are not embedded are shifted with $T$ or $-(T - 1)$ (depending on the sign of the prediction error). Furthermore, the proposed transform-based implementation can embed at most a quarter of the image pixels, as opposed to the MED-based one, which can theoretically embed the entire image. Since there are more embedded pixels, the MED-HS implementation can provide the same capacity for a lower threshold than the JPEG4O-HS scheme. If the threshold required by the proposed transform is four times greater than the one required by MED, both transforms introduce a rather similar distortion. As soon as the threshold of the proposed transform becomes greater than four times the one of MED, the distortion introduced by the proposed scheme becomes greater than the one introduced by MED.

To conclude, at low capacity, the implementations based on the proposed transform can replace their classical counterparts based on MED. The proposed transform can be used in some annotation applications as, for instance, image captioning or labeling. Many annotation applications demand the embedding of tens to hundreds of bytes (for $512 \times 512$ graylevel images, 0.01 bpp represents about 320 B and 0.1 bpp about 3.2 kB).

V. CONCLUSION

A very low distortion transform for prediction-error expansion reversible watermarking has been proposed. The proposed transform has been obtained by considering a simple linear predictor, i.e., JPEG4, together with an optimized embedding procedure. The proposed transform introduces lower distortions than the ones based on high-performance predictors such as MED and GAP. The proposed method outperforms a representative prior art, Tian's DE transform, and it is shown that Tian's transform is equivalent to a prediction-error expansion of a simple linear predictor with improved embedding.

Practical reversible watermarking algorithms based on the proposed transform have been investigated. The appropriate application areas include low bit-rate image annotation for captioning and labeling.

REFERENCES

[1] J. Tian, “Reversible data embedding using a difference expansion,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 8, pp. 890–896, Aug. 2003.
[2] A. M. Alattar, “Reversible watermark using the difference expansion of a generalized integer transform,” IEEE Trans. Image Process., vol. 13, no. 8, pp. 1147–1156, Aug. 2004.
[3] L. Kamstra and H. J. A. M. Heijmans, “Reversible data embedding into images using wavelet techniques and sorting,” IEEE Trans. Image Process., vol. 14, no. 12, pp. 2082–2090, Dec. 2005.
[4] D. Coltuc and J.-M. Chassery, “Very fast watermarking by reversible contrast mapping,” IEEE Signal Process. Lett., vol. 14, no. 4, pp. 255–258, Apr. 2007.
[5] D. M. Thodi and J. J. Rodriguez, “Expansion embedding techniques for reversible watermarking,” IEEE Trans. Image Process., vol. 16, no. 3, pp. 721–730, Mar. 2007.
[6] V. Sachnev, H. J. Kim, J. Nam, S. Suresh, and Y. Q. Shi, “Reversible watermarking algorithm using sorting and prediction,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 7, pp. 989–999, Jul. 2009.
[7] Y. Hu, H.-K. Lee, and J. Li, “DE-based reversible data hiding with improved overflow location map,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 2, pp. 250–260, Feb. 2009.
[8] W. Hong, T.-S. Chen, Y.-P. Chang, and C.-W. Shiu, “A high capacity reversible data hiding scheme using orthogonal projection and prediction error modification,” Signal Process., vol. 90, no. 11, pp. 2911–2922, Nov. 2010.
[9] M. Chen, Z. Chen, X. Zeng, and Z. Xiong, “Reversible image watermarking based on full context prediction,” in Proc. ICIP, 2009, pp. 4253–4256.
[10] H.-W. Tseng and C.-P. Hsieh, “Prediction-based reversible data hiding,” Inf. Sci., vol. 179, no. 14, pp. 2460–2469, Jun. 2009.
[11] M. Chen, Z. Chen, X. Zeng, and Z. Xiong, “Reversible data hiding using additive prediction-error expansion,” in Proc. 11th ACM Workshop Multimedia Security, 2009, pp. 19–24.
[12] M. Fallahpour, “Reversible image data hiding based on gradient adjusted prediction,” IEICE Electron. Express, vol. 5, no. 20, pp. 870–876, 2008.
[13] H.-C. Wu, C.-C. Lee, C.-S. Tsai, Y.-P. Chu, and H.-R. Chen, “A high capacity reversible data hiding scheme with edge prediction and difference expansion,” J. Syst. Softw., vol. 82, no. 12, pp. 1966–1973, Dec. 2009.
[14] D. Coltuc, “Improved embedding for prediction based reversible watermarking,” IEEE Trans. Inf. Forensics Security, vol. 6, no. 3, 2011.
[15] H.-M. Tsai and L.-W. Chang, “Adaptive multilayer reversible data hiding using the mean-to-pixel difference modification,” in Proc. IEEE Int. Conf. Multimedia Expo, 2007, pp. 2102–2105.
[16] C.-C. Lee, H.-C. Wu, C.-S. Tsai, and Y.-P. Chu, “Adaptive lossless steganographic scheme with centralized difference expansion,” Pattern Recognit., vol. 41, no. 6, pp. 2097–2106, Jun. 2008.
[17] M. Weinberger, G. Seroussi, and G. Sapiro, “The LOCO-I lossless image compression algorithm: Principles and standardization into JPEG-LS,” IEEE Trans. Image Process., vol. 9, no. 8, pp. 1309–1324, Aug. 2000.
[18] S. Martucci, “Reversible compression of HDTV images using median adaptive prediction and arithmetic coding,” in Proc. IEEE Int. Symp. Circuits Syst., 1990, pp. 1310–1313.
[19] X. Wu and N. Memon, “Context-based, adaptive, lossless image coding,” IEEE Trans. Commun., vol. 45, no. 4, pp. 437–444, Apr. 1997.