Canadian Conference on Computer and Robot Vision

Automatic object extraction in images using embedded labels

Chee Sun Won
Dept. of Electronic Engineering, Dongguk University
Seoul, 100-715, South Korea
cswon@dongguk.edu

Abstract

To automatically generate images with the same foreground but different backgrounds, a watermark bit (e.g., binary 1 for foreground and 0 for background) can be inserted at each pixel location. The embedded watermark bits can then be automatically extracted, and the background can be separated from the object. Note that the object extraction succeeds only if the watermarked image is intact. If the watermarked image goes through post-processing such as JPEG compression or cropping, the pixel-wise watermark decoding may fail. To overcome this problem, this paper proposes a block-wise watermark insertion and a block-wise MAP (maximum a posteriori) watermark decoding. Experimental results show that the proposed method is more robust than pixel-wise decoding against various post-processing attacks.

Figure 1. Example of the same object (face) with different backgrounds (images from the Caltech Image Archive [2]).

1. Introduction

This paper deals with the problem of automatically separating the object/foreground from the background in an image. Specifically, we focus on the problem of automatic background replacement. Note that the demand for various digital photo editing functions will grow as the performance of digital multimedia devices such as digital cameras and cell phones increases. It is also often necessary to generate training images with the same object but different backgrounds for automatic machine learning systems. For example, suppose we need training images for a face recognition problem (see Fig. 1). To increase the recognition accuracy, it is important to provide as many training images as possible, with many variations in the appearance of the object [1].

Previous background extraction methods follow three approaches. The first approach exploits a known or controlled environment. For example, blue-screen matting (also called the chroma-key technique) [3] takes a picture against a uniform background so that the background can be easily segmented. However, this technique requires a professional studio with a physical blue screen. To avoid this cumbersome physical setting, [4] proposes to pre-shoot the background image; the background image is then subtracted from another shot of the same background that includes the object. Also, in [5], self-identifying patterns are used to recognize the background. The second approach relies on estimating, for each pixel, the probability that it belongs to the object or to the background. For example, a Bayesian framework has been adopted to separate the background from the object [6][7]. In particular, [7] uses special images called low-depth-of-field (LDOF) images for automatic object segmentation. In an LDOF image the object is in focus and the rest of the background is blurred, so the high-frequency components residing inside the object region can be modeled with a probabilistic distribution for a Bayesian image segmentation framework. Finally, the third approach to background extraction involves direct human intervention during the segmentation. This so-called semi-automatic image segmentation requires time-consuming and inconvenient user interaction to obtain a rough outline of the object boundary; the issue there is how to minimize and simplify the human intervention [8][9].

Note that the above background extraction methods are one-time solutions. That is, whenever we need to extract or replace the background of the same image, we have to apply one of the three approaches again, each of which is either semi-automatic or applicable only to a limited case such as LDOF images. To solve this problem, once the object and the background have been separated by one of the above approaches, the extracted object and background are slightly modified to embed a watermark. They can then be identified automatically whenever a background extraction or replacement is requested later. This paper deals with this scenario: we are interested in repetitive requests for background replacement rather than a non-repetitive, one-time execution. Of course, the prerequisite for our automatic background replacement is that the very first object extraction is done by one of the above three approaches. Once we obtain the separated object and background, we can embed different binary bits for the object and the background to be used for automatic background replacement. After the binary bits are embedded, we can automatically separate the object from the background simply by extracting and identifying the embedded bits. Thus, the embedded bits for the object and the background are inherited by subsequent image compositions and are used for later requests to separate the background from the object.

In [10], the quantization index modulation (QIM) based watermarking scheme [11] was used at each pixel to embed watermark bits 1 and 0 for the object and the background, respectively (see Fig. 2 for the overall structure). Once the watermark-embedded image, which we call an automatic-object-extractible image, is generated, the automatic background replacement problem is equivalent to extracting the embedded watermark. As demonstrated in [10], the separation of the object from the background can then be done automatically. However, if the watermark-embedded (i.e., automatic-object-extractible) image has undergone post-processing such as JPEG compression or corruption by additive noise, the watermarked pixel values change and there is no guarantee that the object can be separated from the background. In this paper, we propose a watermark embedding and extraction method that is robust to such unintentional image modifications, including JPEG compression, additive noise, and cropping. The basic idea of the proposed method is to embed the watermark bit into the average value of a small image block instead of modifying the least significant bits of each pixel gray-level. Furthermore, the embedded watermarks can be viewed as a two-dimensional Markov random field (MRF), so the watermark extraction can be reformulated as a MAP (maximum a posteriori) image segmentation problem.
2. Pixel-wise QIM watermarking: Previous method

The Quantization Index Modulation (QIM) watermarking scheme [11] was used in [10] to embed a watermark bit indicating object or background, for the purpose of repetitive background extraction and replacement. QIM is a blind watermarking scheme and is known to be more robust than spread-spectrum watermarking. It is also a spatial-domain scheme; since the watermark must be embedded into irregularly shaped object and background regions, it is not tractable to apply a transform-domain watermarking scheme, which operates on rectangular regions.

In the previous work [10], a one-bit watermark is embedded at each pixel, i.e., watermark bit 1 is embedded at a pixel belonging to the object and watermark bit 0 at a pixel belonging to the background. Specifically, for a pixel i, the watermark bit b_i is generated as

    b_i = 1, if pixel i belongs to the object/foreground,
          0, if pixel i belongs to the background.                        (1)

According to the watermark bit b_i, the gray-level y_i at pixel i is modified to ŷ_i as

    ŷ_i = Q_1(y_i), if b_i = 1,
          Q_0(y_i), if b_i = 0,                                           (2)

where Q_1(y_i) and Q_0(y_i) are defined as

    Q_1(y_i) = ⌈(y_i − Δ/4)/Δ⌉ · Δ + Δ/4,                                 (3)

and

    Q_0(y_i) = ⌈(y_i + Δ/4)/Δ⌉ · Δ − Δ/4.                                 (4)

In (3) and (4), ⌈ζ⌉ converts ζ to an integer (i.e., a ceiling function) and Δ is the quantization step controlling the watermark strength.

For watermarked (and possibly modified) image data ŷ_i, the embedded watermark bit at pixel i is extracted as

    b̂_i = 1, if |ŷ_i − Q_1(ŷ_i)| < |ŷ_i − Q_0(ŷ_i)|,
          0, otherwise.                                                   (5)

Note that if the watermarked gray-level ŷ_i is not altered, the watermark bit b̂_i extracted by (5) is identical to the original watermark bit b_i. However, if the watermarked gray-level is changed by post-processing such as JPEG compression or noise addition, the extracted watermark bit b̂_i does not necessarily match the original bit b_i, and the object extraction yields segmentation errors. For example, the watermarked image in Fig. 3-(a) has undergone JPEG compression as shown in Fig. 3-(b); as a result, the watermark extraction by (5) yields many segmentation errors (see Fig. 3-(c)).
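As a concrete illustration, the following is a minimal Python/NumPy sketch of the pixel-wise embedding (2)-(4) and decoding (5). It assumes 8-bit grayscale images, uses Δ = 10 (the value chosen later in the experiments), and clips the watermarked values to [0, 255]; the function names and these choices are illustrative, not taken from the paper's implementation.

    # Minimal sketch of pixel-wise QIM embedding (2)-(4) and decoding (5),
    # assuming 8-bit grayscale NumPy arrays and Delta = 10.
    import numpy as np

    DELTA = 10  # quantization step (watermark strength)

    def q1(y, delta=DELTA):
        # Q_1 in (3): quantize onto the lattice offset by +delta/4
        return np.ceil((y - delta / 4.0) / delta) * delta + delta / 4.0

    def q0(y, delta=DELTA):
        # Q_0 in (4): quantize onto the lattice offset by -delta/4
        return np.ceil((y + delta / 4.0) / delta) * delta - delta / 4.0

    def embed_pixelwise(image, mask):
        # mask: binary map b_i of (1); 1 = object/foreground, 0 = background
        y = image.astype(np.float64)
        watermarked = np.where(mask == 1, q1(y), q0(y))
        # clipping to the 8-bit range is an illustrative choice; it may slightly
        # perturb the embedding at extreme gray-levels
        return np.clip(watermarked, 0, 255).astype(np.uint8)

    def decode_pixelwise(watermarked):
        # (5): choose the bit whose quantizer is closer to the observed gray-level
        y = watermarked.astype(np.float64)
        return (np.abs(y - q1(y)) < np.abs(y - q0(y))).astype(np.uint8)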
Figure 2. The overall structure of embedding the watermark bits for the object and the background. (Diagram blocks: Original Image, Semi-automatic Segmentation, Human Intervention, Contrast Enhancement, Contrast Suppression, Background, Face Composition, Face-extractible Image (automatic-object-extractible image).)

Figure 3. Example of segmentation errors: (a) watermarked image, (b) JPEG compressed and decompressed image, (c) object extraction by (5), (d) original object.

3. Block-wise MAP QIM watermarking

As demonstrated in Fig. 3, the pixel-wise watermark decoding used in [10] is vulnerable to post-processing applied to the watermarked image. To alleviate this problem, we propose a block-wise MAP (maximum a posteriori) decoding scheme. Note that a pixel-wise MAP decoding for QIM watermarking was also used in [12]. That scheme is based on a sliding window and embeds the watermark into the local average value. Embedding the watermark into average values rather than individual pixel gray-levels is indeed expected to increase robustness, especially against JPEG compression. However, because the sliding windows overlap, the currently watermarked pixel value affects the watermarking of the next window; our experiments show that visible degradation occurs when the gray-level of the current pixel is modified to shift the local average brightness computed with already-watermarked neighboring pixel values. To overcome this problem, in this paper the watermark is embedded into the average brightness of each non-overlapping block, spreading the modification over all pixels of the block rather than concentrating it on one pixel per window position. Moreover, since all gray-levels in a block are modified simultaneously, the watermark decoding becomes block-wise. The block-wise contextual information (i.e., the block-wise watermark smoothness condition) can then be modeled by a Markov random field (MRF), and the watermark decoding turns into a block-wise two-class MAP image segmentation problem similar to [13][7].
Let us denote the set of 2-D pixel indices of an N_1 × N_2 image as Ω = {(i, j) : 0 ≤ i ≤ N_1 − 1, 0 ≤ j ≤ N_2 − 1} and the set of non-overlapping B × B block indices as Ω_B = {(i, j) : 0 ≤ i ≤ N_1/B − 1, 0 ≤ j ≤ N_2/B − 1}. The watermark embedding is executed for each block s ∈ Ω_B. That is, given the original image composed of an object and a background, we divide the image into B × B non-overlapping blocks and, for each block s ∈ Ω_B, compute the average brightness ȳ_s. If the majority of the pixels in block s belong to the object, the representative watermark bit to be embedded is 1 (i.e., x_s = 1); otherwise we embed watermark bit 0 (i.e., x_s = 0). Then the gray-level y_s(k, l) at position (k, l), 0 ≤ k, l ≤ B − 1, in block s is modified by the watermark embedding as

    ŷ_s(k, l) = y_s(k, l) + Q_1(ȳ_s) − ȳ_s, if x_s = 1,
                y_s(k, l) + Q_0(ȳ_s) − ȳ_s, if x_s = 0,                   (6)

where ŷ_s(k, l) is the watermarked gray-level at (k, l) in block s ∈ Ω_B.
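The block-wise embedding of (6) can be sketched as follows, reusing q1, q0, and DELTA from the pixel-wise sketch above. It assumes the image dimensions are multiples of B and decides the block label x_s by a majority vote over the object mask; both assumptions are illustrative.

    # Sketch of the block-wise embedding (6): shift every pixel of a block so that
    # the block mean lands on the quantizer selected by the block's watermark bit.
    import numpy as np

    def embed_blockwise(image, mask, B=4, delta=DELTA):
        y = image.astype(np.float64)
        out = y.copy()
        H, W = y.shape
        for i in range(0, H, B):
            for j in range(0, W, B):
                block = y[i:i + B, j:j + B]
                y_bar = block.mean()                                   # average brightness of block s
                x_s = 1 if mask[i:i + B, j:j + B].mean() >= 0.5 else 0  # majority vote
                target = q1(y_bar, delta) if x_s == 1 else q0(y_bar, delta)
                out[i:i + B, j:j + B] = block + (target - y_bar)       # quantize the block mean
        # clipping may slightly perturb the embedded mean at extreme gray-levels
        return np.clip(out, 0, 255).astype(np.uint8)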
On each block s, the watermark bit x_s ∈ {0, 1} is assumed to be a realization of a random variable X_s, where x_s = 1 for the object and x_s = 0 for the background. Denoting the set of random variables (i.e., the random field) X = {X_s : s ∈ Ω_B}, we assume that X is a Gibbs random field with the conditional probability

    P(X_s = x_s | x_{η_s}) = (1/Z) exp{V_c(x_s)},                         (7)

where V_c(x_s) is the clique potential of a clique c on the neighborhood system η_s and Z is the normalizing constant. On top of the random field X we have the random field Ȳ = {Ȳ_s : s ∈ Ω_B}, where Ȳ_s = ȳ_s ∈ {0, 1, ..., 255} is a random variable for block s. The realization ȳ_s of Ȳ_s represents the average brightness of the pixels in block s and is assumed to be Gaussian with mean Q_0(ȳ_s) for watermark 0, mean Q_1(ȳ_s) for watermark 1, and variance σ². Thus we have

    P(ȳ_s | x_s) = (1/√(2πσ²)) exp{−(ȳ_s − Q_{x_s}(ȳ_s))² / 2σ²}.         (8)

With all necessary stochastic models defined, our block-wise MAP watermark decoding extracts x* from the watermarked and possibly modified image data ȳ as

    x* = argmax_x P(X = x | Ȳ = ȳ)
       = argmax_x P(Ȳ = ȳ | X = x) P(X = x).                              (9)

In (9), P(Ȳ | X) and P(X) can be assumed to be block-wise independent:

    P(Ȳ | X) = ∏_{s∈Ω_B} P(Ȳ_s | X_s),                                    (10)

and

    P(X) ≈ ∏_{s∈Ω_B} P(X_s | X_{η_s}).                                    (11)

Note that (11) approximates the global Gibbs distribution by a product of local characteristics. Since the a posteriori probability in (9) is separable over blocks as in (10) and (11), we can extract the watermark bit x*_s of each block s independently by maximizing the local probability

    x*_s = argmax_{x_s} [P(Ȳ_s | X_s) P(X_s | X_{η_s})]
         = argmax_{x_s} [ln P(Ȳ_s | X_s) + ln P(X_s | X_{η_s})].          (12)

Then, plugging (7) and (8) into (12), we obtain the block-wise MAP watermark decoding.
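A possible reading of the block-wise MAP decoding (12) is sketched below. The data term is the Gaussian log-likelihood (8) of each observed block mean, the prior term uses a pair-clique potential of the form given later in (13) over the second-order (8-connected) neighborhood, and a few ICM-style sweeps serve as the iterative update described in Section 4 (the first pass uses the data term only). The sweep schedule and initialization are illustrative choices; q0, q1, and DELTA are reused from the sketches above.

    # Sketch of the block-wise MAP decision (12): data term from (8), neighbourhood
    # prior from the pair-clique potential (beta for agreeing labels, -beta otherwise).
    import numpy as np

    def decode_blockwise_map(watermarked, B=4, delta=DELTA, beta=1.0, sigma=1.0, n_iter=5):
        y = watermarked.astype(np.float64)
        H, W = y.shape
        nb, mb = H // B, W // B
        # observed block means (the realizations of Y_s)
        means = y[:nb * B, :mb * B].reshape(nb, B, mb, B).mean(axis=(1, 3))

        # log-likelihood ln P(y_s | x_s) for x_s = 0 and x_s = 1, from (8)
        loglik = np.stack([-(means - q0(means, delta)) ** 2 / (2 * sigma ** 2),
                           -(means - q1(means, delta)) ** 2 / (2 * sigma ** 2)], axis=-1)

        labels = loglik.argmax(axis=-1)        # first iteration: data term only
        for _ in range(n_iter - 1):            # subsequent sweeps add the prior
            for i in range(nb):
                for j in range(mb):
                    best, best_score = labels[i, j], -np.inf
                    for x in (0, 1):
                        prior = 0.0
                        for di in (-1, 0, 1):
                            for dj in (-1, 0, 1):
                                if (di, dj) == (0, 0):
                                    continue
                                ni, nj = i + di, j + dj
                                if 0 <= ni < nb and 0 <= nj < mb:
                                    prior += beta if labels[ni, nj] == x else -beta
                        score = loglik[i, j, x] + prior
                        if score > best_score:
                            best, best_score = x, score
                    labels[i, j] = best
        return labels                           # 1 = object block, 0 = background block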
Once the watermark bit of each block has been obtained, the blocks on the boundary between the object and the background are subdivided into four smaller blocks, and the watermark bit of each subdivided block is determined in turn. Specifically, a subdivided boundary block takes the watermark bit of the non-boundary neighboring block whose average gray-level is closest to its own. This subdivision and watermark assignment continues until the watermark is extracted at the pixel level. Note that this process is similar to the watershed method used in [13][7].
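The boundary refinement described above leaves some freedom in how "boundary block" and "neighboring block" are interpreted. The sketch below takes a block to be a boundary block when its 8-neighborhood contains both labels, and lets each sub-block inherit the label of the parent's non-boundary neighbor whose block average is closest to the sub-block average, halving the block size down to single pixels. It assumes B is a power of two and the image dimensions are multiples of B; the bookkeeping is one illustrative reading of the description, not the paper's exact procedure.

    # Sketch of the boundary-block refinement toward pixel-level labels.
    import numpy as np

    def refine_to_pixels(image, labels, B=4):
        y = image.astype(np.float64)
        b = B
        while b > 1:
            nb, mb = labels.shape
            # mark blocks whose 8-neighbourhood contains a different label
            padded = np.pad(labels, 1, mode='edge')
            boundary = np.zeros(labels.shape, dtype=bool)
            for di in (0, 1, 2):
                for dj in (0, 1, 2):
                    if (di, dj) == (1, 1):
                        continue
                    boundary |= (padded[di:di + nb, dj:dj + mb] != labels)
            half = b // 2
            # default: every sub-block inherits its parent's label
            new_labels = np.repeat(np.repeat(labels, 2, axis=0), 2, axis=1)
            for i in range(nb):
                for j in range(mb):
                    if not boundary[i, j]:
                        continue
                    # candidate labels: non-boundary neighbours of the parent block
                    cands = []
                    for di in (-1, 0, 1):
                        for dj in (-1, 0, 1):
                            ni, nj = i + di, j + dj
                            if (di, dj) != (0, 0) and 0 <= ni < nb and 0 <= nj < mb \
                                    and not boundary[ni, nj]:
                                ref = y[ni * b:(ni + 1) * b, nj * b:(nj + 1) * b].mean()
                                cands.append((ref, labels[ni, nj]))
                    if not cands:
                        continue  # keep inherited label if no non-boundary neighbour exists
                    for si in (0, 1):
                        for sj in (0, 1):
                            r0, c0 = i * b + si * half, j * b + sj * half
                            sub_mean = y[r0:r0 + half, c0:c0 + half].mean()
                            _, lab = min(cands, key=lambda rc: abs(rc[0] - sub_mean))
                            new_labels[2 * i + si, 2 * j + sj] = lab
            labels, b = new_labels, half
        return labels  # pixel-level object/background map

Under these assumptions, labels = decode_blockwise_map(img) followed by refine_to_pixels(img, labels) would yield a pixel-level object/background map.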
4. Experiments

The quantization step Δ in (3) and (4) controls a trade-off between the strength of the watermark and the degradation of image quality; Δ = 10 is used in this paper. For the clique potential V_c(x_s) in (7), pair cliques on the second-order neighborhood system are used with

    V_c(x_s) = β,  if the two watermark bits in the pair clique are the same,
              −β,  otherwise,                                             (13)

where we set β = 1. We also set σ = 1 in (8). x*_s in (9) is updated iteratively, with fewer than 5 iterations; at the first iteration only P(Ȳ_s | X_s) in (12) is considered, without P(X_s | X_{η_s}).

The proposed watermark encoding and decoding method resists several post-processing attacks. For example, in Fig. 4 the original image Fig. 4-(a) is watermarked and JPEG compressed as in Fig. 4-(b); it is then decoded by the previous method [10] in Fig. 4-(c) and by the proposed block-wise MAP decoding in Fig. 4-(d), demonstrating the robustness of the proposed method to JPEG compression. Also, as shown in Fig. 5, the proposed method separates the background from the object even if the watermarked image is cropped.

The proposed method requires the parameter values (i.e., Δ, B, β, and σ) to be determined a priori. The parameter values chosen in this paper were not found by exhaustive search and thus may not be optimal. For example, the block size B is set to 4 for Fig. 4 and Fig. 5; however, as shown in Fig. 6-(c) and (d), the segmentation result contains more errors with B = 4 than with B = 8 for the added noise. Analyzing the sensitivity of the segmentation result to these parameter values and determining their optimal values remain future work.

5. Conclusion

To embed a watermark that is robust against image processing operations such as JPEG compression, we embed the watermark into the average brightness of each image block. The embedded watermark can then be extracted efficiently within a block-wise MAP segmentation framework. We proposed this block-wise MAP QIM decoding scheme and demonstrated that it successfully separates the background from the object even when the watermarked image goes through post-processing attacks, including JPEG compression.

6. Acknowledgement

This work was supported by the Seoul R&BD Program (SFCC).

References

[1] D. Roobaert, M. Zillich, and J.-O. Eklundh, A pure learning approach to background-invariant object recognition using pedagogical support vector machine, Proc. of CVPR, II, 351-357, 2001.

[2] http://www.vision.caltech.edu/html-files/archive.html

[3] D.J. Chaplin, Chroma key method and apparatus, US Patent 5,249,039, 1995.

[4] R.J. Qian and M.I. Sezan, Video background replacement without a blue screen, IEEE ICIP, 1999.

[5] M. Fiala and C. Shu, Background subtraction using self-identifying patterns, IEEE CRV, 2005.

[6] Y.-Y. Chuang, B. Curless, D.H. Salesin, and R. Szeliski, A Bayesian approach to digital matting, IEEE CVPR, 2001.

[7] C.S. Won, K. Pyun, and R.M. Gray, Automatic object segmentation in images with low depth of field, IEEE Proc. of Image Processing (ICIP), III, 805-808, 2002.

[8] C. Gu and M.-C. Lee, Semiautomatic segmentation and tracking of semantic video objects, IEEE Tr. on Circ. and Sys. for Video Tech., 8(5), 572-584, 1998.

[9] Y. Gaobo and Y. Shengfa, Modified intelligent scissors and adaptive frame skipping for video object segmentation, Real-Time Imaging, 11, 310-322, 2005.

[10] C.S. Won, On generating automatic-object-extractible images, Proc. of SPIE, vol. 6764, 2007.

[11] B. Chen and G.W. Wornell, Quantization index modulation: A class of provably good methods for digital watermarking and information embedding, IEEE Tr. on Information Theory, 47, 1423-1443, 2001.

[12] W. Lu, W. Li, R. Safavi-Naini, and P. Ogunbona, A pixel-based robust image watermarking system, IEEE ICME, 2006.

[13] C.S. Won, A block-wise MAP segmentation for image compression, IEEE Tr. on Circuits and Systems for Video Technology, vol. 8, no. 5, pp. 592-601, 1998.
Figure 4. Results with JPEG compression: (a) original image, (b) watermark encoded and JPEG compression applied (compression scale 10 in Photoshop), (c) object extraction by the previous method [10], (d) object extraction by the proposed method.

Figure 5. Results with image cropping: (a) original image, (b) watermark encoded and cropped image, (c) object extraction by the previous method [10], (d) object extraction by the proposed method.

Figure 6. Comparison of different block sizes B: (a) original image, (b) watermark encoded and 2% uniform noise added in Photoshop, (c) object extraction with B = 4, (d) object extraction with B = 8.
