image segmentation

Canadian Conference on Computer and Robot Vision

Automatic object extraction in images using embedded labels

Chee Sun Won
Dept of Electronic Eng, Dongguk University
Seoul, 100-715, South Korea
cswon@dongguk.edu

Abstract

To automatically generate images with the same fore-
ground but different backgrounds, a watermark bit (e.g.,
binary 1 for foreground and 0 for background) can be in-
serted for each pixel location. Then, the embedded water-
mark bit can be automatically extracted and the background
can be separated from the object. Note that the object ex-
traction can be done successfully only if the watermarked
image is intact. However, if the watermarked image goes
through some post-processing including JPEG compression
and cropping, then the pixel-wise watermark decoding may
fail. To overcome this problem, in this paper, a block-wise
watermark insertion and a block-wise MAP (maximum a
posteriori) watermark decoding are proposed. Experimen-
tal results show that the proposed method is more robust Figure 1. Example of the same object (face)
that the pixel-wise decoding for various post-processing at- with different background (Images from Cal-
tacks. tech Image Archive [2].)

1. Introduction with a uniform background so that the background can be
easily segmented. However, this technique requires a pro-
This paper deals with the problem of automatically sep- fessional studio with a physical blue screen. To avoid the
arating object/foreground from background in an image. cumbersome physical setting, in [4], they propose to pre-
Speciﬁcally, we focus on the problem of automatic back- shoot the background image. Then, the background im-
ground replacement in an image. Note that the demands for age is subtracted from another shoot with the same back-
various digital photo editing functionalities will be higher ground including an object. Also, in [5], self-identifying
as the performance of digital multimedia devices such as patterns are used to recognize the background. The sec-
digital camera and cell phone increases. Also, it is often re- ond approach basically relies on estimating the probability
quired to generate training images with the same object but of each pixel to decide whether it belongs to the object or
different backgrounds for automatic machine learning sys- the background. For example, a Bayesian framework has
tem. For example, suppose we need some training images been adopted for separating the background from the object
for the face recognition problem (e.g., see Fig 1). Then, to [6][7]. In particular, in [7], special images called low-depth-
increase the recognition accuracy, it is important to provide of-ﬁeld (LDOF) images are used for the automatic object
as many training images as possible with various variations segmentation. For this particular LDOF image, object in the
in the appearance of the object in the image [1]. image is focused and the rest of the background is blurred.
Previous background extraction methods are based on Then, the high frequency components residing inside the
three approaches. First approach is to exploit a known or object region are modeled with a probabilistic distribution
controlled environment. For example, blue-screen matting for Bayesian image segmentation framework. Finally, the
(also called chroma-key technique) [3] is to take a picture third approach for the background extraction involves the

978-0-7695-3153-3/08 $25.00 © 2008 IEEE 231
DOI 10.1109/CRV.2008.10

direct human intervention during the segmentation. This, for each pixel gray-level. Also, the embedded watermarks
so called semi-automatic image segmentation, requires a can be viewed as two dimensional Markov random fields
time-consuming and inconvenient user intervention to get a (MRF), then the watermark extraction can be reformulated
rough outline of the object boundary. In this semi-automatic as an MAP (maximum a posteriori) based image segmenta-
image segmentation problem, the issue is how to minimize tion problem.
and simplify the human’s intervention [8][9].
Note that the upper-mentioned background extraction 2. Pixel-wise QIM watermarking: Previous
methods belong to one-time solution. That is, whenever method
we need to extract or to replace the background from the
same image, we have to repeatedly apply one of the above Quantization Index Modulation (QIM) watermarking
three approaches, which is either semi-automatic or appli- scheme [11] was used to embed watermark bit for indicating
cable only to a limited case such as LDOF images. To solve background and object for the purpose of repetitive back-
this problem, once the object and the background are sep- ground extraction and replacement [10]. The QIM method
arated by one of the above three approaches, the extracted is a blind watermarking scheme and is known to be more
object and the background are slightly modified to embed robust than the spread spectrum watermarking. Also, the
a watermark. Then, it can be automatically identified for QIM method is a spatial domain watermarking. Note that
the later demands for background extractions or replace- since the watermark embedding should be done for irregu-
ments. This paper deals with this problem. That is, we are lar shaped object and background, it is not tractable to apply
interested in the repetitive requests for the background re- transform domain watermarking scheme on a rectangular
placement instead of non-repetitive one-time execution. Of region.
course, the prerequisite for our automatic background re- In the previous work [10], one-bit watermark is embed-
placement is that the very first object extraction should be ded to each pixel, i.e., watermark bit 1 is embedded to a
done by one of the above three approaches. Once we obtain pixel belonging to the object and watermark bit 0 to a pixel
separated object and background in image, we can embed belonging to the background. Specifically, for a pixel i, a
different binary bit for object and background to be used watermark bit bi is generated as follows
for automatic background replacement. Then, after the bi-
nary bit embedding, we can automatically separate the ob- 1, if pixel i belongs to the object/foreground
bi =
ject from the background by simply extracting and identify- 0, if pixel i belongs to the background.
ing the embedded bits. Thus, the embedded bits for the ob- (1)
ject and the background are inherited to subsequent image According to the watermark bit bi , the gray-level yi at pixel
composition and are used for later request for the separation i is modified to yi as follows
ˆ
of the background from the object.
Q1 (yi ), if bi = 1
In [10], the quantization index modulation (QIM) based yi =
ˆ (2)
Q0 (yi ), if bi = 0,
watermarking scheme [11] was used for each pixel to em-
bed watermark bit 1 and 0 for the object and the back- where Q1 (yi ) and Q0 (yi ) are defined as
ground, respectively (see Fig. 2 for the overall struc-
ture). Once the watermark embedded image, which is called yi − ∆/4 ∆
Q1 (yi ) = ×∆+ , (3)
automatic-object-extractible image, is generated, the auto- ∆ 4
matic background replacement problem is equivalent to the
and
extraction of the embedded watermark. As demonstrated yi + ∆/4 ∆
in [10], the separation of the object from the background Q0 (yi ) = ×∆− . (4)
∆ 4
can be done automatically. However, if the watermark em-
In (3) and (4), ζ is to convert ζ to an integer (i.e., a ceil
bedded (i.e., automatic-object-extractible) image has under-
function) and ∆ is a quantization step for the watermark
gone post-processing such as JPEG compression or corrup-
strength.
tion by additive noise etc., the watermarked pixel values
For a given watermarked image data yi , the embedded
ˆ
will be changed and there will be no guarantee to success-
watermark bit at pixel i can be extracted as follows
fully separate the object from the background. In this paper,
we propose a watermark embedding and extraction method, 1, if yi − Q1 (î ) < yi − Q0 (î )
ˆ y ˆ y
î =
b , (5)
which is robust to some unintentional image modifications 0, otherwise.
such as JPEG compressions, additive noises, and cropping
etc. The basic idea of the proposed method is to embed Note that if there is no alteration for the watermarked gray-
the watermark bit into the average value of a small image level yi , then the extracted watermark bit by (5), î , should
ˆ b
block instead of the modification of the least significant bits be identical to the original watermark bit bi . However, if the

232

Semi-automatic
Segmentation Contrast
Human Intervention Enhancement

Face

composition +

Original Image Face extractible
Image (automatic -
object-extractible
image)
Contrast
Suppression

Background

Figure 2. The overall structure of embedding watermark bit for the object and the background.

watermarked gray-level is changed due to some post pro-
cessing such as JPEG compression and noise addition, then
the extracted watermark bit ˆi does not necessarily match
b
the original watermark bit bi and the object extraction from
the background yields some segmentation errors. For exam-
ple, the watermarked image Fig. 3-(a) has undergone JPEG
compression as shown in Fig. 3-(b). As a result the water-
mark extraction by (5) yields a lot of segmentation errors
(see Fig. 3-(c)).

(a) (b)
3. Block-wise MAP QIM watermarking

As demonstrated in Fig 3, the pixel-wise watermark de-
coding used in [10] is vulnerable to post-processing applied
to watermarked images. To alleviate this problem, in this
paper, we propose to adopt a block-wise MAP (maximum a
posteriori) decoding scheme. Note that, in [12], a pixel-wise
MAP decoding for QIM watermarking was also used. This
scheme is based on a sliding window to embed the water-
mark to the local average value. It is certainly expected that
the robustness (especially against the JPEG compressions) (c) (d)
increases by embedding the watermark to the average val-
ues rather than pixel gray-levels. However, overlapping the
sliding window, the current watermarked pixel value may Figure 3. Example of segmentation errors: (a)
affect the watermarking to the next sliding window. Exper- watermarked image, (b) JPEG compressed
imental results reveal that the visual degradations occurred and decompressed image, (c) object extrac-
by modifying the current gray-level of a pixel to change the tion by (5), (d) original object.
local average brightness with the watermarked neighboring
pixel values. To overcome this problem, in this paper, the

233

watermark embedding is embedded to the average bright- watermarked and possibly modified image data y as follows
¯
ness for each non-overlapping block, spreading the modi-
fications to all pixels in the block rather than to a pixel in x∗ = ¯
argmax P (X = x|Y = y )
¯
x
a sliding window method. Also, by simultaneously mod-
= ¯
argmax P (Y = y |X = x)P (X = x).
¯ (9)
ifying gray-level in a block, the watermark decoding be- x
comes block-wise. Here, the block-wise contextual infor-
mation (i.e., the block-wise watermark smoothness condi- ¯
In (9), P (Y |X) and P (X) can be assumed to be block-wise
tion) can be modeled by a Markov random fields (MRF). independent as follows
Thus, the watermark decoding turns out to be a block-wise
MAP (2-class) image segmentation problem, which is sim- ¯
P (Y |X) = ¯
P (Ys |Xs ), (10)
ilar to [13][7]. s∈ΩB
Let us denote the set of 2-D pixel indices in N1 × N2 im-
age space as Ω = {(i, j) : 0 ≤ i ≤ N1 −1, 0 ≤ j ≤ N2 −1} and
and the non-overlapping block indices with B × B size as P (X) ≈ P (Xs |Xηs ). (11)
ΩB = {(i, j) : 0 ≤ i ≤ N1 − 1, 0 ≤ N2 − 1}. The
B B
s∈ΩB

watermark embedding is executed for each block s ∈ ΩB . Note that (11) is an approximation of the global Gibbs dis-
That is, given the original image composed with an object tribution by a product of local characteristics. Since the a
and a background, we divide the image space into B × B posteriori probability in (9) can be separable for each block
non-overlapping blocks. Then, for each block s ∈ ΩB , we as in (10) and (11), we can independently extract the water-
calculate the average brightness ys . If the majority of the
¯ mark bit x∗ for each block s by maximizing the following
s
pixels in the block s belongs to an object, then the repre- local probability
sentative watermark bit to be embedded is 1 (i.e., xs = 1).
Otherwise, we embed watermark bit 0 (i.e., xs = 0). Then, x∗ = ¯
argmax[P (Ys |Xs )P (Xs |Xηs )]
s
xs
the gray-level at (k, l), 0 ≤ k, l ≤ B − 1, in a block s,
ys (k, l), is modified as follows due to the watermark em- = ¯
argmax[lnP (Ys |Xs ) + lnP (Xs |Xηs )]. (12)
xs
bedding
ys (k, j) + Q1 (¯s ) − ys , if xs = 1
y ¯ Then, by plugging (7) and (8) into (12) and taking a loga-
ys (k, l) =
ˆ (6) rithm ln, we can determine the block-wise MAP watermark
ys (k, l) + Q0 (¯s ) − ys , if xs = 0,
y ¯
decoding.
where ys (k, l) is the watermarked gray-level at (k, l) in the
ˆ Once the watermark bit for each block is obtained, the
block s ∈ ΩB . blocks in the boundary between the object and the back-
On each block s, the watermark bit xs ∈ {0, 1} is as- ground are subdivided into four smaller blocks. Then, the
sumed to be a realization of a random variable Xs , where watermark bit of each subdivided image block should be
xs = 1 for the object and xs = 0 for the background. determined. For example, the watermark bit of subdivided
Denoting the set of random variable (i.e., a random field) boundary block comes from one of its non-boundary neigh-
X = {Xs : s ∈ ΩB }, we assume that X is a Gibbs random boring blocks with the smallest average gray-level differ-
field with the conditional probability as ence. This subdivision and watermark determination con-
1 tinues until we have the pixel-level watermark extraction.
P (Xs = xs |xηs ) = exp{Vc (xs )}, (7) Note that this process is similar to the watershed method
Z
used in [13][7].
where Vc (xs ) is the clique potential of a clique c on the
neighborhood system ηs . On top of the random field X, we
¯ ¯
have the random field Y = {Ys : s ∈ ΩB }, where Ys = ¯ 4. Experiments
ys ∈ {0, 1, · · · , 255} is a random variable in a block s. The
¯
¯
realization of the random variable Ys , i.e., ys , represents
¯ The quantization step ∆ in (3) and (4) causes a trade-off
the average brightness of the pixels in the block s and is between the strength of the watermarking and the degrada-
assumed to be Gaussian distributed with mean Q0 (¯s ) for
y tion of the image quality. (∆ = 10 is used in this paper).
watermark 0 and Q1 (¯s ) for watermark 1 and variance σ 2 .
y For the clique potential Vc (xs ) in (7), a pair clique for the
Thus, we have second-order neighborhood system is used with the clique
1 potential as
P (¯s |xs ) = √
y exp{−(¯s − Qxs (¯s ))2 /2σ 2 }. (8)
y y 
2πσ 2
 β, if all watermarks in pair pixels
Defining all necessary stochastic models, our block- Vc (xs ) = of s are same (13)
based MAP watermark decoding is to extract x∗ given the 
−β, otherwise

234

where we set β = 1. Also, we set σ = 1 in (8). x∗ in s [3] D.J. Chaplin, Chroma key method and apparatus, US
(9) is updated iteratively with less than 5 iterations. At the Patent 5,249,039 1995.
¯
first iteration, only P (Ys |Xs ) in (12) without P (Xs |Xηs ) is
considered. [4] R.J. Qian and M.I. Sezan, Video background replace-
The proposed watermark encoding and decoding method ment without a bliue screen, IEEE ICIP, 1999.
resists against some post image processing attacks. For [5] M. Fiala and C. Shu, Background subtraction using
example, in Fig 4, the original image Fig 4-(a) is water- self-identifying patterns, IEEE CRV, 2005.
marked and JPEG compressed as in Fig 4-(b). Then, it is
decoded by the previous method [10] as in Fig 4-(c) and [6] Y-Y Chuang, B. Curless, D.H. Salesin, and R.
is decoded by the proposed block-wise MAP decoding (see Szeliski, A bayesian approach to digital matting, IEEE
Fig 4-(d)), demonstrating the robust of the proposed method CVPR, 2001.
to the JPEG compression. Also, as shown in Fig 5, the pro-
posed method is also separates the background from the ob- [7] C.S. Won, K. Pyun, and R. M. Gray, Automatic object
ject even if the watermarked image is cropped. segmentation in images with low depth of field, IEEE
Proc. Of Image Processing (ICIP), III, 805-808, 2002.
The proposed method needs to determine parameter val-
ues (i.e., ∆, B, β, and σ) a priori. The parameter values [8] C. Gu and M.-C. Lee, Semiautomatic segmentation
chosen in this paper are not exhaustively searched. Thus, and tracking of semantic video objects, IEEE Tr. On
the chosen parameter values may not be optimal. For ex- Circ. and Sys. for video Tech., 8(5), 572-584, 1998.
ample, the block size B is set to be 4 for Fig 4 and Fig 5.
However, as shown in Fig 6-(c) and (d), the segmentation [9] Y. Gaobo and Y. Shengfa, Modified intelligent scissors
result yields more errors with B = 4 than B = 8 for the and adaptive frame skipping for video object segmen-
added noise. Analyzing the sensitivity of those parameter tation, Real-time Imaging, 11, 310-322, 2005.
values to the segmentation result and determining the opti-
[10] C.S. Won, On generating automatic-object-extractible
mal parameter values remain as a future work.
images, Proc of SPIE, vol 6764, 2007.

5. Conclusion [11] B. Chen and G.W. Wornell, Quantization index mod-
ulation: A class of probably good methods for digital
watermarking and information embedding, IEEE Tr.
To embed robust watermark against image processing On Information Theory, 47, 1423-1443, 2001.
techniques such as JPEG compressions, we embed the wa-
termark to the average brightness value in an image block. [12] W. Lu, W. Li, R. Safavi-Naimi, and P. Ogubona, A
Then, the embedded watermark can be efficiently extracted pixel-based robust image watermarking system, IEEE
by adopting a block-wise MAP segmentation framework. ICME, 2006.
We proposed the block-wise MAP QIM decoding scheme
in this paper and demonstrated its successful background [13] C.S.Won, A block-wise MAP segmentation for im-
separation from object even the watermarked image goes age compression, IEEE Tr. on Circuits and Systems
through post-processing attacks including JPEG compres- for Video Technology, vol. 8, no. 5, pp.592-601, 1998.
sions.

6. Acknowledgement

This work was supported by Seoul R&BD Program
(SFCC).

References

[1] D. Roobaert, M. Zillich, and J.-O. Eklundh, A
pure learning approach to background-invariant ob-
ject recognition using pedagogical support vector ma-
chine, Proc of CVPR, II, 351-357, 2001.

[2] http://www.vision.caltech.edu/html-files/archive.html

235

(a) (b) (c) (d)

Figure 4. Results with JPEG compressions: (a) Original image, (b) Watermark encoded and JPEG
compression applied (compression scale 10 in Photoshop), (c) Object extraction by the previous
method [10], (d) Object extraction by the proposed method.

(a) (b) (c) (d)

Figure 5. Results with image cropping: (a) Original image, (b) Watermark encoded and cropped
image, (c) Object extraction by the previous method [10], (d) Object extraction by the proposed
method.

(a) (b) (c) (d)

Figure 6. Comparison with different block sizes B: (a) Original image, (b) Watermark encoded and
uniform noise added by 2% from the Photoshop, (c) Object extraction by B = 4, (d) Object extraction
by B = 8.

236

image segmentation

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to image segmentation

Similar to image segmentation (20)

Recently uploaded

Recently uploaded (20)

image segmentation