Iaetsd degraded document image enhancing in

Degraded Document Image Enhancing in
Spatial Domain using Adaptive Contrasting and
Thresholding
Dr. P V Ramaraju, Professor
Department of ECE,
SRKR Engineering College,
Bhimavaram, India.
pvrraju50@gmail.com
G.Nagaraju, Asst.Professor,
Department of ECE,
Bhimavaram, India.
bhanu.raj.nikhil@gmail.com
V.Rajasekhar, M.Tech Student
Department of ECE,
Bhimavaram, India.
vrm.rajasekhar@gmail.com
Abstract: This paper presents a new adaptive approach
for the binarization and enhancement of degraded
document images. The proposed method can deal with
degradations which occur due to shadows, non-uniform
illumination, low contrast, ink bleeding-through, smear
and strain. We follow several distinct steps in the
proposed technique; an adaptive contrast map is first
constructed for an input degraded document image.
The contrast map is then binarized by using local
threshold that is estimated based on the intensities of
detected text stroke edge pixels within a local window.
Some post-processing is further applied to improve the
document binarization quality. The proposed method is
simple, robust, and involves minimum parameter
tuning.
Keywords--Adaptive contrast map, document image
enhancing, Adaptive thresholding, degraded document
image binarization, pixel classification.
I. INTRODUCTION
Robust binarization gives the possibility of a
correct extraction of the sketched line drawing or text
from its background. For the binarization of images
many algorithms have been implemented.
Thresholding is a sufficiently accurate and high
processing speed segmentation approach to
monochrome image. This paper describes a modified
logical thresholding method for binarization of
seriously degraded and very poor quality gray-scale
document images.
In general there are two types of image
thresholding techniques available: global and local.
In the global thresholding technique a gray level
image is converted into a binary image based on an
image intensity value called global threshold which is
fixed in the whole image domain whereas in local
thresholding technique, threshold value can vary
from one pixel location to next. Thus, global
thresholding converts an input image I to a binary
image G as follows G(i, j) = 1 for I (i, j) ≥ T, or,
G(i, j) = 0 for I (i, j) < T, where T is the threshold,
G (i, j) = 1 for foreground and G (i, j) = 0 for
background.
Whereas, for a local threshold, the threshold
T is a function over the image domain, i.e.,T= T(x, y).
In addition, if in constructing the threshold
value/surface the algorithm adapts itself to the image
intensity values, then it is called dynamic or adaptive
threshold.
In a general setting, thresholding can be
expressed as a test operation that tests against a
function T of the form [1]: T = T[x , y, h, I ] ,
where, I is the input image and h denotes some local
property of this point– for example, the average gray
level of a neighborhood centered on (x, y).
Threshold selection depends on the
information available in the gray level histogram of
the image. We know that an image function I(x, y)
can be expressed as the product of a reflectance
function and an illumination function based on a
simple image formation model. If the illumination
component of the image is uniform then the gray
level histogram of the image is clearly bimodal,
because the gray levels of object pixels are
significantly different from the gray levels of the
background. It indicates that one mode is populated
from object pixels and the other mode is populated
from background pixels.
Then objects could be easily partitioned by
placing a single global threshold at the neck or valley
at the histogram. However, in reality bimodality in
histograms does not occur very often. Consequently,
a fixed threshold level based on the information of
the gray level histogram will fail totally to separate
objects from the background. In this scenario we turn
our attention to adaptive local threshold surface
where threshold value changes over the image
domain to fit the spatially changing background and
lighting conditions.
Over the years many threshold selection
methods have been proposed. Otsu has suggested a
global image thresholding technique where the
INTERNATIONAL CONFERENCE ON CIVIL AND MECHANICAL ENGINEERING, ICCME-2014
INTERNATIONAL ASSOCIATION OF ENGINEERING & TECHNOLOGY FOR SKILL DEVELOPMENT www.iaetsd.in
17
ISBN:378-26-138420-0223

optimal global threshold value is ascertained by
maximizing the between-class variance with an
exhaustive search [2]. Sahoo et al. [3] claim that
Otsu’s method is suitable for real world applications
with regard to uniformity and shape measures.
Though Otsu’s method is one of the most popular
methods for global thresholding, it does not work
well for many real world images where a significant
overlap exists in the gray level histogram between the
pixel intensity values of the objects and the
background due to un-even and poor illumination.
As many degraded documents do not have a
clear bimodal pattern, global thresholding [4]–[7] is
usually not a suitable approach for the degraded
document binarization. Adaptive thresholding [8]–
[14], which estimates a local threshold for each
document image pixel, is often a better approach to
deal with different variations within degraded
document images. For example, the early window-
based adaptive thresholding techniques [12], [13]
estimate the local threshold by using the mean and
the standard variation of image pixels within a local
neighborhood window.
The local image contrast and the local image gradient
are very useful features for segmenting the text from
the document background because the document text
usually has certain image contrast to the neighboring
document background.
The image gradient is defined as follows
G(x,y)=fmax(x, y) − fmin(x,y ) 1
The Local contrast is defined as follows
max min
max min
( , ) ( , )
( , ) 2
( , ) ( , )
x y x y
D x y
x y x y
f f
f f ε
−
= →
+ +
Where ε is a positive but infinitely small number that
is added in case the local maximum is equal to 0.
II. PROPOSED METHOD
This section describes the proposed
document image binarization techniques. Given a
degraded document image, an adaptive contrast map
is first constructed. The text is then segmented based
on the local threshold that is estimated from the
detected text stroke edge pixels. Some post
processing is further applied to improve the
document binarization quality.
A. Contrast Image Construction
The image contrast in Equation 2 has one
typical limitation that it may not handle document
images with the bright text properly. This is because
a weak contrast will be calculated for stroke edges of
the bright text where the denominator in Equation 2
will be large but the numerator will be small. To
overcome this over-normalization problem, we
combine the local image contrast with the local
image gradient and derive an adaptive local image
contrast as follows
max min( , ) ( , ) (1 )( ( , ) ( , )) 3aD x y D x y f x y f x yα α= + − − →
Where D(x, y) denotes the local contrast in
Equation 2 and (fmax(x, y) − fmin(x,y )) refers to the
local image gradient that is normalized to [0, 1]. The
local windows size is set to 3 empirically. α is the
weight between local contrast and local gradient that
is controlled based on the document image statistical
information. Ideally, the image contrast will be
assigned with a high weight (i.e. large α) when the
document image has significant intensity variation.
So that the proposed binarization technique depends
more on the local image contrast that can capture the
intensity variation well and hence produce good
results. Otherwise, the local image gradient will be
assigned with a high weight.
We model the mapping from document image
intensity variation to α by a power function as
follows
α= (Std/128)γ
4
Where Std denotes the document image
intensity standard deviation, and γ is a pre-defined
parameter. The power function has a nice property in
that it monotonically and smoothly increases from 0
to 1 and its shape can be easily controlled by
different γ .γ can be selected from [0,∞], where the
power function becomes a linear function when γ
= 1. Therefore, the local image gradient will play the
major role in Equation 3 when γ is large and the local
image contrast will play the major role when γ is
small. The setting of parameter γ will be discussed in
the section of parameter selection.
Fig. 1
18
ISBN:378-26-138420-0224

Fig. 2
(a)
(b)
(c)
Fig. 3. Contrast Images constructed using (a) local image gradient,
(b) local image contrast [15], and (c) our proposed method for the
original sample document images which are shown in Fig. 1 and 2,
respectively.
Fig. 3 shows the contrast map of the sample
document images in Fig. 1 and 2 that are created by
using local image gradient, local image contrast [15]
and our proposed method in Equation 3, respectively.
B. Local Threshold Estimation
The text can then be extracted from the
document background pixels once the high contrast
stroke edge pixels are detected properly. Two
characteristics can be observed from different kinds
of document images [15]: First, the text pixels are
close to the detected text stroke edge pixels. Second,
there is a distinct intensity difference between the
high contrast stroke edge pixels and the surrounding
background pixels. The document image text can
thus be extracted based on the detected text stroke
edge pixels as follows
{1.. min&& ( , ) /2
0..( , ) 5Ne N I x y Emean Estd
otherwiseR x y ≥ ≤ +
= →
Where Emean and Estd are the mean and the
standard deviation of the image intensity of the
detected high contrast image pixels (within the
original document image) within the neighborhood
window that can be evaluated as follows
mean
( , )*(1 ( , ))
E 6
neighbor
I x y E x y
Ne
−
= →
∑
2
(( ( , ) )*(1 ( , )))
E 7
2
neighbor
std
I x y Emean E x y− −
= →
∑
The size of the neighborhood window W can
be set based on the stroke width of the document
image under study.
C. Post-Processing
Once the initial binarization result is derived
from Equation 5 as described in previous subsections,
the binarization result can be further improved by
incorporating certain domain knowledge as described
in Algorithm 1. First, the isolated foreground pixels
that do not connect with other foreground pixels are
filtered out to make the edge pixel set precisely.
Second, the neighborhood pixel pair that lies on
symmetric sides of a text stroke edge pixel should
belong to different classes (i.e., either the document
background or the foreground text). One pixel of the
pixel pair is therefore labeled to the other category if
both of the two pixels belong to the same class.
Finally, some single-pixel artifacts along the text
stroke boundaries are filtered out by using several
logical operators as described in[16].
Algorithm 1 Post-Processing Procedure
Require: The Input grayscale Document Image ‘I’,
Initial Binary Result ‘B’ and Corresponding Binary
Text Stroke Edge Image ‘Edge’
Ensure: The Final Binary Result ‘Bf’
1: Obtain the connect components of the stroke edge
pixels in ‘Edge’.
2: Take out those pixels that do not connect with
other pixels.
3: For removing isolated pixels, we need to check
connectivity.
4: for Each remaining edge pixels (i, j ): do
5: Get its neighborhood pairs:
(i − 1, j) & (i + 1, j); (i, j − 1) &(i, j + 1)
6: if The pixels in the pairs belong to the same class
(both text or background) then
7: Classify the foreground and background pixels
based on pixel values.
8: end if
9: end for
10: Remove single-pixel artifacts [16] along the text
stroke boundaries after the document thresholding.
11: Store the new binary result to ‘Bf ‘.
19
ISBN:378-26-138420-0225

D. Parameter Selection
In the first experiment, we apply different γ
to obtain different power functions and test their
performance. α is close to 1 when γ is small and the
local image contrast Da dominates the adaptive image
contrast Da in Equation 3. On the other hand, Da is
mainly influenced by local image gradient when γ is
large. At the same time, the variation of α for
different document images increases when γ is close
to 1. Under such circumstance, the power function
becomes more sensitive to the global image intensity
variation and appropriate weights can be assigned to
images with different characteristics.
The proposed method can assign more suitable α to
different images when γ is closer to 1. Parameter γ
should therefore be set around 1 when the
adaptability of the proposed technique is maximized
and better and more robust binarization results can be
derived from different kinds of degraded document
images.
III. RESULTS
This section evaluates the results for
proposed document image binarization techniques.
Given a degraded document image, an adaptive
contrast map is first constructed. The text is then
segmented based on the local threshold that is
estimated from the detected text stroke edge pixels.
Some post-processing is further applied to improve
the document binarization quality.
Example 1
Fig.4 input degraded document image having ink bleeding through
effect
Fig.5 Contrast image constructed based on proposed adaptive local
contrast map
Fig.6 Binarized resultant image constructed based on proposed
local thresholding and post processing.
Example 2
Fig.7 input degraded document image having ink bleeding through
effect
Fig.8 Contrast image constructed based on proposed adaptive local
contrast map
Fig.9 Binarized resultant image constructed based on proposed
local thresholding and post processing.
IV. COMPARISON OF DIFFERENT
BINARIZATION METHODS
In this experiment, we quantitatively
compare our proposed method with Otsu’s method
(OTSU) [2], Sauvola’s method (SAUV) [12],
Niblack’s method (NIBL) [13], Bernsen’s method
(BERN) [8], Gatos et al.’s method (GATO) [17], and
LMM method (LMM [15], BE [16]). These are
composed of the same series of document images that
suffer from several common document degradations
20
ISBN:378-26-138420-0226

such as smear, smudge, bleed-through and low
contrast.
Example 1
Fig. 10. Binarization results of the sample document image in Fig.
1(b) produced by different methods. (a) OTSU [2]. (b) SAUV [12].
(c) NIBL [13]. (d) BERN [8]. (e) GATO [17]. (f) LMM [15]. (g)
BE [16]. (h) Proposed.
Example 2
Fig. 11. Binarization results of the sample document image (PR
06) in DIBCO 2011 dataset produced by different methods. (a)
Input Image. (b) OTSU [2]. (c) SAUV [12]. (d) NIBL [13]. (e)
BERN [8]. (f) GATO [17]. (g) LMM [15]. (h) BE [16]. (i) LELO
[18]. (j) SNUS. (k) HOWE [19]. (l) Proposed.
V. CONCLUSION
This paper presents a simple and robust
method of enhancing degraded document images.
The method proposed in this paper constitutes
binarization that is tolerant to different types of
document degradation such as non-uniform
illumination, ink bleeding through and document
smear. This image binarization is based on local
thresholding along with adaptive contrast mapping.
The proposed method has been tested over various
noise affected document images and is binarized
efficaciously. But we observed that the performance
on Bickley diary dataset needs to be improved, we
will explore it in future.
REFERENCES
[1] R. C. Gonzalez and R. E. Woods, Digital Image
Processing, Pearson prentice Hall, 2005.
[2] N. Otsu, “A threshold selection method from
gray– level histogram,” IEEE Transactions on System
Man Cybernatics, Vol. SMC-9, No.1, pp. 62-66,
1979.
[3] P.K. Sahoo, S. Soltani, A.K.C. Wong, and Y.
Chen, “A survey of thresholding techniques,”
Computer Vision Graphics Image Processing, Vol.
41, 1988, pp. 233 – 260.
[4] A. Brink, “Thresholding of digital images using
two-dimensional entropies,” Pattern Recognit., vol.
25, no. 8, pp. 803–808, 1992.
[5] J. Kittler and J. Illingworth, “On threshold
selection using clustering criteria,” IEEE Trans. Syst.,
Man, Cybern., vol. 15, no. 5, pp. 652–655, Sep.–Oct.
1985.
[6] N. Otsu, “A threshold selection method from gray
level histogram,” IEEE Trans. Syst., Man, Cybern.,
vol. 19, no. 1, pp. 62–66, Jan. 1979.
[7] N. Papamarkos and B. Gatos, “A new approach
for multithreshold selection,” Comput. Vis. Graph.
Image Process., vol. 56, no. 5, pp. 357–370, 1994.
[8] J. Bernsen, “Dynamic thresholding of gray-level
images,” in Proc. Int. Conf. Pattern Recognit., Oct.
1986, pp. 1251–1255.
[9] L. Eikvil, T. Taxt, and K. Moen, “A fast adaptive
method for binarization of document images,” in
Proc. Int. Conf. Document Anal. Recognit., Sep.
1991, pp. 435–443.
[10] I.-K. Kim, D.-W. Jung, and R.-H. Park,
“Document image binarization based on topographic
analysis using a water flow model,” Pattern
Recognit., vol. 35, no. 1, pp. 265–277, 2002.
[11] J. Parker, C. Jennings, and A. Salkauskas,
“Thresholding using an illumination model,” in Proc.
Int. Conf. Doc. Anal. Recognit., Oct. 1993, pp. 270–
273.
21
ISBN:378-26-138420-0227

[12] J. Sauvola and M. Pietikainen, “Adaptive
document image binarization,” Pattern Recognit.,
vol. 33, no. 2, pp. 225–236, 2000.
[13] W. Niblack, An Introduction to Digital Image
Processing. Englewood Cliffs, NJ: Prentice-Hall,
1986.
[14] J.-D. Yang, Y.-S. Chen, and W.-H. Hsu,
“Adaptive thresholding algorithm and its hardware
implementation,” Pattern Recognit. Lett., vol. 15, no.
2, pp. 141–150, 1994.
[15] B. Su, S. Lu, and C. L. Tan, “Binarization of
historical handwritten document images using local
maximum and minimum filter,” in Proc. Int.
Workshop Document Anal. Syst., Jun. 2010, pp. 159–
166.
[16] S. Lu, B. Su, and C. L. Tan, “Document image
binarization using background estimation and stroke
edges,” Int. J. Document Anal. Recognit., vol. 13, no.
4, pp. 303–314, Dec. 2010.
[17] B. Gatos, I. Pratikakis, and S. Perantonis,
“Adaptive degraded document image binarization,”
Pattern Recognit., vol. 39, no. 3, pp. 317–327, 2006.
[18] T. Lelore and F. Bouchara, “Super resolved
binarization of text based on the fair algorithm,” in
Proc. Int. Conf. Document Anal. Recognit., Sep.
2011, pp. 839–843.
[19] N. Howe, “A Laplacian energy for document
binarization,” in Proc. Int. Conf. Doc. Anal.
Recognit., Sep. 2011, pp. 6–10.
Dr.P.V.RamaRaju
working as a Professor at
the Department of
Electronics and
Communication
Engineering, S.R.K.R.
Engineering College,
AP, India.
His research interests include Biomedical-Signal
Processing, Signal Processing, VLSI Design and
Microwave Anechoic Chambers Design. He is author
of several research studies published in national and
international journals and conference proceedings.
Nagaraju.G
presently working as
Assistant Professor at
the Department of
Electronics and
Communication
Engineering, S.R.K.R.
Engineering College,
Bhimavaram, AP,
He received the B.Tech degree from S.R.K.R.
Engineering College, Bhimavaram in 2002, and
M. Tech degree in Computer electronics
specialization from Govt. College of engg, Pune
university in 2004. His current research interests
include Image processing, digital security
systems, Biomedical-Signal Processing, Signal
Processing, and VLSI Design.
V.RAJASEKHAR
received the B-tech
degree in Electronics
and communication
engineering from Sri
Vasavi Engineering
college,Tadepalligudem
, A.P, India, in 2011.
He is currently pursuing the M.Tech degree in
Communication Systems from S.R.K.R.
Engineering college, Bhimavaram .
22
ISBN:378-26-138420-0228

Iaetsd degraded document image enhancing in

More Related Content

What's hot

Viewers also liked

Similar to Iaetsd degraded document image enhancing in

More from Iaetsd Iaetsd

Recently uploaded

Iaetsd degraded document image enhancing in