Degraded Document Image Enhancing in
Spatial Domain using Adaptive Contrasting and
Thresholding
Dr. P V Ramaraju, Professor
Department of ECE,
SRKR Engineering College,
Bhimavaram, India.
pvrraju50@gmail.com
G.Nagaraju, Asst.Professor,
Department of ECE,
SRKR Engineering College,
Bhimavaram, India.
bhanu.raj.nikhil@gmail.com
V.Rajasekhar, M.Tech Student
Department of ECE,
SRKR Engineering College,
Bhimavaram, India.
vrm.rajasekhar@gmail.com
Abstract: This paper presents a new adaptive approach for the binarization and enhancement of degraded document images. The proposed method handles degradations caused by shadows, non-uniform illumination, low contrast, ink bleed-through, smear and stain. The technique proceeds in several steps. An adaptive contrast map is first constructed for the input degraded document image. The contrast map is then binarized using a local threshold that is estimated from the intensities of detected text stroke edge pixels within a local window. Post-processing is then applied to improve the binarization quality. The proposed method is simple, robust, and requires minimal parameter tuning.
Keywords--Adaptive contrast map, document image
enhancing, Adaptive thresholding, degraded document
image binarization, pixel classification.
I. INTRODUCTION
Robust binarization makes it possible to correctly extract a sketched line drawing or text from its background. Many algorithms have been implemented for image binarization. Thresholding is a fast and sufficiently accurate segmentation approach for monochrome images. This paper describes a modified logical thresholding method for the binarization of seriously degraded, very poor quality gray-scale document images.
In general, there are two types of image thresholding techniques: global and local. In global thresholding, a gray-level image is converted into a binary image using a single intensity value, the global threshold, which is fixed over the whole image domain. In local thresholding, the threshold value can vary from one pixel location to the next. Global thresholding thus converts an input image I to a binary image G as follows: G(i, j) = 1 for I(i, j) ≥ T, and G(i, j) = 0 for I(i, j) < T, where T is the threshold, G(i, j) = 1 denotes foreground and G(i, j) = 0 denotes background.
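The global rule above can be sketched in a few lines of Python. This is only an illustration: the function name `global_threshold`, the sample image and the threshold value 128 are assumptions made here, not values taken from the paper.

```python
def global_threshold(image, t):
    """Return G with G[i][j] = 1 where image[i][j] >= t, and 0 elsewhere."""
    return [[1 if pixel >= t else 0 for pixel in row] for row in image]


# Hypothetical 3x3 gray-level image (values in 0..255).
image = [
    [200, 190, 30],
    [210, 40, 35],
    [220, 215, 205],
]

# One fixed threshold T is applied to every pixel.
binary = global_threshold(image, 128)
```

Because T is the same everywhere, the result depends only on each pixel's own intensity, which is why the rule breaks down under uneven illumination.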
For a local threshold, by contrast, T is a function over the image domain, i.e., T = T(x, y). In addition, if the algorithm adapts itself to the image intensity values when constructing the threshold value or surface, the threshold is called dynamic or adaptive.
In a general setting, thresholding can be expressed as a test against a function T of the form [1]: T = T[x, y, h, I], where I is the input image and h denotes some local property of the point (x, y), for example the average gray level of a neighborhood centered on (x, y).
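As a minimal sketch of such a test, the local property h can be taken to be the average gray level of a small window, and each pixel compared against it. The helper names `neighborhood` and `local_mean_threshold`, the window radius and the sample row are assumptions chosen for illustration; this is not the method proposed in this paper.

```python
def neighborhood(image, i, j, radius):
    """Pixels in the (2*radius+1)^2 window centred on (i, j), clipped at the borders."""
    rows, cols = len(image), len(image[0])
    return [
        image[r][c]
        for r in range(max(0, i - radius), min(rows, i + radius + 1))
        for c in range(max(0, j - radius), min(cols, j + radius + 1))
    ]


def local_mean_threshold(image, radius=1):
    """Binarize with T(x, y) = h(x, y), the local average gray level."""
    result = []
    for i, row in enumerate(image):
        out_row = []
        for j, pixel in enumerate(row):
            window = neighborhood(image, i, j, radius)
            h = sum(window) / len(window)
            out_row.append(1 if pixel >= h else 0)
        result.append(out_row)
    return result


# Hypothetical row: background brightens from left to right, with two
# darker "stroke" pixels at columns 1 and 4.
row_image = [[100, 80, 140, 160, 140, 200]]
local_result = local_mean_threshold(row_image)
```

In this toy row a single threshold of 128 would label column 4 (value 140) like the bright background, whereas the local test marks both darker pixels with 0.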
Threshold selection depends on the information available in the gray-level histogram of the image. Under a simple image formation model, an image function I(x, y) can be expressed as the product of a reflectance function and an illumination function. If the illumination component is uniform and the gray levels of the object pixels differ significantly from those of the background, the gray-level histogram is clearly bimodal: one mode is populated by object pixels and the other by background pixels.
Objects can then be easily partitioned by placing a single global threshold at the neck or valley of the histogram. In reality, however, bimodal histograms do not occur very often. A fixed threshold based on the gray-level histogram then fails to separate objects from the background. In this scenario we turn our attention to an adaptive local threshold surface, where the threshold value changes over the image domain to fit the spatially changing background and lighting conditions.
Over the years many threshold selection
methods have been proposed. Otsu has suggested a
global image thresholding technique where the
INTERNATIONAL CONFERENCE ON CIVIL AND MECHANICAL ENGINEERING, ICCME-2014
INTERNATIONAL ASSOCIATION OF ENGINEERING & TECHNOLOGY FOR SKILL DEVELOPMENT www.iaetsd.in
17
ISBN:378-26-138420-0223
optimal global threshold value is found by maximizing the between-class variance with an exhaustive search [2]. Sahoo et al. [3] claim that Otsu’s method is suitable for real-world applications with regard to uniformity and shape measures. Although Otsu’s method is one of the most popular methods for global thresholding, it does not work well for many real-world images in which uneven or poor illumination causes a significant overlap in the gray-level histogram between the pixel intensities of the objects and those of the background.
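The exhaustive search over between-class variance can be sketched as follows. This is a simplified illustration of the idea described above, assuming integer gray levels in 0..255 and the convention that pixels below the threshold form one class; the function name `otsu_threshold` and the sample data are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Class 0 holds pixels with value < t, class 1 holds pixels with value >= t.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    count0 = 0  # number of pixels in class 0
    sum0 = 0    # sum of intensities in class 0
    for t in range(1, levels):
        count0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, variance is undefined
        mu0 = sum0 / count0
        mu1 = (total_sum - sum0) / count1
        # Proportional to the between-class variance w0 * w1 * (mu0 - mu1)^2.
        score = count0 * count1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Hypothetical pixel values forming two well-separated clusters.
sample = [10, 12, 11, 200, 205, 198]
t_otsu = otsu_threshold(sample)
```

For well-separated clusters like these, the selected threshold falls in the gap between them. When the two intensity populations overlap, no single value of t separates them, which is the failure case described above.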
As many degraded documents do not have a clear bimodal pattern, global thresholding [4]–[7] is usually not a suitable approach for degraded document binarization. Adaptive thresholding [8]–[14], which estimates a local threshold for each document image pixel, is often a better approach for dealing with the variations within degraded document images. For example, the early window-based adaptive thresholding techniques [12], [13] estimate the local threshold from the mean and the standard deviation of the image pixels within a local neighborhood window.
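As a rough illustration of these window-based rules, the sketch below uses the commonly cited forms of Niblack's threshold, T = m + k·s, and Sauvola's threshold, T = m·(1 + k·(s/R − 1)), where m and s are the window mean and standard deviation. The default values k = −0.2, k = 0.5 and R = 128 are typical choices from the literature, not settings reported in this paper, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def niblack_threshold(window, k=-0.2):
    """Niblack-style local threshold: mean plus k times the standard deviation."""
    return mean(window) + k * pstdev(window)


def sauvola_threshold(window, k=0.5, r=128.0):
    """Sauvola-style local threshold; r is the assumed dynamic range of the deviation."""
    m = mean(window)
    s = pstdev(window)
    return m * (1 + k * (s / r - 1))


# Hypothetical 3x3 window flattened to a list: bright paper with two dark pixels.
window = [200, 210, 190, 60, 205, 195, 50, 198, 202]
t_niblack = niblack_threshold(window)
t_sauvola = sauvola_threshold(window)
```

In a flat window (s = 0) the Niblack threshold equals the mean, while the Sauvola threshold drops to m·(1 − k), which makes it less likely to mark pixels of a uniform background as text.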
The local image contrast and the local image gradient are very useful features for segmenting the text from the document background, because the document text usually has a certain contrast with the neighboring document background.
The image gradient is defined as follows
$$G(x, y) = f_{max}(x, y) - f_{min}(x, y) \tag{1}$$
The local contrast is defined as follows
$$D(x, y) = \frac{f_{max}(x, y) - f_{min}(x, y)}{f_{max}(x, y) + f_{min}(x, y) + \epsilon} \tag{2}$$
where f_max(x, y) and f_min(x, y) denote the maximum and minimum intensities within a local neighborhood window of (x, y), and ε is a positive but very small number that is added in case the local maximum is equal to 0.
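Equations 1 and 2 can be sketched for a single pixel as follows. The window radius of 1 (a 3×3 window), the value chosen for ε and the helper names are assumptions made for this illustration.

```python
EPSILON = 1e-9  # small positive constant guarding against a zero denominator


def local_extrema(image, i, j, radius=1):
    """Return (f_max, f_min) over the window centred on (i, j), clipped at the borders."""
    rows, cols = len(image), len(image[0])
    window = [
        image[r][c]
        for r in range(max(0, i - radius), min(rows, i + radius + 1))
        for c in range(max(0, j - radius), min(cols, j + radius + 1))
    ]
    return max(window), min(window)


def local_gradient(image, i, j, radius=1):
    """Equation 1: G = f_max - f_min."""
    f_max, f_min = local_extrema(image, i, j, radius)
    return f_max - f_min


def local_contrast(image, i, j, radius=1):
    """Equation 2: D = (f_max - f_min) / (f_max + f_min + epsilon)."""
    f_max, f_min = local_extrema(image, i, j, radius)
    return (f_max - f_min) / (f_max + f_min + EPSILON)


# Hypothetical patch: one dark text pixel on bright paper.
patch = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
g_center = local_gradient(patch, 1, 1)
d_center = local_contrast(patch, 1, 1)
```

For this patch the gradient is 150 and the contrast is close to 150/250 = 0.6. On an all-black patch both extrema are 0, and ε keeps the division defined.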
II. PROPOSED METHOD
This section describes the proposed document image binarization technique. Given a degraded document image, an adaptive contrast map is first constructed. The text is then segmented using a local threshold that is estimated from the detected text stroke edge pixels. Post-processing is then applied to improve the binarization quality.
A. Contrast Image Construction
The image contrast in Equation 2 has one typical limitation: it may not handle document images with bright text properly. A weak contrast is calculated for the stroke edges of bright text, because the denominator in Equation 2 is large while the numerator is small. To overcome this over-normalization problem, we combine the local image contrast with the local image gradient and derive an adaptive local image contrast as follows
$$D_a(x, y) = \alpha D(x, y) + (1 - \alpha)\,(f_{max}(x, y) - f_{min}(x, y)) \tag{3}$$
where D(x, y) denotes the local contrast in Equation 2 and (f_max(x, y) − f_min(x, y)) refers to the local image gradient, normalized to [0, 1]. The local window size is set to 3 empirically. α is the weight between the local contrast and the local gradient, and it is controlled by the statistical information of the document image. Ideally, the image contrast is assigned a high weight (i.e., large α) when the document image has significant intensity variation, so that the proposed binarization technique depends more on the local image contrast, which captures the intensity variation well and hence produces good results. Otherwise, the local image gradient is assigned a high weight.
We model the mapping from the document image intensity variation to α by a power function as follows
$$\alpha = \left(\frac{\mathrm{Std}}{128}\right)^{\gamma} \tag{4}$$
where Std denotes the standard deviation of the document image intensity and γ is a pre-defined parameter. The power function has the nice property that it increases monotonically and smoothly from 0 to 1, and its shape can be easily controlled through γ. γ can be selected from [0, ∞), and the power function becomes linear when γ = 1. The local image gradient therefore plays the major role in Equation 3 when γ is large, and the local image contrast plays the major role when γ is small. The setting of the parameter γ is discussed in the Parameter Selection subsection.
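Equations 3 and 4 can be combined in a short sketch. The paper states that the gradient is normalized to [0, 1] but not how; dividing by 255 (8-bit images) is an assumption made here, as are the function names, the value of ε and the sample intensities.

```python
from statistics import pstdev


def alpha_weight(pixels, gamma=1.0):
    """Equation 4: alpha = (Std / 128) ** gamma, with Std taken over all image pixels."""
    return (pstdev(pixels) / 128.0) ** gamma


def adaptive_contrast(f_max, f_min, alpha, max_value=255.0, eps=1e-9):
    """Equation 3: blend of local contrast and normalized local gradient."""
    contrast = (f_max - f_min) / (f_max + f_min + eps)  # Equation 2
    gradient = (f_max - f_min) / max_value              # assumed normalization to [0, 1]
    return alpha * contrast + (1 - alpha) * gradient


# Hypothetical bright-text stroke edge: both extrema are large.
bright_contrast_only = adaptive_contrast(250, 200, alpha=1.0)  # D alone
bright_gradient_only = adaptive_contrast(250, 200, alpha=0.0)  # gradient alone
bright_blend = adaptive_contrast(250, 200, alpha=0.5)
```

For the bright-text edge, the contrast term alone gives about 0.11 while the normalized gradient gives about 0.20, which illustrates the over-normalization that the blend is meant to offset.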
Fig. 1
Fig. 2
(a)
(b)
(c)
Fig. 3. Contrast Images constructed using (a) local image gradient,
(b) local image contrast [15], and (c) our proposed method for the
original sample document images which are shown in Fig. 1 and 2,
respectively.
Fig. 3 shows the contrast maps of the sample document images in Fig. 1 and 2, created using the local image gradient, the local image contrast [15] and our proposed method in Equation 3, respectively.
B. Local Threshold Estimation
The text can be extracted from the document background once the high contrast stroke edge pixels are detected properly. Two characteristics can be observed in different kinds of document images [15]. First, the text pixels are close to the detected text stroke edge pixels. Second, there is a distinct intensity difference between the high contrast stroke edge pixels and the surrounding background pixels. The document image text can thus be extracted based on the detected text stroke edge pixels as follows
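$$R(x, y) = \begin{cases} 1, & N_e \geq N_{min} \text{ and } I(x, y) \leq E_{mean} + E_{std}/2 \\ 0, & \text{otherwise} \end{cases} \tag{5}$$
where E_mean and E_std are the mean and the standard deviation of the intensity, in the original document image, of the detected high contrast pixels within the neighborhood window. They can be evaluated as follows
$$E_{mean} = \frac{\sum_{neighbor} I(x, y)\,(1 - E(x, y))}{N_e} \tag{6}$$
$$E_{std} = \sqrt{\frac{\sum_{neighbor} \left((I(x, y) - E_{mean})\,(1 - E(x, y))\right)^2}{2}} \tag{7}$$
Here N_e is the number of detected high contrast pixels within the neighborhood window, N_min is the minimum number of such pixels required, and E(x, y) is the binary high contrast pixel map [15], which equals 0 at detected high contrast pixels so that only those pixels contribute to the sums.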
The size of the neighborhood window W can
be set based on the stroke width of the document
image under study.
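The local threshold test of Equations 5 to 7 can be sketched for one pixel as follows. The sketch assumes that the high contrast map holds 0 at detected high contrast pixels and 1 elsewhere, so that (1 − E) selects them, and that the window is clipped at the image borders. The function name `classify_pixel`, the parameter names and the sample data are hypothetical.

```python
from math import sqrt


def classify_pixel(image, e_map, i, j, radius, n_min):
    """Return R(i, j): 1 for text, 0 for background (Equation 5)."""
    rows, cols = len(image), len(image[0])
    coords = [
        (r, c)
        for r in range(max(0, i - radius), min(rows, i + radius + 1))
        for c in range(max(0, j - radius), min(cols, j + radius + 1))
    ]
    # Weight 1 for high contrast pixels (e_map == 0), weight 0 otherwise.
    weights = [1 - e_map[r][c] for r, c in coords]
    n_e = sum(weights)
    if n_e == 0 or n_e < n_min:
        return 0  # too few stroke edge pixels nearby

    # Equation 6: mean intensity of the high contrast pixels.
    e_mean = sum(image[r][c] * w for (r, c), w in zip(coords, weights)) / n_e
    # Equation 7: deviation term as written in the paper (divided by 2).
    e_std = sqrt(
        sum(((image[r][c] - e_mean) * w) ** 2 for (r, c), w in zip(coords, weights)) / 2
    )
    return 1 if image[i][j] <= e_mean + e_std / 2 else 0


# Hypothetical patch: a dark horizontal stroke on bright paper.
gray = [
    [200, 200, 200],
    [60, 50, 60],
    [200, 200, 200],
]
# 0 marks the detected high contrast pixels (the middle row).
high_contrast = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
]
center_label = classify_pixel(gray, high_contrast, 1, 1, radius=1, n_min=2)
paper_label = classify_pixel(gray, high_contrast, 0, 1, radius=1, n_min=2)
```

The stroke centre is labelled as text because its intensity lies below E_mean + E_std/2, while the bright pixel above it is labelled as background. Raising `n_min` above the number of nearby high contrast pixels forces the background label.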
C. Post-Processing
Once the initial binarization result is derived from Equation 5 as described in the previous subsections, it can be further improved by incorporating certain domain knowledge, as described in Algorithm 1. First, the isolated foreground pixels that do not connect with other foreground pixels are filtered out to make the edge pixel set more precise. Second, the neighborhood pixel pair that lies on symmetric sides of a text stroke edge pixel should belong to different classes (i.e., one to the document background and the other to the foreground text). One pixel of the pair is therefore relabeled to the other class if both pixels belong to the same class. Finally, some single-pixel artifacts along the text stroke boundaries are filtered out using several logical operators, as described in [16].
Algorithm 1 Post-Processing Procedure
Require: The input grayscale document image ‘I’, the initial binary result ‘B’ and the corresponding binary text stroke edge image ‘Edge’
Ensure: The final binary result ‘Bf’
1: Obtain the connected components of the stroke edge pixels in ‘Edge’.
2: Remove those pixels that do not connect with other pixels.
3: Identify the isolated pixels in step 2 by checking their connectivity.
4: for each remaining edge pixel (i, j) do
5:   Get its neighborhood pairs: (i − 1, j) & (i + 1, j); (i, j − 1) & (i, j + 1)
6:   if the pixels in a pair belong to the same class (both text or both background) then
7:     Classify them as foreground and background based on their pixel values.
8:   end if
9: end for
10: Remove single-pixel artifacts [16] along the text stroke boundaries after the document thresholding.
11: Store the new binary result in ‘Bf’.
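Steps 1 to 9 of Algorithm 1 can be sketched as below. Several choices are assumptions made for illustration: the edge image holds 1 at stroke edge pixels, isolation is tested with 8-connectivity, and in step 7 the darker pixel of a same-class pair is labelled text (1) and the brighter one background (0). Step 10 relies on the logical operators of [16] and is not reproduced. The function names are hypothetical.

```python
def remove_isolated(edge):
    """Steps 1-3: drop edge pixels that have no 8-connected edge neighbour."""
    rows, cols = len(edge), len(edge[0])
    cleaned = [row[:] for row in edge]
    for i in range(rows):
        for j in range(cols):
            if edge[i][j] != 1:
                continue
            has_neighbour = any(
                edge[r][c] == 1
                for r in range(max(0, i - 1), min(rows, i + 2))
                for c in range(max(0, j - 1), min(cols, j + 2))
                if (r, c) != (i, j)
            )
            if not has_neighbour:
                cleaned[i][j] = 0
    return cleaned


def relabel_pairs(image, binary, edge):
    """Steps 4-9: make the two pixels on opposite sides of an edge pixel differ."""
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in binary]
    for i in range(rows):
        for j in range(cols):
            if edge[i][j] != 1:
                continue
            pairs = [((i - 1, j), (i + 1, j)), ((i, j - 1), (i, j + 1))]
            for (r1, c1), (r2, c2) in pairs:
                inside = 0 <= r1 < rows and 0 <= r2 < rows and 0 <= c1 < cols and 0 <= c2 < cols
                if not inside or result[r1][c1] != result[r2][c2]:
                    continue
                if image[r1][c1] == image[r2][c2]:
                    continue  # no intensity evidence to separate the pair
                # Assumption: the darker pixel is text (1), the brighter is background (0).
                if image[r1][c1] < image[r2][c2]:
                    result[r1][c1], result[r2][c2] = 1, 0
                else:
                    result[r1][c1], result[r2][c2] = 0, 1
    return result


# Hypothetical data: one isolated edge pixel and one connected pair.
edge_map = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1]]
cleaned_edge = remove_isolated(edge_map)

# One-row example: the edge pixel in the middle separates paper (200) from ink (50).
fixed = relabel_pairs([[200, 120, 50]], [[0, 0, 0]], [[0, 1, 0]])
```

In the one-row example both neighbours of the edge pixel start as background, and the darker one is relabelled as text.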
D. Parameter Selection
In the first experiment, we apply different values of γ to obtain different power functions and test their performance. When γ is small, α is close to 1 and the local image contrast D dominates the adaptive image contrast Da in Equation 3. On the other hand, Da is mainly influenced by the local image gradient when γ is large. At the same time, the variation of α across different document images increases when γ is close to 1. Under such circumstances, the power function becomes more sensitive to the global image intensity variation, and appropriate weights can be assigned to images with different characteristics.
The proposed method can therefore assign a more suitable α to different images when γ is closer to 1. The parameter γ should be set around 1, where the adaptability of the proposed technique is maximized and better, more robust binarization results can be derived from different kinds of degraded document images.
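The effect of γ on the spread of α can be checked numerically with Equation 4. The two standard deviations used below (20 and 60) and the three γ values are hypothetical, chosen only to represent a low-variation and a high-variation document image.

```python
def alpha(std, gamma):
    """Equation 4: alpha = (Std / 128) ** gamma."""
    return (std / 128.0) ** gamma


def alpha_spread(std_low, std_high, gamma):
    """Difference in alpha between a high-variation and a low-variation image."""
    return alpha(std_high, gamma) - alpha(std_low, gamma)


# Spread of alpha between the two hypothetical images for small, unit and large gamma.
spreads = {g: alpha_spread(20.0, 60.0, g) for g in (0.1, 1.0, 10.0)}
```

With these inputs the spread is largest at γ = 1: for γ = 0.1 both images receive α close to 1, and for γ = 10 both receive α close to 0, so the weight no longer distinguishes the two images.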
III. RESULTS
This section presents the results of the proposed document image binarization technique. Given a degraded document image, an adaptive contrast map is first constructed. The text is then segmented using a local threshold that is estimated from the detected text stroke edge pixels. Post-processing is then applied to improve the binarization quality.
Example 1
Fig. 4. Input degraded document image with an ink bleed-through effect.
Fig. 5. Contrast image constructed using the proposed adaptive local contrast map.
Fig. 6. Binarized image produced by the proposed local thresholding and post-processing.
Example 2
Fig. 7. Input degraded document image with an ink bleed-through effect.
Fig. 8. Contrast image constructed using the proposed adaptive local contrast map.
Fig. 9. Binarized image produced by the proposed local thresholding and post-processing.
IV. COMPARISON OF DIFFERENT
BINARIZATION METHODS
In this experiment, we quantitatively compare our proposed method with Otsu’s method (OTSU) [2], Sauvola’s method (SAUV) [12], Niblack’s method (NIBL) [13], Bernsen’s method (BERN) [8], Gatos et al.’s method (GATO) [17], and the LMM [15] and BE [16] methods. All methods are applied to the same series of document images, which suffer from several common document degradations
such as smear, smudge, bleed-through and low
contrast.
Example 1
Fig. 10. Binarization results of the sample document image in Fig.
1(b) produced by different methods. (a) OTSU [2]. (b) SAUV [12].
(c) NIBL [13]. (d) BERN [8]. (e) GATO [17]. (f) LMM [15]. (g)
BE [16]. (h) Proposed.
Example 2
Fig. 11. Binarization results of the sample document image (PR
06) in DIBCO 2011 dataset produced by different methods. (a)
Input Image. (b) OTSU [2]. (c) SAUV [12]. (d) NIBL [13]. (e)
BERN [8]. (f) GATO [17]. (g) LMM [15]. (h) BE [16]. (i) LELO
[18]. (j) SNUS. (k) HOWE [19]. (l) Proposed.
V. CONCLUSION
This paper presents a simple and robust method for enhancing degraded document images. The proposed binarization is tolerant to different types of document degradation such as non-uniform illumination, ink bleed-through and document smear. It is based on local thresholding combined with adaptive contrast mapping. The proposed method has been tested on various noise-affected document images and binarizes them effectively. However, we observed that the performance on the Bickley diary dataset needs to be improved, and we will explore this in future work.
REFERENCES
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Pearson Prentice Hall, 2005.
[2] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Trans. Syst., Man, Cybern., vol. SMC-9, no. 1, pp. 62–66, 1979.
[3] P. K. Sahoo, S. Soltani, A. K. C. Wong, and Y. Chen, “A survey of thresholding techniques,” Comput. Vis. Graph. Image Process., vol. 41, pp. 233–260, 1988.
[4] A. Brink, “Thresholding of digital images using
two-dimensional entropies,” Pattern Recognit., vol.
25, no. 8, pp. 803–808, 1992.
[5] J. Kittler and J. Illingworth, “On threshold
selection using clustering criteria,” IEEE Trans. Syst.,
Man, Cybern., vol. 15, no. 5, pp. 652–655, Sep.–Oct.
1985.
[6] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Trans. Syst., Man, Cybern., vol. 9, no. 1, pp. 62–66, Jan. 1979.
[7] N. Papamarkos and B. Gatos, “A new approach
for multithreshold selection,” Comput. Vis. Graph.
Image Process., vol. 56, no. 5, pp. 357–370, 1994.
[8] J. Bernsen, “Dynamic thresholding of gray-level
images,” in Proc. Int. Conf. Pattern Recognit., Oct.
1986, pp. 1251–1255.
[9] L. Eikvil, T. Taxt, and K. Moen, “A fast adaptive
method for binarization of document images,” in
Proc. Int. Conf. Document Anal. Recognit., Sep.
1991, pp. 435–443.
[10] I.-K. Kim, D.-W. Jung, and R.-H. Park,
“Document image binarization based on topographic
analysis using a water flow model,” Pattern
Recognit., vol. 35, no. 1, pp. 265–277, 2002.
[11] J. Parker, C. Jennings, and A. Salkauskas,
“Thresholding using an illumination model,” in Proc.
Int. Conf. Doc. Anal. Recognit., Oct. 1993, pp. 270–
273.
[12] J. Sauvola and M. Pietikainen, “Adaptive
document image binarization,” Pattern Recognit.,
vol. 33, no. 2, pp. 225–236, 2000.
[13] W. Niblack, An Introduction to Digital Image
Processing. Englewood Cliffs, NJ: Prentice-Hall,
1986.
[14] J.-D. Yang, Y.-S. Chen, and W.-H. Hsu,
“Adaptive thresholding algorithm and its hardware
implementation,” Pattern Recognit. Lett., vol. 15, no.
2, pp. 141–150, 1994.
[15] B. Su, S. Lu, and C. L. Tan, “Binarization of
historical handwritten document images using local
maximum and minimum filter,” in Proc. Int.
Workshop Document Anal. Syst., Jun. 2010, pp. 159–
166.
[16] S. Lu, B. Su, and C. L. Tan, “Document image
binarization using background estimation and stroke
edges,” Int. J. Document Anal. Recognit., vol. 13, no.
4, pp. 303–314, Dec. 2010.
[17] B. Gatos, I. Pratikakis, and S. Perantonis,
“Adaptive degraded document image binarization,”
Pattern Recognit., vol. 39, no. 3, pp. 317–327, 2006.
[18] T. Lelore and F. Bouchara, “Super resolved
binarization of text based on the fair algorithm,” in
Proc. Int. Conf. Document Anal. Recognit., Sep.
2011, pp. 839–843.
[19] N. Howe, “A Laplacian energy for document
binarization,” in Proc. Int. Conf. Doc. Anal.
Recognit., Sep. 2011, pp. 6–10.
Dr. P.V. RamaRaju is working as a Professor at the Department of Electronics and Communication Engineering, S.R.K.R. Engineering College, AP, India. His research interests include Biomedical-Signal Processing, Signal Processing, VLSI Design and Microwave Anechoic Chambers Design. He is the author of several research studies published in national and international journals and conference proceedings.
G. Nagaraju is presently working as an Assistant Professor at the Department of Electronics and Communication Engineering, S.R.K.R. Engineering College, Bhimavaram, AP. He received the B.Tech degree from S.R.K.R. Engineering College, Bhimavaram, in 2002, and the M.Tech degree in Computer Electronics specialization from Govt. College of Engg., Pune University, in 2004. His current research interests include image processing, digital security systems, Biomedical-Signal Processing, Signal Processing, and VLSI Design.
V. Rajasekhar received the B.Tech degree in Electronics and Communication Engineering from Sri Vasavi Engineering College, Tadepalligudem, A.P., India, in 2011. He is currently pursuing the M.Tech degree in Communication Systems at S.R.K.R. Engineering College, Bhimavaram.
INTERNATIONAL CONFERENCE ON CIVIL AND MECHANICAL ENGINEERING, ICCME-2014
INTERNATIONAL ASSOCIATION OF ENGINEERING & TECHNOLOGY FOR SKILL DEVELOPMENT www.iaetsd.in
22
ISBN:378-26-138420-0228

Iaetsd degraded document image enhancing in

  • 1.
    Degraded Document ImageEnhancing in Spatial Domain using Adaptive Contrasting and Thresholding Dr. P V Ramaraju, Professor Department of ECE, SRKR Engineering College, Bhimavaram, India. pvrraju50@gmail.com G.Nagaraju, Asst.Professor, Department of ECE, SRKR Engineering College, Bhimavaram, India. bhanu.raj.nikhil@gmail.com V.Rajasekhar, M.Tech Student Department of ECE, SRKR Engineering College, Bhimavaram, India. vrm.rajasekhar@gmail.com Abstract: This paper presents a new adaptive approach for the binarization and enhancement of degraded document images. The proposed method can deal with degradations which occur due to shadows, non-uniform illumination, low contrast, ink bleeding-through, smear and strain. We follow several distinct steps in the proposed technique; an adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized by using local threshold that is estimated based on the intensities of detected text stroke edge pixels within a local window. Some post-processing is further applied to improve the document binarization quality. The proposed method is simple, robust, and involves minimum parameter tuning. Keywords--Adaptive contrast map, document image enhancing, Adaptive thresholding, degraded document image binarization, pixel classification. I. INTRODUCTION Robust binarization gives the possibility of a correct extraction of the sketched line drawing or text from its background. For the binarization of images many algorithms have been implemented. Thresholding is a sufficiently accurate and high processing speed segmentation approach to monochrome image. This paper describes a modified logical thresholding method for binarization of seriously degraded and very poor quality gray-scale document images. In general there are two types of image thresholding techniques available: global and local. 
In the global thresholding technique a gray level image is converted into a binary image based on an image intensity value called global threshold which is fixed in the whole image domain whereas in local thresholding technique, threshold value can vary from one pixel location to next. Thus, global thresholding converts an input image I to a binary image G as follows G(i, j) = 1 for I (i, j) ≥ T, or, G(i, j) = 0 for I (i, j) < T, where T is the threshold, G (i, j) = 1 for foreground and G (i, j) = 0 for background. Whereas, for a local threshold, the threshold T is a function over the image domain, i.e.,T= T(x, y). In addition, if in constructing the threshold value/surface the algorithm adapts itself to the image intensity values, then it is called dynamic or adaptive threshold. In a general setting, thresholding can be expressed as a test operation that tests against a function T of the form [1]: T = T[x , y, h, I ] , where, I is the input image and h denotes some local property of this point– for example, the average gray level of a neighborhood centered on (x, y). Threshold selection depends on the information available in the gray level histogram of the image. We know that an image function I(x, y) can be expressed as the product of a reflectance function and an illumination function based on a simple image formation model. If the illumination component of the image is uniform then the gray level histogram of the image is clearly bimodal, because the gray levels of object pixels are significantly different from the gray levels of the background. It indicates that one mode is populated from object pixels and the other mode is populated from background pixels. Then objects could be easily partitioned by placing a single global threshold at the neck or valley at the histogram. However, in reality bimodality in histograms does not occur very often. 
Consequently, a fixed threshold level based on the information of the gray level histogram will fail totally to separate objects from the background. In this scenario we turn our attention to adaptive local threshold surface where threshold value changes over the image domain to fit the spatially changing background and lighting conditions. Over the years many threshold selection methods have been proposed. Otsu has suggested a global image thresholding technique where the INTERNATIONAL CONFERENCE ON CIVIL AND MECHANICAL ENGINEERING, ICCME-2014 INTERNATIONAL ASSOCIATION OF ENGINEERING & TECHNOLOGY FOR SKILL DEVELOPMENT www.iaetsd.in 17 ISBN:378-26-138420-0223
  • 2.
    optimal global thresholdvalue is ascertained by maximizing the between-class variance with an exhaustive search [2]. Sahoo et al. [3] claim that Otsu’s method is suitable for real world applications with regard to uniformity and shape measures. Though Otsu’s method is one of the most popular methods for global thresholding, it does not work well for many real world images where a significant overlap exists in the gray level histogram between the pixel intensity values of the objects and the background due to un-even and poor illumination. As many degraded documents do not have a clear bimodal pattern, global thresholding [4]–[7] is usually not a suitable approach for the degraded document binarization. Adaptive thresholding [8]– [14], which estimates a local threshold for each document image pixel, is often a better approach to deal with different variations within degraded document images. For example, the early window- based adaptive thresholding techniques [12], [13] estimate the local threshold by using the mean and the standard variation of image pixels within a local neighborhood window. The local image contrast and the local image gradient are very useful features for segmenting the text from the document background because the document text usually has certain image contrast to the neighboring document background. The image gradient is defined as follows G(x,y)=fmax(x, y) − fmin(x,y ) 1 The Local contrast is defined as follows max min max min ( , ) ( , ) ( , ) 2 ( , ) ( , ) x y x y D x y x y x y f f f f ε − = → + + Where ε is a positive but infinitely small number that is added in case the local maximum is equal to 0. II. PROPOSED METHOD This section describes the proposed document image binarization techniques. Given a degraded document image, an adaptive contrast map is first constructed. The text is then segmented based on the local threshold that is estimated from the detected text stroke edge pixels. 
Some post processing is further applied to improve the document binarization quality. A. Contrast Image Construction The image contrast in Equation 2 has one typical limitation that it may not handle document images with the bright text properly. This is because a weak contrast will be calculated for stroke edges of the bright text where the denominator in Equation 2 will be large but the numerator will be small. To overcome this over-normalization problem, we combine the local image contrast with the local image gradient and derive an adaptive local image contrast as follows max min( , ) ( , ) (1 )( ( , ) ( , )) 3aD x y D x y f x y f x yα α= + − − → Where D(x, y) denotes the local contrast in Equation 2 and (fmax(x, y) − fmin(x,y )) refers to the local image gradient that is normalized to [0, 1]. The local windows size is set to 3 empirically. α is the weight between local contrast and local gradient that is controlled based on the document image statistical information. Ideally, the image contrast will be assigned with a high weight (i.e. large α) when the document image has significant intensity variation. So that the proposed binarization technique depends more on the local image contrast that can capture the intensity variation well and hence produce good results. Otherwise, the local image gradient will be assigned with a high weight. We model the mapping from document image intensity variation to α by a power function as follows α= (Std/128)γ 4 Where Std denotes the document image intensity standard deviation, and γ is a pre-defined parameter. The power function has a nice property in that it monotonically and smoothly increases from 0 to 1 and its shape can be easily controlled by different γ .γ can be selected from [0,∞], where the power function becomes a linear function when γ = 1. Therefore, the local image gradient will play the major role in Equation 3 when γ is large and the local image contrast will play the major role when γ is small. 
The setting of parameter γ will be discussed in the section of parameter selection. Fig. 1 INTERNATIONAL CONFERENCE ON CIVIL AND MECHANICAL ENGINEERING, ICCME-2014 INTERNATIONAL ASSOCIATION OF ENGINEERING & TECHNOLOGY FOR SKILL DEVELOPMENT www.iaetsd.in 18 ISBN:378-26-138420-0224
  • 3.
    Fig. 2 (a) (b) (c) Fig. 3.Contrast Images constructed using (a) local image gradient, (b) local image contrast [15], and (c) our proposed method for the original sample document images which are shown in Fig. 1 and 2, respectively. Fig. 3 shows the contrast map of the sample document images in Fig. 1 and 2 that are created by using local image gradient, local image contrast [15] and our proposed method in Equation 3, respectively. B. Local Threshold Estimation The text can then be extracted from the document background pixels once the high contrast stroke edge pixels are detected properly. Two characteristics can be observed from different kinds of document images [15]: First, the text pixels are close to the detected text stroke edge pixels. Second, there is a distinct intensity difference between the high contrast stroke edge pixels and the surrounding background pixels. The document image text can thus be extracted based on the detected text stroke edge pixels as follows {1.. min&& ( , ) /2 0..( , ) 5Ne N I x y Emean Estd otherwiseR x y ≥ ≤ + = → Where Emean and Estd are the mean and the standard deviation of the image intensity of the detected high contrast image pixels (within the original document image) within the neighborhood window that can be evaluated as follows mean ( , )*(1 ( , )) E 6 neighbor I x y E x y Ne − = → ∑ 2 (( ( , ) )*(1 ( , ))) E 7 2 neighbor std I x y Emean E x y− − = → ∑ The size of the neighborhood window W can be set based on the stroke width of the document image under study. C. Post-Processing Once the initial binarization result is derived from Equation 5 as described in previous subsections, the binarization result can be further improved by incorporating certain domain knowledge as described in Algorithm 1. First, the isolated foreground pixels that do not connect with other foreground pixels are filtered out to make the edge pixel set precisely. 
Second, the neighborhood pixel pair that lies on symmetric sides of a text stroke edge pixel should belong to different classes (i.e., either the document background or the foreground text). One pixel of the pixel pair is therefore labeled to the other category if both of the two pixels belong to the same class. Finally, some single-pixel artifacts along the text stroke boundaries are filtered out by using several logical operators as described in[16]. Algorithm 1 Post-Processing Procedure Require: The Input grayscale Document Image ‘I’, Initial Binary Result ‘B’ and Corresponding Binary Text Stroke Edge Image ‘Edge’ Ensure: The Final Binary Result ‘Bf’ 1: Obtain the connect components of the stroke edge pixels in ‘Edge’. 2: Take out those pixels that do not connect with other pixels. 3: For removing isolated pixels, we need to check connectivity. 4: for Each remaining edge pixels (i, j ): do 5: Get its neighborhood pairs: (i − 1, j) & (i + 1, j); (i, j − 1) &(i, j + 1) 6: if The pixels in the pairs belong to the same class (both text or background) then 7: Classify the foreground and background pixels based on pixel values. 8: end if 9: end for 10: Remove single-pixel artifacts [16] along the text stroke boundaries after the document thresholding. 11: Store the new binary result to ‘Bf ‘. INTERNATIONAL CONFERENCE ON CIVIL AND MECHANICAL ENGINEERING, ICCME-2014 INTERNATIONAL ASSOCIATION OF ENGINEERING & TECHNOLOGY FOR SKILL DEVELOPMENT www.iaetsd.in 19 ISBN:378-26-138420-0225
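The local threshold of Section B (Equations 5 to 7) can likewise be sketched in Python with NumPy. The function name `local_threshold`, the parameter names, and the default values of `window` and `n_min` are illustrative assumptions and are not taken from the paper. The argument `edge_mask` is assumed to be True at detected high contrast pixels, so it plays the role of (1 − E(x, y)).

```python
import numpy as np


def local_threshold(image, edge_mask, window=5, n_min=3):
    """Label a pixel as text (1) when Equation 5 holds, else background (0)."""
    image = np.asarray(image, dtype=float)
    mask = np.asarray(edge_mask, dtype=bool)
    half = window // 2
    h, w = image.shape
    result = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            # Neighborhood window, clipped at the image border.
            rows = slice(max(0, y - half), min(h, y + half + 1))
            cols = slice(max(0, x - half), min(w, x + half + 1))
            values = image[rows, cols][mask[rows, cols]]
            n_e = values.size  # number of edge pixels in the window
            if n_e < n_min:
                continue  # too few edge pixels: keep as background
            e_mean = values.sum() / n_e  # Equation 6
            # Equation 7, keeping the divisor 2 as printed there.
            e_std = np.sqrt(((values - e_mean) ** 2).sum() / 2.0)
            if image[y, x] <= e_mean + e_std / 2.0:  # Equation 5
                result[y, x] = 1
    return result
```

The explicit double loop keeps the correspondence with the equations visible; it is slow on full-size scans. Dividing by the edge pixel count instead of 2 in the last formula would give the ordinary standard deviation, but the sketch follows Equation 7 as written.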
D. Parameter Selection
In the first experiment, we apply different values of γ to obtain different power functions and test their performance. α is close to 1 when γ is small, and the local image contrast then dominates the adaptive image contrast Da in Equation 3. On the other hand, Da is mainly influenced by the local image gradient when γ is large. At the same time, the variation of α across different document images increases when γ is close to 1. Under such circumstances, the power function becomes more sensitive to the global image intensity variation, and appropriate weights can be assigned to images with different characteristics. The proposed method can thus assign a more suitable α to different images when γ is closer to 1. Parameter γ should therefore be set around 1, where the adaptability of the proposed technique is maximized and better, more robust binarization results can be derived from different kinds of degraded document images.
III. RESULTS
This section evaluates the results of the proposed document image binarization technique. Given a degraded document image, an adaptive contrast map is first constructed. The text is then segmented based on the local threshold that is estimated from the detected text stroke edge pixels. Some post-processing is further applied to improve the document binarization quality.
Example 1
Fig. 4. Input degraded document image with ink bleed-through.
Fig. 5. Contrast image constructed using the proposed adaptive local contrast map.
Fig. 6. Binarized image produced by the proposed local thresholding and post-processing.
Example 2
Fig. 7. Input degraded document image with ink bleed-through.
Fig. 8. Contrast image constructed using the proposed adaptive local contrast map.
Fig. 9. Binarized image produced by the proposed local thresholding and post-processing.
IV. COMPARISON OF DIFFERENT BINARIZATION METHODS
In this experiment, we quantitatively compare our proposed method with Otsu’s method (OTSU) [2], Sauvola’s method (SAUV) [12], Niblack’s method (NIBL) [13], Bernsen’s method (BERN) [8], Gatos et al.’s method (GATO) [17], and the LMM [15] and BE [16] methods. The methods are compared on the same series of document images, which suffer from several common document degradations such as smear, smudge, bleed-through, and low contrast.
Example 1
Fig. 10. Binarization results of the sample document image in Fig. 1(b) produced by different methods. (a) OTSU [2]. (b) SAUV [12]. (c) NIBL [13]. (d) BERN [8]. (e) GATO [17]. (f) LMM [15]. (g) BE [16]. (h) Proposed.
Example 2
Fig. 11. Binarization results of the sample document image (PR 06) in the DIBCO 2011 dataset produced by different methods. (a) Input image. (b) OTSU [2]. (c) SAUV [12]. (d) NIBL [13]. (e) BERN [8]. (f) GATO [17]. (g) LMM [15]. (h) BE [16]. (i) LELO [18]. (j) SNUS. (k) HOWE [19]. (l) Proposed.
V. CONCLUSION
This paper presents a simple and robust method of enhancing degraded document images. The proposed binarization is tolerant to different types of document degradation such as non-uniform illumination, ink bleed-through, and document smear. It is based on local thresholding combined with adaptive contrast mapping. The proposed method has been tested on various noise-affected document images and binarizes them effectively. However, we observed that the performance on the Bickley diary dataset needs to be improved; we will explore this in future work.
REFERENCES
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Pearson Prentice Hall, 2005.
[2] N. Otsu, “A threshold selection method from gray-level histogram,” IEEE Trans. Syst., Man, Cybern., vol. SMC-9, no. 1, pp. 62–66, 1979.
[3] P. K. Sahoo, S. Soltani, A. K. C. Wong, and Y. Chen, “A survey of thresholding techniques,” Comput. Vis. Graph. Image Process., vol. 41, pp. 233–260, 1988.
[4] A. Brink, “Thresholding of digital images using two-dimensional entropies,” Pattern Recognit., vol. 25, no. 8, pp. 803–808, 1992.
[5] J. Kittler and J. Illingworth, “On threshold selection using clustering criteria,” IEEE Trans. Syst., Man, Cybern., vol. 15, no. 5, pp. 652–655, Sep.–Oct. 1985.
[6] N. Otsu, “A threshold selection method from gray level histogram,” IEEE Trans. Syst., Man, Cybern., vol. 9, no. 1, pp. 62–66, Jan. 1979.
[7] N. Papamarkos and B. Gatos, “A new approach for multithreshold selection,” Comput. Vis. Graph. Image Process., vol. 56, no. 5, pp. 357–370, 1994.
[8] J. Bernsen, “Dynamic thresholding of gray-level images,” in Proc. Int. Conf. Pattern Recognit., Oct. 1986, pp. 1251–1255.
[9] L. Eikvil, T. Taxt, and K. Moen, “A fast adaptive method for binarization of document images,” in Proc. Int. Conf. Document Anal. Recognit., Sep. 1991, pp. 435–443.
[10] I.-K. Kim, D.-W. Jung, and R.-H. Park, “Document image binarization based on topographic analysis using a water flow model,” Pattern Recognit., vol. 35, no. 1, pp. 265–277, 2002.
[11] J. Parker, C. Jennings, and A. Salkauskas, “Thresholding using an illumination model,” in Proc. Int. Conf. Doc. Anal. Recognit., Oct. 1993, pp. 270–273.
[12] J. Sauvola and M. Pietikainen, “Adaptive document image binarization,” Pattern Recognit., vol. 33, no. 2, pp. 225–236, 2000.
[13] W. Niblack, An Introduction to Digital Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1986.
[14] J.-D. Yang, Y.-S. Chen, and W.-H. Hsu, “Adaptive thresholding algorithm and its hardware implementation,” Pattern Recognit. Lett., vol. 15, no. 2, pp. 141–150, 1994.
[15] B. Su, S. Lu, and C. L. Tan, “Binarization of historical handwritten document images using local maximum and minimum filter,” in Proc. Int. Workshop Document Anal. Syst., Jun. 2010, pp. 159–166.
[16] S. Lu, B. Su, and C. L. Tan, “Document image binarization using background estimation and stroke edges,” Int. J. Document Anal. Recognit., vol. 13, no. 4, pp. 303–314, Dec. 2010.
[17] B. Gatos, I. Pratikakis, and S. Perantonis, “Adaptive degraded document image binarization,” Pattern Recognit., vol. 39, no. 3, pp. 317–327, 2006.
[18] T. Lelore and F. Bouchara, “Super resolved binarization of text based on the fair algorithm,” in Proc. Int. Conf. Document Anal. Recognit., Sep. 2011, pp. 839–843.
[19] N. Howe, “A Laplacian energy for document binarization,” in Proc. Int. Conf. Doc. Anal. Recognit., Sep. 2011, pp. 6–10.
Dr. P. V. RamaRaju is a Professor at the Department of Electronics and Communication Engineering, S.R.K.R. Engineering College, AP, India. His research interests include Biomedical Signal Processing, Signal Processing, VLSI Design, and Microwave Anechoic Chamber Design. He is the author of several research studies published in national and international journals and conference proceedings.
G. Nagaraju is presently working as an Assistant Professor at the Department of Electronics and Communication Engineering, S.R.K.R. Engineering College, Bhimavaram, AP. He received the B.Tech degree from S.R.K.R. Engineering College, Bhimavaram, in 2002, and the M.Tech degree with specialization in Computer Electronics from Govt. College of Engg., Pune University, in 2004. His current research interests include Image Processing, Digital Security Systems, Biomedical Signal Processing, Signal Processing, and VLSI Design.
V. Rajasekhar received the B.Tech degree in Electronics and Communication Engineering from Sri Vasavi Engineering College, Tadepalligudem, A.P., India, in 2011. He is currently pursuing the M.Tech degree in Communication Systems at S.R.K.R. Engineering College, Bhimavaram.