Machine Printed Handwritten Text Discrimination
Using Radon Transform and SVM Classifier
ET-Tahir Zemouri¹ and Youcef Chibani²
Signal Processing Laboratory, Faculty of Electronic and Computer Sciences
University of Sciences and Technology Houari Boumediene
USTHB, EL-Alia, B.P. 32, 16111, Algiers, Algeria
¹tzemouri@usthb.dz, ²ychibani@usthb.dz
Abstract—Discrimination between machine-printed and handwritten text is a major problem in the recognition of mixed texts. In this paper, we address the problem of identifying each type by using the Radon transform and Support Vector Machines. The method proceeds in three steps: preprocessing, feature generation and classification. A new set of features is generated from each word using the Radon transform, and classification then distinguishes printed from handwritten text. The proposed system is tested on the IAM database. The recognition rate of the proposed method is over 98%.

Keywords-document analysis; machine printed and handwritten text discrimination; Radon transform; Support Vector Machines (SVM).

I. INTRODUCTION
Machine-printed and handwritten text often appear together in application forms, question papers and mail, as well as in notes, corrections and instructions added to printed documents.
In all of these cases it is crucial to detect, distinguish and process differently the areas of handwritten and printed text (OCR for machine-printed text and ICR for handwritten annotations), for reasons such as: (a) retrieval of important information (identification of handwriting in application forms), (b) removal of unnecessary information (removal of handwritten notes from official documents), and (c) application of different recognition algorithms in each case.
The main difference between machine-printed and handwritten text is their shape structure. Characters in machine-printed text have a uniform shape, whereas handwritten text exhibits arbitrary, curly allograph styles. This difference can be exploited to generate features that measure the regularity of machine-printed words compared with handwritten words.
A few papers address the discrimination of machine-printed and handwritten text. Kuhnke et al. [1] proposed a neural network-based approach with straightness and symmetry as features. Pal and Chaudhuri [2] used horizontal projection profiles for separating printed and handwritten lines in Bangla script. Guo and Ma [3] proposed an approach based on the vertical projection profile of the segmented words, with a Hidden Markov Model (HMM) as the classifier. Zheng et al. [4] reported on printed and handwritten text segmentation using k-NN, Support Vector Machines (SVM) and Fisher classifiers with features such as pixel density, aspect ratio and Gabor features. Kandan et al. [5] used invariant moments, which are insensitive to translation, scale, mirroring and rotation, as features for distinguishing printed and handwritten elements, together with an SVM classifier.
In this paper, we propose a new method for text discrimination based on the Radon transform and Support Vector Machines.
The Radon transform is well suited to detecting linear features. Hence, printed words generate more regular Radon coefficients than handwritten words, a property that can be used to distinguish between the two. The SVM, in turn, is well suited to a robust separation of two classes.
The paper is organized as follows. Section II describes the proposed system. Experiments and conclusions are discussed in Sections III and IV, respectively.

II. THE PROPOSED SYSTEM
The system for the discrimination between machine-printed and handwritten text can be decomposed into three stages [1], as shown in Fig. 1. The first stage is preprocessing, in which the document is cleaned of noise components such as spurious dots and lines. In the second stage, features are generated based on the Radon transform. In the third stage, the elements are classified as printed or handwritten using SVM classifiers.

A. Preprocessing stage
Because image data vary widely, preprocessing, which reduces variations and produces a more consistent set of data, is essential for accurate character recognition. In our system, preprocessing includes filtering, binarization, skew angle correction, smoothing, and word segmentation.
Figure 1. Block-diagram of the classification system (document image; preprocessing: filtering, binarization, skew correction, smoothing, segmentation; feature generation; classification into machine printed or handwritten).

1) Image filtering: The image acquired from a scanner generally contains noise, which can be reduced using a 3x3 Wiener filter [6].
2) Binarization: The text is separated from the background by automatic thresholding. The Wolf approach [7] is used to obtain the binary image.
3) Skew angle correction: Skew estimation and correction is an important step in any document analysis and recognition system. We use the projection profile to estimate the skew angle [8]: the profile is computed for different angles, and the angle giving the largest magnitude variations corresponds to the skew angle.
4) Smoothing: For smoothing binary document images, four filters [9] can be used to smooth the edges and remove small pieces of noise.
5) Segmentation: Segmentation aims to extract the words from the document. It is performed in two consecutive steps: line segmentation and word segmentation. Both steps make use of projection profiles [10].

B. Feature Generation
Many kinds of features can be generated for distinguishing printed from handwritten text. Kuhnke et al. [1] proposed the straightness of vertically/horizontally oriented lines and symmetry relative to different points as features. Pal and Chaudhuri [2] used distinctive structural and statistical features. Guo and Ma [3] evaluated their scheme using the vertical projection profile. Zheng et al. [4] used features such as Gabor filter and run-length histogram features. Kandan et al. [5] used invariant moments, which are invariant under translation, scaling, rotation and reflection.
The main idea of our approach is to take advantage of the structural properties that help to discriminate printed from handwritten text. More precisely, the shape of printed characters is more or less stable within a text word. On the other hand, the distribution of the shape of handwritten characters is quite diverse.
The Radon transform has been used in many pattern recognition applications, such as shape recognition [11]. In our approach, the Radon transform is used as a tool for generating a feature vector. Hence, we briefly review its main properties.

1) Radon Transform
The Radon transform computes projections of an image along specified directions. A projection of a two-dimensional function $I(x, y)$ is a set of line integrals. The Radon transform computes the line integrals from multiple sources along parallel paths in a certain direction. To represent an image, the Radon transform takes multiple parallel projections of the image from different angles by rotating the source around the center of the image. Formally, the Radon transform of an image is defined as [12]:

$$TR_I(\rho, \theta) = \int_x \int_y I(x, y)\, \delta(x \cos\theta + y \sin\theta - \rho)\, dx\, dy \qquad (1)$$

where $\delta$ is the Dirac function, $\theta \in\, ]0, 180°]$ and $\rho \in\, ]-\infty, +\infty[$. In other words, $TR_I(\rho, \theta)$ is the integral of $I(x, y)$ over the line defined by $\rho = x \cos\theta + y \sin\theta$.
The Radon transform has several useful properties, such as periodicity and symmetry, as well as properties related to translation, rotation and scaling.
In our approach, we are interested only in periodicity and symmetry. Fig. 2 shows an example of the Radon transform computed on printed and handwritten words.

Figure 2. A shape (a) and its Radon transform (b).

We can easily see that the Radon transform generates more coefficients for the handwritten word than for the printed word.
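As an illustration of Equation (1), the following is a minimal sketch of a discrete Radon transform in plain Python. It is not the authors' implementation: the function name `radon_transform`, the nearest-bin approximation of the Dirac function, and the choice of the image center as origin are assumptions made for this example.

```python
import math

def radon_transform(image, n_theta=180):
    """Discrete approximation of Eq. (1) for a 2-D list of pixel values.

    Returns (sinogram, rho_max), where sinogram[r][t] holds the projection
    for rho = r - rho_max and theta = (t + 1) * 180 / n_theta degrees.
    """
    height, width = len(image), len(image[0])
    # Assumption: the origin is placed at the image center.
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    rho_max = int(math.ceil(math.hypot(cx, cy)))
    sinogram = [[0.0] * n_theta for _ in range(2 * rho_max + 1)]
    for t in range(n_theta):
        theta = math.radians((t + 1) * 180.0 / n_theta)  # theta in ]0, 180]
        c, s = math.cos(theta), math.sin(theta)
        for row in range(height):
            for col in range(width):
                value = image[row][col]
                if value:
                    # The Dirac function keeps pixels on the line
                    # rho = x cos(theta) + y sin(theta); here each pixel
                    # is accumulated into its nearest integer rho bin.
                    rho = (col - cx) * c + (row - cy) * s
                    sinogram[int(round(rho)) + rho_max][t] += value
    return sinogram, rho_max

# Usage: a 5x5 binary image containing one horizontal stroke.
stroke = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]
sinogram, rho_max = radon_transform(stroke, n_theta=4)
print(sinogram[rho_max][1])  # projection at theta = 90 degrees, rho = 0
```

In this toy case the whole stroke falls into a single bin at 90 degrees, while at 180 degrees it spreads over five bins. This is the kind of regularity the text attributes to straight, printed strokes.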
2) Feature vector generation
To generate features of printed and handwritten words, we fix the number of angular directions, denoted $N_\theta$ ($\theta \in\, ]0, 360°]$). Since the Radon transform generates redundant coefficients (Fig. 2(b)), we keep only the positive radial projections and take all directions from 0 to 360°. The feature vector is then generated by computing, for each column in the positive half-space of the Radon transform, the normalized sum of the squared coefficients, for the chosen number of angular directions $N_\theta$. The feature values $E_I(\theta)$ are defined as:

$$E_I(\theta) = \frac{1}{N_\rho} \sum_{\rho=1}^{N_\rho} TR_I^2(\rho, \theta) \qquad (2)$$

Fig. 3 illustrates an example of the generated feature values, i.e. the Radon transform energy for each angle $\theta$.
We can see that the energy-based Radon transform yields more energy for the handwritten word than for the printed word.

3) Feature vector normalization
In many practical situations, a designer is confronted with features whose values lie within different dynamic ranges. Thus, features with large values may have a larger influence on the cost function than features with small values, although this does not necessarily reflect their respective significance in the design of the classifier. The problem is overcome by normalizing the features so that their values lie within similar ranges. This is achieved by using a nonlinear transformation [13].

C. Classification
SVMs are supervised learning methods that have been widely and successfully used for pattern recognition in various applications, such as digit recognition [14]. The main concept of the SVM is to find a hyperplane that separates two classes while leaving the largest margin between the vectors of the two classes [14]. However, real-life problems may not be linearly separable. To deal with this problem, a nonlinear decision surface is obtained by lifting the feature space into a higher-dimensional space. A linear separating hyperplane found in the higher-dimensional space gives a nonlinear decision surface in the original feature space. The decision function of the SVM can be expressed as follows:

$$f(x) = \sum_i \alpha_i y_i K(x, x_i) + b \qquad (3)$$

where $(x_i, y_i) \in \Re^d \times \{\pm 1\}$ are the feature vectors and labels, respectively. In our case, the feature vectors $\{x_i\}$ correspond to the Radon energy, and the labels to printed words {+1} and handwritten words {-1}. Parameters $\alpha_i$ and $b$ are found by maximizing a quadratic function subject to some constraints [14]. $K(x, x_i)$ is the kernel function, which allows mapping the feature vectors into a higher-dimensional inner product space. In our case, we use the RBF (Radial Basis Function) kernel since it offers better discrimination than other kernels. The RBF kernel is defined as:

$$K(x, x_i) = \exp\left(-\frac{d(x, x_i)}{2\sigma^2}\right) \qquad (4)$$

$$d(x, x_i) = \|x - x_i\|^2 \qquad (5)$$

where $\sigma$ is user defined.
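To make Equations (3) to (5) concrete, here is a small sketch that evaluates the RBF kernel and the SVM decision function in plain Python. The function names and the toy support vectors, multipliers and bias in the usage lines are hypothetical values chosen for illustration; they are not trained parameters from the paper.

```python
import math

def rbf_kernel(x, xi, sigma):
    """Eqs. (4)-(5): exp(-||x - xi||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def decision_function(x, support_vectors, labels, alphas, b, sigma):
    """Eq. (3): f(x) = sum_i alpha_i * y_i * K(x, x_i) + b."""
    return sum(
        alpha * y * rbf_kernel(x, sv, sigma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    ) + b

def classify(x, support_vectors, labels, alphas, b, sigma):
    """Return +1 (printed) or -1 (handwritten), following the paper's labels."""
    score = decision_function(x, support_vectors, labels, alphas, b, sigma)
    return 1 if score >= 0 else -1

# Usage with hypothetical (untrained) parameters.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [+1, -1]
alphas = [1.0, 1.0]
print(classify([0.1, 0.0], support_vectors, labels, alphas, b=0.0, sigma=1.0))
print(classify([2.0, 1.9], support_vectors, labels, alphas, b=0.0, sigma=1.0))
```

In practice the multipliers, bias and support vectors come from training, and the sign of the decision function gives the predicted class.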
The optimization algorithm adopted for training SVMs is Sequential Minimal Optimization (SMO), which provides practical advantages [15].

Figure 3. Feature vector generation: (a) printed word and its Radon transform, (b) handwritten word and its Radon transform, (c) Radon energy versus angle.
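The Radon energy of Equation (2), plotted against the angle in Fig. 3(c), can be sketched as follows. The sinogram layout (rows indexed by ρ, columns by θ), the name `rho_zero_index`, and the choice to include the ρ = 0 row among the positive radial projections are assumptions made for this example.

```python
def radon_energy(sinogram, rho_zero_index):
    """Eq. (2): mean of the squared Radon coefficients for each angle.

    sinogram[r][t] is the coefficient for radial index r and angle index t.
    Only rows from rho_zero_index onward (assumed rho >= 0) are used.
    """
    positive_rows = sinogram[rho_zero_index:]
    n_rho = len(positive_rows)
    n_theta = len(positive_rows[0])
    return [
        sum(row[t] ** 2 for row in positive_rows) / n_rho
        for t in range(n_theta)
    ]

# Usage: three radial rows, two angles; the first row (rho < 0) is skipped.
toy_sinogram = [[9.0, 9.0], [1.0, 2.0], [3.0, 4.0]]
print(radon_energy(toy_sinogram, rho_zero_index=1))  # [5.0, 10.0]
```

The resulting list has one value per angular direction, so its length equals the number of angular directions used to build the sinogram.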
III. EXPERIMENTAL RESULTS
A. Data set
To evaluate the performance of the proposed method, we use the IAM database (Institut für Informatik und angewandte Mathematik) [16]. The documents are scanned at a resolution of 300 dpi, 8 bits/pixel, gray-scale, and converted into binary images using the Wolf binarization method. The database comprises more than 1500 documents containing printed and handwritten text. An example of a document is shown in Fig. 4. Regions of printed and handwritten words are easily separable. The forms contain no auxiliary lines to be filled in with written text. This characteristic facilitates the identification and classification of each type of word.

Figure 4. IAM Database form.

To test the performance of our system, 21 images are chosen and preprocessed. The set of words is divided into three subsets for training (1/3), validation (1/3) and testing (1/3), respectively. Table I summarizes the data set.

TABLE I. DATA SET
Data set          Training   Validation   Testing
Machine printed   447        447          438
Handwritten       525        525          484
Total             972        972          922

For each word, a vector of energy-based Radon transform features is calculated. We use the recognition rate (RR) as the metric to evaluate the performance of our system, defined as:

$$RR = \frac{\#\ \text{of words correctly classified}}{\#\ \text{total of words}}\ (\%) \qquad (7)$$

B. System validation
To validate our system, various experiments are conducted to find the SVM regularization parameter (fixed at 10), the kernel parameter ($\sigma$) and the best number of angular directions ($N_\theta$). Fig. 5 shows the recognition rate obtained on the validation set for each number of angular directions. We can note that the RR is not very sensitive to the number of angular directions. However, the best performance (RR = 77.06%) is obtained for $N_\theta$ = 20 and $\sigma$ = 2.1.

Figure 5. Recognition rate using Radon transform for the system validation.

In order to improve the recognition rate, we concatenate statistical features to the energy-based Radon transform features: mean, variance, variance of the projection profile (vertical and horizontal) and entropy. Fig. 6 shows the recognition rate versus the number of angular directions.

Figure 6. Recognition rate using Radon transform and statistical features.

We can see that the statistical features carry information well suited to the discrimination between machine-printed and handwritten text, since the RR improves to 92.8% for $N_\theta$ = 10 and $\sigma$ = 2 on the validation set. This shows the benefit of adding the statistical features.

C. System testing
After the validation of the system, the testing set is used to evaluate its performance. Hence, the optimal values of
the system validation are used for computing the recognition rate. The obtained result is 98.32%, which is encouraging compared with other works [1-5].

D. Comparison with other similar works
We compare our results with other published research works in terms of RR. Kuhnke et al. [1] proposed a neural network-based approach with straightness of vertically/horizontally oriented lines and symmetry relative to different points as features. Their system reached an RR of 78.5%. Pal and Chaudhuri [2] proposed an approach based on the distinctive structural and statistical features of machine-printed and handwritten text lines in Bangla script. Their classification scheme has an RR of 98.3%. Guo and Ma [3] evaluated their scheme using the vertical projection profile of the segmented word and obtained 92.86% using an HMM. Zheng et al. [4] obtained an RR of 96% using an SVM classifier and features such as Gabor filter and run-length histogram features. Kandan et al. [5] obtained an RR of 93.22% using invariant moments, which are invariant under translation, scaling, rotation and reflection, as features and an SVM classifier.
Our proposed method obtains an RR of 98.32% by using the Radon transform, statistical features and an SVM classifier, which is encouraging compared with other works.

IV. CONCLUSION
In this paper, we proposed a new method for discriminating printed and handwritten text in document images using the Radon transform and SVM classifiers. The system was implemented and tested on the IAM database.
Our approach presents encouraging results by combining Radon energy and statistical features using SVM classifiers with the RBF kernel.
In the future, we plan to apply our methodology to distinguish machine-printed from handwritten text in Arabic and Latin scripts.

REFERENCES
[1] K. Kuhnke, L. Simoncini, and Z. M. Kovacs-V, “A System for Machine-Written and Hand-Written Character Distinction,” Proc. 3rd International Conference on Document Analysis and Recognition, vol. 2, pp. 811-814, 1995.
[2] U. Pal and B. B. Chaudhuri, “Machine-printed and Hand-written Text Line Identification,” Pattern Recognition Letters, vol. 22, n. 3-4, pp. 431-441, 2001.
[3] J. K. Guo and M. Y. Ma, “Separating Handwritten Material from Machine Printed Text Using Hidden Markov Models,” Proc. 6th International Conference on Document Analysis and Recognition, pp. 439-443, 2001.
[4] Y. Zheng, H. Li, and D. Doermann, “Machine Printed Text and Handwriting Identification in Noisy Document Images,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 26, n. 3, pp. 337-353, 2004.
[5] R. Kandan, N. K. Reddy, K. R. Arvind, and A. G. Ramakrishnan, “A Robust Two Level Classification Algorithm for Text Localization in Documents,” Advances in Visual Computing, 3rd Int. Symp. (ISVC 07), Part II, LNCS 4842, pp. 96-105, 2007.
[6] B. Gatos, I. Pratikakis, and S. J. Perantonis, “Adaptive degraded document image binarization,” Pattern Recognition, vol. 39, pp. 317-327, 2006.
[7] C. Wolf and J. M. Jolion, “Extraction and recognition of artificial text in multimedia documents,” Pattern Analysis and Applications, vol. 6, n. 4, pp. 309-326, 2003.
[8] T. Akiyama and N. Hagita, “Automatic entry system for printed documents,” Pattern Recognition, vol. 23, n. 11, pp. 1141-1154, 1990.
[9] M. Cheriet, N. Kharma, C. L. Liu, and C. Suen, “Character Recognition Systems: A Guide for Students and Practitioners,” Wiley-Interscience, p. 321, 2007.
[10] E. Ataer and P. Duygulu, “Retrieval of Ottoman Documents,” Proc. 8th ACM International Workshop on Multimedia Information Retrieval, pp. 155-162, 2006.
[11] S. Tabbone, L. Wendling, and J. P. Salmon, “A new shape descriptor defined on the Radon transform,” Computer Vision and Image Understanding, vol. 102, n. 1, pp. 42-51, 2006.
[12] S. R. Deans, “The Radon Transform and Some of Its Applications,” New York: Wiley, 1983.
[13] S. Theodoridis and K. Koutroumbas, “Pattern Recognition,” 4th Ed., Elsevier Inc., 2009.
[14] H. Nemmour and Y. Chibani, “Handwritten digit recognition based on a neural-SVM combination,” Int. Journal of Computers and Applications (Acta Press), vol. 32, n. 1, pp. 104-109, 2010.
[15] H. Nemmour and Y. Chibani, “Integrating class-dependant tangent vectors into SVMs for handwritten digit recognition,” Int. Conf. on Signals, Circuits and Systems (ICSCS), pp. 1-4, 2009.
[16] U. V. Marti and H. Bunke, “The IAM-Database: an English sentence database for offline handwriting recognition,” International Journal on Document Analysis and Recognition, vol. 5, n. 1, pp. 39-46, 2002.