SlideShare a Scribd company logo
1 of 9
Download to read offline
Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
Vol. 2, No. 4, December 2014, pp. 161~169
ISSN: 2089-3272  161
Received June 16, 2014; Revised August 10, 2014; Accepted September 2, 2014
Segmentation of Handwritten Chinese Character
Strings Based on improved Algorithm Liu
Zhihua Cai, Jiangqing Wang*, Yangguang Sun
College of Computer Science, South-Central University for Nationalities, Wuhan 430074, China
*Corresponding author, e-mail: hunterpaper2199@gmail.com
Abstract
Algorithm Liu attracts high attention because of its high accuracy in segmentation of Japanese
postal address. But the disadvantages, such as complexity and difficult implementation of algorithm, etc.
have an adverse effect on its popularization and application. In this paper, the author applies the principles
of algorithm Liu to handwritten Chinese character segmentation according to the characteristics of the
handwritten Chinese characters, based on deeply study on algorithm Liu.In the same time, the author put
forward the judgment criterion of Segmentation block classification and adhering mode of the handwritten
Chinese characters.In the process of segmentation, text images are seen as the sequence made up of
Connected Components (CCs), while the connected components are made up of several horizontal
itinerary set of black pixels in image. The author determines whether these parts will be merged into
segmentation through analyzing connected components. And then the author does image segmentation
through adhering mode based on the analysis of outline edges. Finally cut the text images into character
segmentation. Experimental results show that the improved Algorithm Liu obtains high segmentation
accuracy and produces a satisfactory segmentation result.
Keywords: Character Strings segmentation, Character recognition, Connected components analysis,
Merged characters segmentation
1. Introduction
Character segmentation is one of the key technologies in text recognition rate. The
object of character segmentation can be divided into printed character, handwritten Chinese
character, handwritten western character, etc. It is need to choose different segmentation
methods for different object generally. The scholars and experts at home and abroad do a lot of
researches on segmentation of western character and numbers [1] and have made a lot of
valuable achievements. But Chinese character segmentation always focuses on printed
character. There are few researches on handwritten Chinese character segmentation, so the
corresponding achievements are also few. Because Handwritten Chinese character is different
from printed Chinese character. Briefly, printed character is mechanical segmentation of two
dimensional planes, which is standard and easy to recognize. While handwriting is lively, vivid
and writing model with high artistry, and handwriting pays attention to the shape, generalization
and change of stroke image. What’s more, there must be natural and fluent coherent and echo
between stroke and stroke. Therefore, there are conditions of script in handwriting Chinese
character and strokes are changing, as shown in Fig.1. There is certain tilt in the font in this
handwriting image and different spacing between line and line. Meanwhile, the space between
word and word is different. Some are written relatively tensely, while others are written relatively
sparse. These characteristics are all distinctive from the printed text. Therefore, the
segmentation of handwritten Chinese character is more complicated.
Certainly, there are some achievements which are worth noticing. A segmentation
method which is based on stroke segment drawing is proposed by Tseng L.Y., etc. [2]. They
adopts regulatory computing cost matrix of components merger based on knowledge, and then
obtain text image segmentation by using dynamic programming. Chen Hong et al. [3] make use
of the characteristics of width of characters and spacing, etc. to do character segmentation in
text through the principle of minimum variance. Zhao Yuming et al. [4] propose a handwritten
Chinese character segmentation method based on stroke extraction and merging. Zhao et al.
[5-6] propose two-stage segmentation method of coarse segmentation and fine segmentation.
Jiangqing Wang et al. [7] propose vertical projection characters segmentation based on
minimum threshold and curve-fitting. Zhao Shuyan etc. [8] propose merged handwritten
 ISSN: 2089-3272
IJEEI Vol. 2, No. 4, December 2014 : 161 – 169
162
Chinese character segmentation based on stroke analysis and background thinning. Cao Wei
[9] proposes segmentation of gap algorithm based on multi-threshold and multi- segmentation
strategy. Li Xiaoyuan etc. [10] propose merged handwritten Chinese character segmentation
based on structural cluster analysis and stroke analysis. These methods promote the
recognition of Chinese character to a certain extent, but there are still some shortcomings, such
as recognition accuracy and recognition speed, etc., because recognition of Chinese character
is under the influence of some objective factors, such as writing quality, writing style, font size,
and image quality of handwritten Chinese character on computer, etc. Algorithm Liu [11] is
proposed by Cheng-Lin Liu, Masashi Koga and Hiromichi Fujisawa. It is acquired with high
accuracy in Japanese postal address segmentation, but its algorithm is complex and its
implementation is difficulty, which have negative effects on its popularization and application.
In this paper, the author applies the principles of algorithm Liu to handwritten Chinese
character segmentation according to the characteristics of the handwritten Chinese characters,
based on deeply study on algorithm Liu.In the same time, the author put forward the judgment
criterion of Segmentation block classification and adhering mode of the handwritten Chinese
characters. In the segmentation process, the text line is considered as the sequence of CCs and
each CCs is comprised of a set of black runs. It is determined whether CCs is merged or not by
analysis of CCs. Then, the segmentation block are segmented into character segmentation
block by touching patterns segmentation based on analysis of partial contours. Experimental
results show that the new algorithm has a high precision rate.
Figure 1. The image of Handwritten Chinese character
2. Algorithm Liu
Liu algorithm is the method of character segmentation based on connected components
[11]. In the process of segmentation, text images are seen as the sequence made up of
Connected Components (CCs), while the connected components are made up of several
horizontal itinerary set of black pixels in image. And then analyze each connected component
recursion. The analysis process is that calculate the standard overlap. If the standard overlap of
the two parts is positive, this two parts can be merged. Calculating the standard overlap is
similar to calculating relevancy to the two parts. Only the standard overlap is high enough, and it
thinks that the two parts can be merged. The components which complete the merger are called
text segmentation. Because text segmentation after merging exist adhesion owing to
handwriting, that is, segmentation after merging contains two character, conducting adhering
mode segmentation based on outline edges analysis. Segmentation process is to do outline
analysis of text segmentation which is inconsistent with height-width ratio, and detect point of
division. If the point of division is not found, judge again whether text segmentation conforms to
height-width ratio of compulsory segmentation. If it meets, do compulsory segmentation of it. If
not, give up compulsory segmentation. If you could find the point of division, do segmentation
IJEEI ISSN: 2089-3272 
Segmentation of Handwritten Chinese Character Strings Based on … (Jiangqing Wang)
163
on this text segmentation. Don’t do segmentation on those which are consistent with height-
width ratio. Finally, get the complete text segmentation. Algorithm Liu is effective for
segmentation of handwriting text image which exist adhering mode, and find point of division
and complete segmentation accurately.
This algorithm is a complete recognition system of postal address, including image
preprocessing, connected component analysis, adhering mode segmentation and a search
algorithm based on the classifier of character and the dictionary. The system flowchart is shown
in Figure 2:
Figure 2. The Character String Recognition (CSR) System
The whole system begins with a text image. Its writing direction is from left to right or
from top to bottom. The text image is presented by Connected Components (CCs). The system
will process text image in sequence until it can output an address phrases. The accuracy of
these 3589 actual email texted by the research group is almost to the point of 83.86% and the
error rate below 1%. Visibly, this algorithm is effective for the segmentation of Japanese postal
address.
3. The Segmentation of Handwritten Chinese Character Based on Algorithm Liu
3.1. Image Preprocessing
There may be some factors, such as fuzziness and alteration, etc. affecting handwritten
text, and the problems, such as un-clarity and noise, etc. in its image. Therefore, to estimate
stroke width of text image from Run Length Histogram of Connected Components in order to
control image filtering effect. Firstly, to find stroke width-- maxrl , which appears the most, that is,
the maximum probability from Run Length Histogram. But this value does not represent the
stroke width of handwriting text image, because there may be other noise or other information
contributing to this probability. Therefore, the mean value of studying run length histogram,
meanrl , is calculated by the formula 1 below:
 
 
1
1
int i
mean
i
i hist i
rl
hist i
 
  
  


(1)
Among them, int( ) is round off the mean value calculated. If max meanrl rl , maxSW rl ,
otherwise, find a value which satisfies the formula 2 between maxrl and meanrl .
 1 1 max
1
( )
2
localhist rl hist rl (2)
localSW rl ,or maxSW rl .
 ISSN: 2089-3272
IJEEI Vol. 2, No. 4, December 2014 : 161 – 169
164
Estimating stroke width of text through runs length histogram. All CCs units which are
smaller than SW SW in the black area are removed. If 3SW  , will not do any further
smoothing process on text image in order to prevent the fracture of thin lines; Otherwise, if
7SW  , the image will do morphological open handle by using the structure of5 5 ; otherwise,
if 5SW  , the image will do morphological open handle by using the structure of3 3 ; if it is no
less, the image will do median filter smoothing in the area of3 3 . Doing morphological open
handle is effective for eliminating filament and noise. The structure template used is shown in
Figure 3:
(a) Structure template of 3 3 (b) Structure template of 5 5
Figure 3. Structure template used by morphology
Stroke width ( SW ) is the key to get binary image. Once it cannot get correct value of
SW , it will affect the final result after entire algorithm processing. To do segmentation of
character in image, the wrong stroke width will delete or dispose of the black pixels contained in
the image, and then binary image obtained cannot reflect the text messages correctly, and of
course, it cannot complete text segmentation.
3.2. Connected Components Analysis
Handwritten text image is seen as a sequence of CCs, while each CCs is composed of
horizontal stroke set of several black pixels. Because CCs overlap is likely to constitute the
same character, calculate its standardized overlapping degree to decide whether to merge it.
Calculation process is shown below. Bounding box of components is appointed by coordinate of
left, right, upper and lower boundaries. The bounding box of the two parts are respectively ( 1
l
x ,
1
r
x , 1
t
y , 1
b
y ) and ( 2
l
x , 2
r
x , 2
t
y , 2
b
y ).
Suppose 1 2
l l
x x , if 2 1
l r
x x , the two parts overlap, the overlapping degree is 1 2
r l
ovlp x x  ,
span is  1 2max ,r r dist
span x x
span
  , nmovlp is calculated by formula 3 below.
1 2
1
2
ovlp ovlp dist
nmovlp
w w span
 
   
 
(3)
Among them, 1w and 2w respectively represent the width of two parts, dist represents
the horizontal distance of the center. If nmovlp , merge the two parts. As long as there is large
enough standardized overlapping degree, to merge the current parts and its subsequent parts.
Figure 4 shows diagrammatic sketch of merging two parts. 1O and 2O represent the center of
the two parts. Merger of components adopts recursive combined method. After recursive
merger, each part can be seen as image segmentation. And then detect the potential adhering
mode of segmentation and cut it. If you think segmentation is still likely to be multi-model after
segmentation is completed, you can force to cut the block according to its projection histogram.
IJEEI ISSN: 2089-3272 
Segmentation of Handwritten Chinese Character Strings Based on … (Jiangqing Wang)
165
Figure 4. Arc strokes consisting of two segments
Connected components are the basic unit for analyzing the text segmentation. There
are script and broken in writing. According to the definition of the connected components in the
algorithm, connected components are identified as the set of a series of consecutive black pixel
point. All points which can be connected can form a connected component in the text image.
Here connected points are not only the points connecting up, down, left and right, but also the
points connecting the upper left, lower left, upper right and lower right, that is, eight connecting
area. Because characters are not written under the situation of “pure” horizontally and “pure”
vertically, but written with a certain angle. Determination of connected components is the
foundation of merging connected components. By adopting the method in this section, it could
merge connected components into segmentation block, and then do adhering mode analysis
next.
3.3. Adhering Mode Segmentation
1) The text segmentation block concluded from by the previous two steps does not
represent the final result of segmentation. Therefore, there may be “under-segmentation” in text
segmentation block at this time. So-called under-segmentation is that the two segmentation
blocks which should be independent merge because of the too close writing. This merger is not
the defect which algorithm itself design, but the writing. So, detect the adhering mode of
segmentation and find out the point of division, in order to separate the two segmentation blocks
which are under-segmentation.
First estimate the height of text line ( LH ) according to histogram of image
segmentation block after CCs merging. The segmentation block produced after connected
components analysis combining can be divided into big segmentation block and small
segmentation block. The standard of judging small segmentation block is determined by the
maximum frequency of the height of segmentation block calculated. The segmentation block
which is smaller than 1/2 is regarded as small segmentation block. The reason is that Chinese
character is relatively homogeneous. When the segmentation blocks have cut characters out
through combination of the connected components analysis, their height will be more uniform.
There are several punctuation marks whose height is small after being split out. But the number
of segmentation block of punctuation marks is relatively less than that of segmentation block of
text. Therefore, it could determine directly the height of small segmentation block according to
the height of the segmentation block with the largest frequency. Given the height of small
segmentation block had a little effect on character height, just merging segmentation block
temporarily, and then the height and width of temporary segmentation are respectively th and
tw. The histogram    2 2hist th hist th tw  is updated. The height of text line is regarded as an
estimated value of the height mean value of temporary block in the histogram. Suppose the
width of segmentation block is SW , the height is SH . If 1hSW LH  or 2/ hSW SH  , this
segmentation block is likely to be adhering mode.
The height-to-width of Chinese characters is generally stable because Chinese
characters are relatively homogeneous. Because there are few characters which are too wide
and with height within the text line height LH , or it is almost impossible to exist, it is likely to
merge multiple connected components into a text segmentation block for adhesion in the
 ISSN: 2089-3272
IJEEI Vol. 2, No. 4, December 2014 : 161 – 169
166
connected component analysis. It can determine the value of parameters, 1h and 2h , through a
lot of experiments. It is reasonable to think that there is adhesion in the text segmentation
blocks whose height-width ratios don’t conform to these two formulas. Use the domain-specific
knowledge for segmentation of adhering mode. The author of algorithm Liu extends the
adhering mode proposed by H.lkeda, etc. [12]. He puts forward seven types of horizontal
writing, and six types of vertical writing, as shown in Figure 5.
(a) horizontal writing adhesion type
(b) vertical writing adhesion type
Figure 5. Two types of adhesion
2) Detect division point though analysis of local contour shape. The adhering mode is
divided into single stroke area and many stroke more area. And then do contour analysis of
each single stroke area. Using the technology proposed by Rosenfeld and Johnston [13] to
detect the corner point for partial contour.
3) After the local contour shape analysis and segmentation, determine whether each
segmentation block need forced segmentation. Forced segmentation is produced because
Chinese character is a kind of relatively homogeneous Chinese characters. Therefore, the
segmentation block through the second step of connected components analysis and merger is
not likely to conform to height-width ratio, that is, the width is too wide, which is almost
impossible to exist in the Chinese characters (as shown in Figure 6).
Write multiple width of handwritten line segmentation mw as the span from far left to far
right in many stroke areas. If 3hmw LH  , this segmentation block need forced segmentation.
Define Eigen-Function the formula 4 below:
       , 1 , ,
y y
f x b x y b x y b x y
 
  
 
  (4)
Among them,  ,b x y represents the binary image of segmentation block, and take the
position where   cf x x x  ( cx is the center of the multiple width) gets the minimum as the
cut-off point. After completing pre-slitting, have one or more successive segmentation blocks
combined as candidate character mode. Tag the space between the characters before
dictionary matching.
Figure 6. Diagrammatic Sketch of Height-width Ratio of Text Segmentation Block
IJEEI ISSN: 2089-3272 
Segmentation of Handwritten Chinese Character Strings Based on … (Jiangqing Wang)
167
3.4. Analysis of the Segmentation
In this section, analyzing the process of the Segmentation by one sample, which is
shown in Figure 7:
Figure 7. Experimental Sample Images
It can see from the sample image in Figure 7 that this sample image consists of those
words “商引向深入,切 有效地开展民主 督实 监 , 一步拓进 ”, which contain script, individual character and
punctuation mark, almost covering several testing situations of handwritten Chinese characters.
It will analyze the sample image by stages below.
1) The experimental results of image preprocessing, as shown in Figure 8
Figure 8. the Experimental Results of Image Preprocessing
After image preprocessing, text line image is composed of each black pixel point, which
is different from the text line image with different gray value. In the experiment, it gets the stroke
width SW , which is the two pixel point. Therefore, it doesn’t process, and convert the original
image into binary image.
2) The experimental results of connected components analysis, as shown in Figure 9:
Figure 9. the Experimental Results after Connected Components Analysis
In analyzing connected components, initially merge connected components recursive
into text segmentation block one by one through calculating the standardization overlapping
degree nmovlp . In Figure 9, the characters with left and right organization are written with
relatively apart, so that those can’t be merged into one text segmentation block. Thus make
them be identified as a word in dictionary matching. Those text segmentation blocks with
adhering mode, such as “民主” cannot be divided into two segmentation blocks and be regarded
as a text segmentation block, because the final cross of “主” is linked to “民”. Therefore, divide
text segmentation with adhering mode into two segmentation blocks or compulsorily to find the
division point through adhering mode segmentation. For example, the two words “一步” also have
the same problem. The segmentation block of punctuation marks obtained from above is very
small, which contributes little to height value of the whole text line. It finds 35 connected
components and 23 image segmentation blocks in the experiment. It gets the initial
segmentation block of text line image through connected components analysis.
4. The Experimental Results and Analysis
Because test a large number of sample images in the experiment, only showing the part
of results of the segmentation below, the sample images used are shown in Figure 10, Figure
11 and Figure 12.
 ISSN: 2089-3272
IJEEI Vol. 2, No. 4, December 2014 : 161 – 169
168
Figure 10. Experimental sample image
Figure 11. Experimental sample image
Figure 12. Experimental sample images
The words in sample images respectively are “品 企 建立了经营 业 索 索票证 等进货检查验证“ ” ”
,“本 哈报 尔滨 月 日电春 前夕节 ,哈 市尔滨1 20 ”, “据了解,目前,北京市 食品安全已 从田 到对 实现 间 ”,. In sample images,
these images contain the common situations, such as all kinds of font-style, script and
punctuation marks, etc. in handwriting. The corresponding experimental results are shown in
Figure 13, Figure 14 and Figure 15:
Figure 13. Images after Segmentation
Figure 14. Images after Segmentation
Figure 15. Images after Segmentation
It can conclude from the above experiments that algorithms Liu is very effective for
segmentation of handwritten text line image.
5. Conclusion
The experiments show that algorithms Liu avoids the influence of noise and the
information having nothing to do with text segmentation and ensures the segmentation quality.
Based on the judgment criterion of Segmentation block classification and adhering mode
according to the characteristics of the handwritten Chinese characters. The algorithm is more
effective and accurate for text segmentation with adhering mode. The seven types in horizontal
and six types in vertical ensure the search of division point of adhering mode. What’s more, the
parameters, 1h , 2h and 3h , in the experiment can be obtained from much experimental
IJEEI ISSN: 2089-3272 
Segmentation of Handwritten Chinese Character Strings Based on … (Jiangqing Wang)
169
statistics. This adapts to the characteristics of samples, is more targeted, ensures the
segmentation effect and makes the algorithm have a better applied cost.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (60672057,
60975021), the Natural Science Foundation of South-Central University for Nationalities
(YZY10006), the Open Foundation of Key Laboratory of Education Ministry of China for Image
Processing and Intelligence Control (200906), the Science and Technique Research Guidance
Programs of Hubei Provincial Department of Education of China (B20110802), the Fundamental
Research Funds for the Central Universities”, South-Central University for Nationalities
(CZY12007, CTZ12013), and the Open Foundation of State Key Laboratory of Software
Engineering (SKLSE20120934), the Natural Science Foundation of Hubei Province of China
(2012FFB07404), the graduate Academic Innovation Foundation of South-Central University for
Nationalities.The authors would also like to thank the anonymous reviewers for their valuable
comments and constructive suggestions, which helped improve the quality of this paper.
References
[1] Xiaoyuan LI, Gang Tian, Chao Feng. Segmentation of Touching Characters in Printed Mathematical
Expression. Science Technology and Engineering (In Chinese). 2011; 11(3): 628-632.
[2] Tseng LY, Chen RC. Segmentation Handwritten Chinese Characters Based on Heuristic Merging of
Stroke Bounding Boxes and Dynamic Programming. IEEE Pattern Recognition Letters. 2007; 19:
963-973.
[3] Chen Hong. Segmentation and Recognition of Continuous Handwriting Chinese Text. International
Journal of Pattern Recognition and Artificial Intelligence. 1998; 12(2): 223-232.
[4] Zhao Yuming, Jaing Xingzhi, Shi Pengfei. Algorithm for off-line handwritten Chinese character
segmentation based on extracting and knowledge-based merging of stroke bounding boxes. Infrared
and Laser Engineering(In Chinese). 2002; 31(1): 23-27.
[5] SY Zhao, ZR Chi, PF Shi, H Yan. Two-stage Segmentation of Unconstrained Handwritten Chinese
Characters. Pattern Recognition. 2003; 36: 145-156.
[6] SY Zhao, ZR Chi, PF Shi, Q Wang. Handwritten Chinese Character Segmentation Using a Twotage
Approach. Document Analysis and Recognition. 2001; 6: 179-183.
[7] Jiangqing Wang, Wei Cao. Vertical Projection Characters Segmentation Based on Minimum
Threshold and Curve-Fitting. Journal of South-Central University for Nationalities (Natural Science
Edition. 2011; 30(4): 82-85.
[8] Zhao Shuyan, Guo Jie, Shi Pengfei. Segmentation of Connected Handwritten Chinese Characters
Based on Stroke Analysis and Background Thinning. Journal of Shanghai Jiaotong University (In
Chinese). 2003; 37(9): 1434-1437.
[9] Cao Wei. Clearance Segmentation Based on Multi-threshold and Multi-segmentation Strategy.
Computer & Digital Engineering (In Chinese). 2011; 39(1): 131-133.
[10] Li Xiaoyuan, Yang Fang, Zhang Wangbo. Segmentation of connected handwritten Chinese
characters based on structure cluster and strokes analysis. Computer Engineering and Applications
(In Chinese). 2008; 44(34): 163-165.
[11] Cheng-Lin Liu, Masashi Koga, Hiromichi Fujisawa. Lexicon Driven Segmentation and Recognition of
Handwritten Character Strings for Japanese Address Reading. IEEE Transactions on Pattern
Analysis and Machine Intelligence. 2002; 11: 1425-1437.
[12] Ikeda, Hisashi. A Recognition Method for Touching Japanese Handwritten Characters. International
Conference of Document Analysis and Recognition. 1999; 5: 641-644.
[13] A Rosenfeld and E Johnston. Angle Detection on Digital Curves. IEEE Trans.Computers. 1973; 22:
875-878.

More Related Content

What's hot

A New Approach to Segmentation of On-Line Persian Cursive Words
A New Approach to Segmentation of On-Line Persian Cursive WordsA New Approach to Segmentation of On-Line Persian Cursive Words
A New Approach to Segmentation of On-Line Persian Cursive WordsEditor IJCATR
 
SEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXT
SEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXTSEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXT
SEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXTcscpconf
 
Recognition of Offline Handwritten Hindi Text Using SVM
Recognition of Offline Handwritten Hindi Text Using SVMRecognition of Offline Handwritten Hindi Text Using SVM
Recognition of Offline Handwritten Hindi Text Using SVMCSCJournals
 
1.p13 078 marathi handwritten
1.p13 078 marathi handwritten1.p13 078 marathi handwritten
1.p13 078 marathi handwrittenTulashiram Pisal
 
V.karthikeyan published article
V.karthikeyan published articleV.karthikeyan published article
V.karthikeyan published articleKARTHIKEYAN V
 
Text Detection and Recognition: A Review
Text Detection and Recognition: A ReviewText Detection and Recognition: A Review
Text Detection and Recognition: A ReviewIRJET Journal
 
Online Hand Written Character Recognition
Online Hand Written Character RecognitionOnline Hand Written Character Recognition
Online Hand Written Character RecognitionIOSR Journals
 
OCR-THE 3 LAYERED APPROACH FOR DECISION MAKING STATE AND IDENTIFICATION OF TE...
OCR-THE 3 LAYERED APPROACH FOR DECISION MAKING STATE AND IDENTIFICATION OF TE...OCR-THE 3 LAYERED APPROACH FOR DECISION MAKING STATE AND IDENTIFICATION OF TE...
OCR-THE 3 LAYERED APPROACH FOR DECISION MAKING STATE AND IDENTIFICATION OF TE...ijaia
 
A New Method for Identification of Partially Similar Indian Scripts
A New Method for Identification of Partially Similar Indian ScriptsA New Method for Identification of Partially Similar Indian Scripts
A New Method for Identification of Partially Similar Indian ScriptsCSCJournals
 
An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...
An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...
An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...iosrjce
 
A NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGES
A NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGESA NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGES
A NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGESijnlc
 
TEXT CLASSIFICATION FOR AUTHORSHIP ATTRIBUTION ANALYSIS
TEXT CLASSIFICATION FOR AUTHORSHIP ATTRIBUTION ANALYSISTEXT CLASSIFICATION FOR AUTHORSHIP ATTRIBUTION ANALYSIS
TEXT CLASSIFICATION FOR AUTHORSHIP ATTRIBUTION ANALYSISacijjournal
 
AN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMS
AN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMSAN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMS
AN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMSijcsit
 
OCR-THE 3 LAYERED APPROACH FOR CLASSIFICATION AND IDENTIFICATION OF TELUGU HA...
OCR-THE 3 LAYERED APPROACH FOR CLASSIFICATION AND IDENTIFICATION OF TELUGU HA...OCR-THE 3 LAYERED APPROACH FOR CLASSIFICATION AND IDENTIFICATION OF TELUGU HA...
OCR-THE 3 LAYERED APPROACH FOR CLASSIFICATION AND IDENTIFICATION OF TELUGU HA...csandit
 
An OCR System for recognition of Urdu text in Nastaliq Font
An OCR System for recognition of Urdu text in Nastaliq FontAn OCR System for recognition of Urdu text in Nastaliq Font
An OCR System for recognition of Urdu text in Nastaliq FontDr. Syed Hassan Amin
 
A statistical model for gist generation a case study on hindi news article
A statistical model for gist generation  a case study on hindi news articleA statistical model for gist generation  a case study on hindi news article
A statistical model for gist generation a case study on hindi news articleIJDKP
 
IRJET- Identifying Human Behavior Characteristics using Handwriting Analysis
IRJET-  	  Identifying Human Behavior Characteristics using Handwriting AnalysisIRJET-  	  Identifying Human Behavior Characteristics using Handwriting Analysis
IRJET- Identifying Human Behavior Characteristics using Handwriting AnalysisIRJET Journal
 

What's hot (19)

A New Approach to Segmentation of On-Line Persian Cursive Words
A New Approach to Segmentation of On-Line Persian Cursive WordsA New Approach to Segmentation of On-Line Persian Cursive Words
A New Approach to Segmentation of On-Line Persian Cursive Words
 
SEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXT
SEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXTSEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXT
SEGMENTATION OF CHARACTERS WITHOUT MODIFIERS FROM A PRINTED BANGLA TEXT
 
Recognition of Offline Handwritten Hindi Text Using SVM
Recognition of Offline Handwritten Hindi Text Using SVMRecognition of Offline Handwritten Hindi Text Using SVM
Recognition of Offline Handwritten Hindi Text Using SVM
 
1.p13 078 marathi handwritten
1.p13 078 marathi handwritten1.p13 078 marathi handwritten
1.p13 078 marathi handwritten
 
V.karthikeyan published article
V.karthikeyan published articleV.karthikeyan published article
V.karthikeyan published article
 
50120130406021
5012013040602150120130406021
50120130406021
 
Text Detection and Recognition: A Review
Text Detection and Recognition: A ReviewText Detection and Recognition: A Review
Text Detection and Recognition: A Review
 
Online Hand Written Character Recognition
Online Hand Written Character RecognitionOnline Hand Written Character Recognition
Online Hand Written Character Recognition
 
OCR-THE 3 LAYERED APPROACH FOR DECISION MAKING STATE AND IDENTIFICATION OF TE...
OCR-THE 3 LAYERED APPROACH FOR DECISION MAKING STATE AND IDENTIFICATION OF TE...OCR-THE 3 LAYERED APPROACH FOR DECISION MAKING STATE AND IDENTIFICATION OF TE...
OCR-THE 3 LAYERED APPROACH FOR DECISION MAKING STATE AND IDENTIFICATION OF TE...
 
A New Method for Identification of Partially Similar Indian Scripts
A New Method for Identification of Partially Similar Indian ScriptsA New Method for Identification of Partially Similar Indian Scripts
A New Method for Identification of Partially Similar Indian Scripts
 
An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...
An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...
An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...
 
A NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGES
A NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGESA NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGES
A NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGES
 
TEXT CLASSIFICATION FOR AUTHORSHIP ATTRIBUTION ANALYSIS
TEXT CLASSIFICATION FOR AUTHORSHIP ATTRIBUTION ANALYSISTEXT CLASSIFICATION FOR AUTHORSHIP ATTRIBUTION ANALYSIS
TEXT CLASSIFICATION FOR AUTHORSHIP ATTRIBUTION ANALYSIS
 
AN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMS
AN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMSAN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMS
AN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMS
 
OCR-THE 3 LAYERED APPROACH FOR CLASSIFICATION AND IDENTIFICATION OF TELUGU HA...
OCR-THE 3 LAYERED APPROACH FOR CLASSIFICATION AND IDENTIFICATION OF TELUGU HA...OCR-THE 3 LAYERED APPROACH FOR CLASSIFICATION AND IDENTIFICATION OF TELUGU HA...
OCR-THE 3 LAYERED APPROACH FOR CLASSIFICATION AND IDENTIFICATION OF TELUGU HA...
 
An OCR System for recognition of Urdu text in Nastaliq Font
An OCR System for recognition of Urdu text in Nastaliq FontAn OCR System for recognition of Urdu text in Nastaliq Font
An OCR System for recognition of Urdu text in Nastaliq Font
 
A statistical model for gist generation a case study on hindi news article
A statistical model for gist generation  a case study on hindi news articleA statistical model for gist generation  a case study on hindi news article
A statistical model for gist generation a case study on hindi news article
 
IRJET- Identifying Human Behavior Characteristics using Handwriting Analysis
IRJET-  	  Identifying Human Behavior Characteristics using Handwriting AnalysisIRJET-  	  Identifying Human Behavior Characteristics using Handwriting Analysis
IRJET- Identifying Human Behavior Characteristics using Handwriting Analysis
 
Text classification
 Text classification Text classification
Text classification
 

Similar to Segmentation of Handwritten Chinese Character Strings Based on improved Algorithm Liu

Scene text recognition in mobile applications by character descriptor and str...
Scene text recognition in mobile applications by character descriptor and str...Scene text recognition in mobile applications by character descriptor and str...
Scene text recognition in mobile applications by character descriptor and str...eSAT Journals
 
An effective approach to offline arabic handwriting recognition
An effective approach to offline arabic handwriting recognitionAn effective approach to offline arabic handwriting recognition
An effective approach to offline arabic handwriting recognitionijaia
 
Recognition of Words in Tamil Script Using Neural Network
Recognition of Words in Tamil Script Using Neural NetworkRecognition of Words in Tamil Script Using Neural Network
Recognition of Words in Tamil Script Using Neural NetworkIJERA Editor
 
FREEMAN CODE BASED ONLINE HANDWRITTEN CHARACTER RECOGNITION FOR MALAYALAM USI...
FREEMAN CODE BASED ONLINE HANDWRITTEN CHARACTER RECOGNITION FOR MALAYALAM USI...FREEMAN CODE BASED ONLINE HANDWRITTEN CHARACTER RECOGNITION FOR MALAYALAM USI...
FREEMAN CODE BASED ONLINE HANDWRITTEN CHARACTER RECOGNITION FOR MALAYALAM USI...acijjournal
 
A simple text ine segmentation method for handwritten documents1
A simple text ine segmentation method for handwritten documents1A simple text ine segmentation method for handwritten documents1
A simple text ine segmentation method for handwritten documents1Prasad Babu
 
Critical Review on Off-Line Sinhala Handwriting Recognition
Critical Review on Off-Line Sinhala Handwriting RecognitionCritical Review on Off-Line Sinhala Handwriting Recognition
Critical Review on Off-Line Sinhala Handwriting RecognitionAmila Wijayarathna
 
Dimensionality Reduction and Feature Selection Methods for Script Identificat...
Dimensionality Reduction and Feature Selection Methods for Script Identificat...Dimensionality Reduction and Feature Selection Methods for Script Identificat...
Dimensionality Reduction and Feature Selection Methods for Script Identificat...ITIIIndustries
 
Design and implementation of optical character recognition using template mat...
Design and implementation of optical character recognition using template mat...Design and implementation of optical character recognition using template mat...
Design and implementation of optical character recognition using template mat...eSAT Journals
 
Optical Character Recognition from Text Image
Optical Character Recognition from Text ImageOptical Character Recognition from Text Image
Optical Character Recognition from Text ImageEditor IJCATR
 
A Survey of Modern Character Recognition Techniques
A Survey of Modern Character Recognition TechniquesA Survey of Modern Character Recognition Techniques
A Survey of Modern Character Recognition Techniquesijsrd.com
 
Devnagari document segmentation using histogram approach
Devnagari document segmentation using histogram approachDevnagari document segmentation using histogram approach
Devnagari document segmentation using histogram approachVikas Dongre
 
Text Extraction System by Eliminating Non-Text Regions
Text Extraction System by Eliminating Non-Text RegionsText Extraction System by Eliminating Non-Text Regions
Text Extraction System by Eliminating Non-Text RegionsIJCSIS Research Publications
 
DEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACH
DEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACHDEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACH
DEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACHijcseit
 
Fragmentation of Handwritten Touching Characters in Devanagari Script
Fragmentation of Handwritten Touching Characters in Devanagari ScriptFragmentation of Handwritten Touching Characters in Devanagari Script
Fragmentation of Handwritten Touching Characters in Devanagari ScriptZac Darcy
 
Persian arabic document segmentation based on hybrid approach
Persian arabic document segmentation based on hybrid approachPersian arabic document segmentation based on hybrid approach
Persian arabic document segmentation based on hybrid approachijcsa
 
Header Based Classification of Journals Using Document Image Segmentation and...
Header Based Classification of Journals Using Document Image Segmentation and...Header Based Classification of Journals Using Document Image Segmentation and...
Header Based Classification of Journals Using Document Image Segmentation and...CSCJournals
 
A Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition SystemA Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition Systemiosrjce
 

Similar to Segmentation of Handwritten Chinese Character Strings Based on improved Algorithm Liu (20)

Scene text recognition in mobile applications by character descriptor and str...
Scene text recognition in mobile applications by character descriptor and str...Scene text recognition in mobile applications by character descriptor and str...
Scene text recognition in mobile applications by character descriptor and str...
 
F045053236
F045053236F045053236
F045053236
 
An effective approach to offline arabic handwriting recognition
An effective approach to offline arabic handwriting recognitionAn effective approach to offline arabic handwriting recognition
An effective approach to offline arabic handwriting recognition
 
Recognition of Words in Tamil Script Using Neural Network
Recognition of Words in Tamil Script Using Neural NetworkRecognition of Words in Tamil Script Using Neural Network
Recognition of Words in Tamil Script Using Neural Network
 
FREEMAN CODE BASED ONLINE HANDWRITTEN CHARACTER RECOGNITION FOR MALAYALAM USI...
FREEMAN CODE BASED ONLINE HANDWRITTEN CHARACTER RECOGNITION FOR MALAYALAM USI...FREEMAN CODE BASED ONLINE HANDWRITTEN CHARACTER RECOGNITION FOR MALAYALAM USI...
FREEMAN CODE BASED ONLINE HANDWRITTEN CHARACTER RECOGNITION FOR MALAYALAM USI...
 
A simple text ine segmentation method for handwritten documents1
A simple text ine segmentation method for handwritten documents1A simple text ine segmentation method for handwritten documents1
A simple text ine segmentation method for handwritten documents1
 
Critical Review on Off-Line Sinhala Handwriting Recognition
Critical Review on Off-Line Sinhala Handwriting RecognitionCritical Review on Off-Line Sinhala Handwriting Recognition
Critical Review on Off-Line Sinhala Handwriting Recognition
 
Dimensionality Reduction and Feature Selection Methods for Script Identificat...
Dimensionality Reduction and Feature Selection Methods for Script Identificat...Dimensionality Reduction and Feature Selection Methods for Script Identificat...
Dimensionality Reduction and Feature Selection Methods for Script Identificat...
 
Design and implementation of optical character recognition using template mat...
Design and implementation of optical character recognition using template mat...Design and implementation of optical character recognition using template mat...
Design and implementation of optical character recognition using template mat...
 
Optical Character Recognition from Text Image
Optical Character Recognition from Text ImageOptical Character Recognition from Text Image
Optical Character Recognition from Text Image
 
A Survey of Modern Character Recognition Techniques
A Survey of Modern Character Recognition TechniquesA Survey of Modern Character Recognition Techniques
A Survey of Modern Character Recognition Techniques
 
Devnagari document segmentation using histogram approach
Devnagari document segmentation using histogram approachDevnagari document segmentation using histogram approach
Devnagari document segmentation using histogram approach
 
Text Extraction System by Eliminating Non-Text Regions
Text Extraction System by Eliminating Non-Text RegionsText Extraction System by Eliminating Non-Text Regions
Text Extraction System by Eliminating Non-Text Regions
 
Isolated Kannada Character Recognition using Chain Code Features
Isolated Kannada Character Recognition using Chain Code FeaturesIsolated Kannada Character Recognition using Chain Code Features
Isolated Kannada Character Recognition using Chain Code Features
 
DEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACH
DEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACHDEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACH
DEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACH
 
Fragmentation of Handwritten Touching Characters in Devanagari Script
Fragmentation of Handwritten Touching Characters in Devanagari ScriptFragmentation of Handwritten Touching Characters in Devanagari Script
Fragmentation of Handwritten Touching Characters in Devanagari Script
 
O45018291
O45018291O45018291
O45018291
 
Persian arabic document segmentation based on hybrid approach
Persian arabic document segmentation based on hybrid approachPersian arabic document segmentation based on hybrid approach
Persian arabic document segmentation based on hybrid approach
 
Header Based Classification of Journals Using Document Image Segmentation and...
Header Based Classification of Journals Using Document Image Segmentation and...Header Based Classification of Journals Using Document Image Segmentation and...
Header Based Classification of Journals Using Document Image Segmentation and...
 
A Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition SystemA Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition System
 

More from ijeei-iaes

An Heterogeneous Population-Based Genetic Algorithm for Data Clustering
An Heterogeneous Population-Based Genetic Algorithm for Data ClusteringAn Heterogeneous Population-Based Genetic Algorithm for Data Clustering
An Heterogeneous Population-Based Genetic Algorithm for Data Clusteringijeei-iaes
 
Development of a Wireless Sensors Network for Greenhouse Monitoring and Control
Development of a Wireless Sensors Network for Greenhouse Monitoring and ControlDevelopment of a Wireless Sensors Network for Greenhouse Monitoring and Control
Development of a Wireless Sensors Network for Greenhouse Monitoring and Controlijeei-iaes
 
Analysis of Genetic Algorithm for Effective power Delivery and with Best Upsurge
Analysis of Genetic Algorithm for Effective power Delivery and with Best UpsurgeAnalysis of Genetic Algorithm for Effective power Delivery and with Best Upsurge
Analysis of Genetic Algorithm for Effective power Delivery and with Best Upsurgeijeei-iaes
 
Design for Postplacement Mousing based on GSM in Long-Distance
Design for Postplacement Mousing based on GSM in Long-DistanceDesign for Postplacement Mousing based on GSM in Long-Distance
Design for Postplacement Mousing based on GSM in Long-Distanceijeei-iaes
 
Investigation of TTMC-SVPWM Strategies for Diode Clamped and Cascaded H-bridg...
Investigation of TTMC-SVPWM Strategies for Diode Clamped and Cascaded H-bridg...Investigation of TTMC-SVPWM Strategies for Diode Clamped and Cascaded H-bridg...
Investigation of TTMC-SVPWM Strategies for Diode Clamped and Cascaded H-bridg...ijeei-iaes
 
Optimal Power Flow with Reactive Power Compensation for Cost And Loss Minimiz...
Optimal Power Flow with Reactive Power Compensation for Cost And Loss Minimiz...Optimal Power Flow with Reactive Power Compensation for Cost And Loss Minimiz...
Optimal Power Flow with Reactive Power Compensation for Cost And Loss Minimiz...ijeei-iaes
 
Mitigation of Power Quality Problems Using Custom Power Devices: A Review
Mitigation of Power Quality Problems Using Custom Power Devices: A ReviewMitigation of Power Quality Problems Using Custom Power Devices: A Review
Mitigation of Power Quality Problems Using Custom Power Devices: A Reviewijeei-iaes
 
Comparison of Dynamic Stability Response of A SMIB with PI and Fuzzy Controll...
Comparison of Dynamic Stability Response of A SMIB with PI and Fuzzy Controll...Comparison of Dynamic Stability Response of A SMIB with PI and Fuzzy Controll...
Comparison of Dynamic Stability Response of A SMIB with PI and Fuzzy Controll...ijeei-iaes
 
Embellished Particle Swarm Optimization Algorithm for Solving Reactive Power ...
Embellished Particle Swarm Optimization Algorithm for Solving Reactive Power ...Embellished Particle Swarm Optimization Algorithm for Solving Reactive Power ...
Embellished Particle Swarm Optimization Algorithm for Solving Reactive Power ...ijeei-iaes
 
Intelligent Management on the Home Consumers with Zero Energy Consumption
Intelligent Management on the Home Consumers with Zero Energy ConsumptionIntelligent Management on the Home Consumers with Zero Energy Consumption
Intelligent Management on the Home Consumers with Zero Energy Consumptionijeei-iaes
 
Analysing Transportation Data with Open Source Big Data Analytic Tools
Analysing Transportation Data with Open Source Big Data Analytic ToolsAnalysing Transportation Data with Open Source Big Data Analytic Tools
Analysing Transportation Data with Open Source Big Data Analytic Toolsijeei-iaes
 
A Pattern Classification Based approach for Blur Classification
A Pattern Classification Based approach for Blur ClassificationA Pattern Classification Based approach for Blur Classification
A Pattern Classification Based approach for Blur Classificationijeei-iaes
 
Computing Some Degree-Based Topological Indices of Graphene
Computing Some Degree-Based Topological Indices of GrapheneComputing Some Degree-Based Topological Indices of Graphene
Computing Some Degree-Based Topological Indices of Grapheneijeei-iaes
 
A Lyapunov Based Approach to Enchance Wind Turbine Stability
A Lyapunov Based Approach to Enchance Wind Turbine StabilityA Lyapunov Based Approach to Enchance Wind Turbine Stability
A Lyapunov Based Approach to Enchance Wind Turbine Stabilityijeei-iaes
 
Fuzzy Control of a Large Crane Structure
Fuzzy Control of a Large Crane StructureFuzzy Control of a Large Crane Structure
Fuzzy Control of a Large Crane Structureijeei-iaes
 
Site Diversity Technique Application on Rain Attenuation for Lagos
Site Diversity Technique Application on Rain Attenuation for LagosSite Diversity Technique Application on Rain Attenuation for Lagos
Site Diversity Technique Application on Rain Attenuation for Lagosijeei-iaes
 
Impact of Next Generation Cognitive Radio Network on the Wireless Green Eco s...
Impact of Next Generation Cognitive Radio Network on the Wireless Green Eco s...Impact of Next Generation Cognitive Radio Network on the Wireless Green Eco s...
Impact of Next Generation Cognitive Radio Network on the Wireless Green Eco s...ijeei-iaes
 
Music Recommendation System with User-based and Item-based Collaborative Filt...
Music Recommendation System with User-based and Item-based Collaborative Filt...Music Recommendation System with User-based and Item-based Collaborative Filt...
Music Recommendation System with User-based and Item-based Collaborative Filt...ijeei-iaes
 
A Real-Time Implementation of Moving Object Action Recognition System Based o...
A Real-Time Implementation of Moving Object Action Recognition System Based o...A Real-Time Implementation of Moving Object Action Recognition System Based o...
A Real-Time Implementation of Moving Object Action Recognition System Based o...ijeei-iaes
 
Wireless Sensor Network for Radiation Detection
Wireless Sensor Network for Radiation DetectionWireless Sensor Network for Radiation Detection
Wireless Sensor Network for Radiation Detectionijeei-iaes
 

More from ijeei-iaes (20)

An Heterogeneous Population-Based Genetic Algorithm for Data Clustering
An Heterogeneous Population-Based Genetic Algorithm for Data ClusteringAn Heterogeneous Population-Based Genetic Algorithm for Data Clustering
An Heterogeneous Population-Based Genetic Algorithm for Data Clustering
 
Development of a Wireless Sensors Network for Greenhouse Monitoring and Control
Development of a Wireless Sensors Network for Greenhouse Monitoring and ControlDevelopment of a Wireless Sensors Network for Greenhouse Monitoring and Control
Development of a Wireless Sensors Network for Greenhouse Monitoring and Control
 
Analysis of Genetic Algorithm for Effective power Delivery and with Best Upsurge
Analysis of Genetic Algorithm for Effective power Delivery and with Best UpsurgeAnalysis of Genetic Algorithm for Effective power Delivery and with Best Upsurge
Analysis of Genetic Algorithm for Effective power Delivery and with Best Upsurge
 
Design for Postplacement Mousing based on GSM in Long-Distance
Design for Postplacement Mousing based on GSM in Long-DistanceDesign for Postplacement Mousing based on GSM in Long-Distance
Design for Postplacement Mousing based on GSM in Long-Distance
 
Investigation of TTMC-SVPWM Strategies for Diode Clamped and Cascaded H-bridg...
Investigation of TTMC-SVPWM Strategies for Diode Clamped and Cascaded H-bridg...Investigation of TTMC-SVPWM Strategies for Diode Clamped and Cascaded H-bridg...
Investigation of TTMC-SVPWM Strategies for Diode Clamped and Cascaded H-bridg...
 
Optimal Power Flow with Reactive Power Compensation for Cost And Loss Minimiz...
Optimal Power Flow with Reactive Power Compensation for Cost And Loss Minimiz...Optimal Power Flow with Reactive Power Compensation for Cost And Loss Minimiz...
Optimal Power Flow with Reactive Power Compensation for Cost And Loss Minimiz...
 
Mitigation of Power Quality Problems Using Custom Power Devices: A Review
Mitigation of Power Quality Problems Using Custom Power Devices: A ReviewMitigation of Power Quality Problems Using Custom Power Devices: A Review
Mitigation of Power Quality Problems Using Custom Power Devices: A Review
 
Comparison of Dynamic Stability Response of A SMIB with PI and Fuzzy Controll...
Comparison of Dynamic Stability Response of A SMIB with PI and Fuzzy Controll...Comparison of Dynamic Stability Response of A SMIB with PI and Fuzzy Controll...
Comparison of Dynamic Stability Response of A SMIB with PI and Fuzzy Controll...
 
Embellished Particle Swarm Optimization Algorithm for Solving Reactive Power ...
Embellished Particle Swarm Optimization Algorithm for Solving Reactive Power ...Embellished Particle Swarm Optimization Algorithm for Solving Reactive Power ...
Embellished Particle Swarm Optimization Algorithm for Solving Reactive Power ...
 
Intelligent Management on the Home Consumers with Zero Energy Consumption
Intelligent Management on the Home Consumers with Zero Energy ConsumptionIntelligent Management on the Home Consumers with Zero Energy Consumption
Intelligent Management on the Home Consumers with Zero Energy Consumption
 
Analysing Transportation Data with Open Source Big Data Analytic Tools
Analysing Transportation Data with Open Source Big Data Analytic ToolsAnalysing Transportation Data with Open Source Big Data Analytic Tools
Analysing Transportation Data with Open Source Big Data Analytic Tools
 
A Pattern Classification Based approach for Blur Classification
A Pattern Classification Based approach for Blur ClassificationA Pattern Classification Based approach for Blur Classification
A Pattern Classification Based approach for Blur Classification
 
Computing Some Degree-Based Topological Indices of Graphene
Computing Some Degree-Based Topological Indices of GrapheneComputing Some Degree-Based Topological Indices of Graphene
Computing Some Degree-Based Topological Indices of Graphene
 
A Lyapunov Based Approach to Enchance Wind Turbine Stability
A Lyapunov Based Approach to Enchance Wind Turbine StabilityA Lyapunov Based Approach to Enchance Wind Turbine Stability
A Lyapunov Based Approach to Enchance Wind Turbine Stability
 
Fuzzy Control of a Large Crane Structure
Fuzzy Control of a Large Crane StructureFuzzy Control of a Large Crane Structure
Fuzzy Control of a Large Crane Structure
 
Site Diversity Technique Application on Rain Attenuation for Lagos
Site Diversity Technique Application on Rain Attenuation for LagosSite Diversity Technique Application on Rain Attenuation for Lagos
Site Diversity Technique Application on Rain Attenuation for Lagos
 
Impact of Next Generation Cognitive Radio Network on the Wireless Green Eco s...
Impact of Next Generation Cognitive Radio Network on the Wireless Green Eco s...Impact of Next Generation Cognitive Radio Network on the Wireless Green Eco s...
Impact of Next Generation Cognitive Radio Network on the Wireless Green Eco s...
 
Music Recommendation System with User-based and Item-based Collaborative Filt...
Music Recommendation System with User-based and Item-based Collaborative Filt...Music Recommendation System with User-based and Item-based Collaborative Filt...
Music Recommendation System with User-based and Item-based Collaborative Filt...
 
A Real-Time Implementation of Moving Object Action Recognition System Based o...
A Real-Time Implementation of Moving Object Action Recognition System Based o...A Real-Time Implementation of Moving Object Action Recognition System Based o...
A Real-Time Implementation of Moving Object Action Recognition System Based o...
 
Wireless Sensor Network for Radiation Detection
Wireless Sensor Network for Radiation DetectionWireless Sensor Network for Radiation Detection
Wireless Sensor Network for Radiation Detection
 

Recently uploaded

Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 

Recently uploaded (20)

9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 

Segmentation of Handwritten Chinese Character Strings Based on improved Algorithm Liu

  • 1. Indonesian Journal of Electrical Engineering and Informatics (IJEEI) Vol. 2, No. 4, December 2014, pp. 161~169 ISSN: 2089-3272  161 Received June 16, 2014; Revised August 10, 2014; Accepted September 2, 2014 Segmentation of Handwritten Chinese Character Strings Based on improved Algorithm Liu Zhihua Cai, Jiangqing Wang*, Yangguang Sun College of Computer Science, South-Central University for Nationalities, Wuhan 430074, China *Corresponding author, e-mail: hunterpaper2199@gmail.com Abstract Algorithm Liu attracts high attention because of its high accuracy in segmentation of Japanese postal address. But the disadvantages, such as complexity and difficult implementation of algorithm, etc. have an adverse effect on its popularization and application. In this paper, the author applies the principles of algorithm Liu to handwritten Chinese character segmentation according to the characteristics of the handwritten Chinese characters, based on deeply study on algorithm Liu.In the same time, the author put forward the judgment criterion of Segmentation block classification and adhering mode of the handwritten Chinese characters.In the process of segmentation, text images are seen as the sequence made up of Connected Components (CCs), while the connected components are made up of several horizontal itinerary set of black pixels in image. The author determines whether these parts will be merged into segmentation through analyzing connected components. And then the author does image segmentation through adhering mode based on the analysis of outline edges. Finally cut the text images into character segmentation. Experimental results show that the improved Algorithm Liu obtains high segmentation accuracy and produces a satisfactory segmentation result. Keywords: Character Strings segmentation, Character recognition, Connected components analysis, Merged characters segmentation 1. Introduction Character segmentation is one of the key technologies in text recognition rate. The object of character segmentation can be divided into printed character, handwritten Chinese character, handwritten western character, etc. It is need to choose different segmentation methods for different object generally. The scholars and experts at home and abroad do a lot of researches on segmentation of western character and numbers [1] and have made a lot of valuable achievements. But Chinese character segmentation always focuses on printed character. There are few researches on handwritten Chinese character segmentation, so the corresponding achievements are also few. Because Handwritten Chinese character is different from printed Chinese character. Briefly, printed character is mechanical segmentation of two dimensional planes, which is standard and easy to recognize. While handwriting is lively, vivid and writing model with high artistry, and handwriting pays attention to the shape, generalization and change of stroke image. What’s more, there must be natural and fluent coherent and echo between stroke and stroke. Therefore, there are conditions of script in handwriting Chinese character and strokes are changing, as shown in Fig.1. There is certain tilt in the font in this handwriting image and different spacing between line and line. Meanwhile, the space between word and word is different. Some are written relatively tensely, while others are written relatively sparse. These characteristics are all distinctive from the printed text. Therefore, the segmentation of handwritten Chinese character is more complicated. Certainly, there are some achievements which are worth noticing. A segmentation method which is based on stroke segment drawing is proposed by Tseng L.Y., etc. [2]. They adopts regulatory computing cost matrix of components merger based on knowledge, and then obtain text image segmentation by using dynamic programming. Chen Hong et al. [3] make use of the characteristics of width of characters and spacing, etc. to do character segmentation in text through the principle of minimum variance. Zhao Yuming et al. [4] propose a handwritten Chinese character segmentation method based on stroke extraction and merging. Zhao et al. [5-6] propose two-stage segmentation method of coarse segmentation and fine segmentation. Jiangqing Wang et al. [7] propose vertical projection characters segmentation based on minimum threshold and curve-fitting. Zhao Shuyan etc. [8] propose merged handwritten
  • 2.  ISSN: 2089-3272 IJEEI Vol. 2, No. 4, December 2014 : 161 – 169 162 Chinese character segmentation based on stroke analysis and background thinning. Cao Wei [9] proposes segmentation of gap algorithm based on multi-threshold and multi- segmentation strategy. Li Xiaoyuan etc. [10] propose merged handwritten Chinese character segmentation based on structural cluster analysis and stroke analysis. These methods promote the recognition of Chinese character to a certain extent, but there are still some shortcomings, such as recognition accuracy and recognition speed, etc., because recognition of Chinese character is under the influence of some objective factors, such as writing quality, writing style, font size, and image quality of handwritten Chinese character on computer, etc. Algorithm Liu [11] is proposed by Cheng-Lin Liu, Masashi Koga and Hiromichi Fujisawa. It is acquired with high accuracy in Japanese postal address segmentation, but its algorithm is complex and its implementation is difficulty, which have negative effects on its popularization and application. In this paper, the author applies the principles of algorithm Liu to handwritten Chinese character segmentation according to the characteristics of the handwritten Chinese characters, based on deeply study on algorithm Liu.In the same time, the author put forward the judgment criterion of Segmentation block classification and adhering mode of the handwritten Chinese characters. In the segmentation process, the text line is considered as the sequence of CCs and each CCs is comprised of a set of black runs. It is determined whether CCs is merged or not by analysis of CCs. Then, the segmentation block are segmented into character segmentation block by touching patterns segmentation based on analysis of partial contours. Experimental results show that the new algorithm has a high precision rate. Figure 1. The image of Handwritten Chinese character 2. Algorithm Liu Liu algorithm is the method of character segmentation based on connected components [11]. In the process of segmentation, text images are seen as the sequence made up of Connected Components (CCs), while the connected components are made up of several horizontal itinerary set of black pixels in image. And then analyze each connected component recursion. The analysis process is that calculate the standard overlap. If the standard overlap of the two parts is positive, this two parts can be merged. Calculating the standard overlap is similar to calculating relevancy to the two parts. Only the standard overlap is high enough, and it thinks that the two parts can be merged. The components which complete the merger are called text segmentation. Because text segmentation after merging exist adhesion owing to handwriting, that is, segmentation after merging contains two character, conducting adhering mode segmentation based on outline edges analysis. Segmentation process is to do outline analysis of text segmentation which is inconsistent with height-width ratio, and detect point of division. If the point of division is not found, judge again whether text segmentation conforms to height-width ratio of compulsory segmentation. If it meets, do compulsory segmentation of it. If not, give up compulsory segmentation. If you could find the point of division, do segmentation
  • 3. IJEEI ISSN: 2089-3272  Segmentation of Handwritten Chinese Character Strings Based on … (Jiangqing Wang) 163 on this text segmentation. Don’t do segmentation on those which are consistent with height- width ratio. Finally, get the complete text segmentation. Algorithm Liu is effective for segmentation of handwriting text image which exist adhering mode, and find point of division and complete segmentation accurately. This algorithm is a complete recognition system of postal address, including image preprocessing, connected component analysis, adhering mode segmentation and a search algorithm based on the classifier of character and the dictionary. The system flowchart is shown in Figure 2: Figure 2. The Character String Recognition (CSR) System The whole system begins with a text image. Its writing direction is from left to right or from top to bottom. The text image is presented by Connected Components (CCs). The system will process text image in sequence until it can output an address phrases. The accuracy of these 3589 actual email texted by the research group is almost to the point of 83.86% and the error rate below 1%. Visibly, this algorithm is effective for the segmentation of Japanese postal address. 3. The Segmentation of Handwritten Chinese Character Based on Algorithm Liu 3.1. Image Preprocessing There may be some factors, such as fuzziness and alteration, etc. affecting handwritten text, and the problems, such as un-clarity and noise, etc. in its image. Therefore, to estimate stroke width of text image from Run Length Histogram of Connected Components in order to control image filtering effect. Firstly, to find stroke width-- maxrl , which appears the most, that is, the maximum probability from Run Length Histogram. But this value does not represent the stroke width of handwriting text image, because there may be other noise or other information contributing to this probability. Therefore, the mean value of studying run length histogram, meanrl , is calculated by the formula 1 below:     1 1 int i mean i i hist i rl hist i           (1) Among them, int( ) is round off the mean value calculated. If max meanrl rl , maxSW rl , otherwise, find a value which satisfies the formula 2 between maxrl and meanrl .  1 1 max 1 ( ) 2 localhist rl hist rl (2) localSW rl ,or maxSW rl .
  • 4.  ISSN: 2089-3272 IJEEI Vol. 2, No. 4, December 2014 : 161 – 169 164 Estimating stroke width of text through runs length histogram. All CCs units which are smaller than SW SW in the black area are removed. If 3SW  , will not do any further smoothing process on text image in order to prevent the fracture of thin lines; Otherwise, if 7SW  , the image will do morphological open handle by using the structure of5 5 ; otherwise, if 5SW  , the image will do morphological open handle by using the structure of3 3 ; if it is no less, the image will do median filter smoothing in the area of3 3 . Doing morphological open handle is effective for eliminating filament and noise. The structure template used is shown in Figure 3: (a) Structure template of 3 3 (b) Structure template of 5 5 Figure 3. Structure template used by morphology Stroke width ( SW ) is the key to get binary image. Once it cannot get correct value of SW , it will affect the final result after entire algorithm processing. To do segmentation of character in image, the wrong stroke width will delete or dispose of the black pixels contained in the image, and then binary image obtained cannot reflect the text messages correctly, and of course, it cannot complete text segmentation. 3.2. Connected Components Analysis Handwritten text image is seen as a sequence of CCs, while each CCs is composed of horizontal stroke set of several black pixels. Because CCs overlap is likely to constitute the same character, calculate its standardized overlapping degree to decide whether to merge it. Calculation process is shown below. Bounding box of components is appointed by coordinate of left, right, upper and lower boundaries. The bounding box of the two parts are respectively ( 1 l x , 1 r x , 1 t y , 1 b y ) and ( 2 l x , 2 r x , 2 t y , 2 b y ). Suppose 1 2 l l x x , if 2 1 l r x x , the two parts overlap, the overlapping degree is 1 2 r l ovlp x x  , span is  1 2max ,r r dist span x x span   , nmovlp is calculated by formula 3 below. 1 2 1 2 ovlp ovlp dist nmovlp w w span         (3) Among them, 1w and 2w respectively represent the width of two parts, dist represents the horizontal distance of the center. If nmovlp , merge the two parts. As long as there is large enough standardized overlapping degree, to merge the current parts and its subsequent parts. Figure 4 shows diagrammatic sketch of merging two parts. 1O and 2O represent the center of the two parts. Merger of components adopts recursive combined method. After recursive merger, each part can be seen as image segmentation. And then detect the potential adhering mode of segmentation and cut it. If you think segmentation is still likely to be multi-model after segmentation is completed, you can force to cut the block according to its projection histogram.
  • 5. IJEEI ISSN: 2089-3272  Segmentation of Handwritten Chinese Character Strings Based on … (Jiangqing Wang) 165 Figure 4. Arc strokes consisting of two segments Connected components are the basic unit for analyzing the text segmentation. There are script and broken in writing. According to the definition of the connected components in the algorithm, connected components are identified as the set of a series of consecutive black pixel point. All points which can be connected can form a connected component in the text image. Here connected points are not only the points connecting up, down, left and right, but also the points connecting the upper left, lower left, upper right and lower right, that is, eight connecting area. Because characters are not written under the situation of “pure” horizontally and “pure” vertically, but written with a certain angle. Determination of connected components is the foundation of merging connected components. By adopting the method in this section, it could merge connected components into segmentation block, and then do adhering mode analysis next. 3.3. Adhering Mode Segmentation 1) The text segmentation block concluded from by the previous two steps does not represent the final result of segmentation. Therefore, there may be “under-segmentation” in text segmentation block at this time. So-called under-segmentation is that the two segmentation blocks which should be independent merge because of the too close writing. This merger is not the defect which algorithm itself design, but the writing. So, detect the adhering mode of segmentation and find out the point of division, in order to separate the two segmentation blocks which are under-segmentation. First estimate the height of text line ( LH ) according to histogram of image segmentation block after CCs merging. The segmentation block produced after connected components analysis combining can be divided into big segmentation block and small segmentation block. The standard of judging small segmentation block is determined by the maximum frequency of the height of segmentation block calculated. The segmentation block which is smaller than 1/2 is regarded as small segmentation block. The reason is that Chinese character is relatively homogeneous. When the segmentation blocks have cut characters out through combination of the connected components analysis, their height will be more uniform. There are several punctuation marks whose height is small after being split out. But the number of segmentation block of punctuation marks is relatively less than that of segmentation block of text. Therefore, it could determine directly the height of small segmentation block according to the height of the segmentation block with the largest frequency. Given the height of small segmentation block had a little effect on character height, just merging segmentation block temporarily, and then the height and width of temporary segmentation are respectively th and tw. The histogram    2 2hist th hist th tw  is updated. The height of text line is regarded as an estimated value of the height mean value of temporary block in the histogram. Suppose the width of segmentation block is SW , the height is SH . If 1hSW LH  or 2/ hSW SH  , this segmentation block is likely to be adhering mode. The height-to-width of Chinese characters is generally stable because Chinese characters are relatively homogeneous. Because there are few characters which are too wide and with height within the text line height LH , or it is almost impossible to exist, it is likely to merge multiple connected components into a text segmentation block for adhesion in the
  • 6.  ISSN: 2089-3272 IJEEI Vol. 2, No. 4, December 2014 : 161 – 169 166 connected component analysis. It can determine the value of parameters, 1h and 2h , through a lot of experiments. It is reasonable to think that there is adhesion in the text segmentation blocks whose height-width ratios don’t conform to these two formulas. Use the domain-specific knowledge for segmentation of adhering mode. The author of algorithm Liu extends the adhering mode proposed by H.lkeda, etc. [12]. He puts forward seven types of horizontal writing, and six types of vertical writing, as shown in Figure 5. (a) horizontal writing adhesion type (b) vertical writing adhesion type Figure 5. Two types of adhesion 2) Detect division point though analysis of local contour shape. The adhering mode is divided into single stroke area and many stroke more area. And then do contour analysis of each single stroke area. Using the technology proposed by Rosenfeld and Johnston [13] to detect the corner point for partial contour. 3) After the local contour shape analysis and segmentation, determine whether each segmentation block need forced segmentation. Forced segmentation is produced because Chinese character is a kind of relatively homogeneous Chinese characters. Therefore, the segmentation block through the second step of connected components analysis and merger is not likely to conform to height-width ratio, that is, the width is too wide, which is almost impossible to exist in the Chinese characters (as shown in Figure 6). Write multiple width of handwritten line segmentation mw as the span from far left to far right in many stroke areas. If 3hmw LH  , this segmentation block need forced segmentation. Define Eigen-Function the formula 4 below:        , 1 , , y y f x b x y b x y b x y          (4) Among them,  ,b x y represents the binary image of segmentation block, and take the position where   cf x x x  ( cx is the center of the multiple width) gets the minimum as the cut-off point. After completing pre-slitting, have one or more successive segmentation blocks combined as candidate character mode. Tag the space between the characters before dictionary matching. Figure 6. Diagrammatic Sketch of Height-width Ratio of Text Segmentation Block
  • 7. IJEEI ISSN: 2089-3272  Segmentation of Handwritten Chinese Character Strings Based on … (Jiangqing Wang) 167 3.4. Analysis of the Segmentation In this section, analyzing the process of the Segmentation by one sample, which is shown in Figure 7: Figure 7. Experimental Sample Images It can see from the sample image in Figure 7 that this sample image consists of those words “商引向深入,切 有效地开展民主 督实 监 , 一步拓进 ”, which contain script, individual character and punctuation mark, almost covering several testing situations of handwritten Chinese characters. It will analyze the sample image by stages below. 1) The experimental results of image preprocessing, as shown in Figure 8 Figure 8. the Experimental Results of Image Preprocessing After image preprocessing, text line image is composed of each black pixel point, which is different from the text line image with different gray value. In the experiment, it gets the stroke width SW , which is the two pixel point. Therefore, it doesn’t process, and convert the original image into binary image. 2) The experimental results of connected components analysis, as shown in Figure 9: Figure 9. the Experimental Results after Connected Components Analysis In analyzing connected components, initially merge connected components recursive into text segmentation block one by one through calculating the standardization overlapping degree nmovlp . In Figure 9, the characters with left and right organization are written with relatively apart, so that those can’t be merged into one text segmentation block. Thus make them be identified as a word in dictionary matching. Those text segmentation blocks with adhering mode, such as “民主” cannot be divided into two segmentation blocks and be regarded as a text segmentation block, because the final cross of “主” is linked to “民”. Therefore, divide text segmentation with adhering mode into two segmentation blocks or compulsorily to find the division point through adhering mode segmentation. For example, the two words “一步” also have the same problem. The segmentation block of punctuation marks obtained from above is very small, which contributes little to height value of the whole text line. It finds 35 connected components and 23 image segmentation blocks in the experiment. It gets the initial segmentation block of text line image through connected components analysis. 4. The Experimental Results and Analysis Because test a large number of sample images in the experiment, only showing the part of results of the segmentation below, the sample images used are shown in Figure 10, Figure 11 and Figure 12.
  • 8.  ISSN: 2089-3272 IJEEI Vol. 2, No. 4, December 2014 : 161 – 169 168 Figure 10. Experimental sample image Figure 11. Experimental sample image Figure 12. Experimental sample images The words in sample images respectively are “品 企 建立了经营 业 索 索票证 等进货检查验证“ ” ” ,“本 哈报 尔滨 月 日电春 前夕节 ,哈 市尔滨1 20 ”, “据了解,目前,北京市 食品安全已 从田 到对 实现 间 ”,. In sample images, these images contain the common situations, such as all kinds of font-style, script and punctuation marks, etc. in handwriting. The corresponding experimental results are shown in Figure 13, Figure 14 and Figure 15: Figure 13. Images after Segmentation Figure 14. Images after Segmentation Figure 15. Images after Segmentation It can conclude from the above experiments that algorithms Liu is very effective for segmentation of handwritten text line image. 5. Conclusion The experiments show that algorithms Liu avoids the influence of noise and the information having nothing to do with text segmentation and ensures the segmentation quality. Based on the judgment criterion of Segmentation block classification and adhering mode according to the characteristics of the handwritten Chinese characters. The algorithm is more effective and accurate for text segmentation with adhering mode. The seven types in horizontal and six types in vertical ensure the search of division point of adhering mode. What’s more, the parameters, 1h , 2h and 3h , in the experiment can be obtained from much experimental
  • 9. IJEEI ISSN: 2089-3272  Segmentation of Handwritten Chinese Character Strings Based on … (Jiangqing Wang) 169 statistics. This adapts to the characteristics of samples, is more targeted, ensures the segmentation effect and makes the algorithm have a better applied cost. Acknowledgments This work is supported by the National Natural Science Foundation of China (60672057, 60975021), the Natural Science Foundation of South-Central University for Nationalities (YZY10006), the Open Foundation of Key Laboratory of Education Ministry of China for Image Processing and Intelligence Control (200906), the Science and Technique Research Guidance Programs of Hubei Provincial Department of Education of China (B20110802), the Fundamental Research Funds for the Central Universities”, South-Central University for Nationalities (CZY12007, CTZ12013), and the Open Foundation of State Key Laboratory of Software Engineering (SKLSE20120934), the Natural Science Foundation of Hubei Province of China (2012FFB07404), the graduate Academic Innovation Foundation of South-Central University for Nationalities.The authors would also like to thank the anonymous reviewers for their valuable comments and constructive suggestions, which helped improve the quality of this paper. References [1] Xiaoyuan LI, Gang Tian, Chao Feng. Segmentation of Touching Characters in Printed Mathematical Expression. Science Technology and Engineering (In Chinese). 2011; 11(3): 628-632. [2] Tseng LY, Chen RC. Segmentation Handwritten Chinese Characters Based on Heuristic Merging of Stroke Bounding Boxes and Dynamic Programming. IEEE Pattern Recognition Letters. 2007; 19: 963-973. [3] Chen Hong. Segmentation and Recognition of Continuous Handwriting Chinese Text. International Journal of Pattern Recognition and Artificial Intelligence. 1998; 12(2): 223-232. [4] Zhao Yuming, Jaing Xingzhi, Shi Pengfei. Algorithm for off-line handwritten Chinese character segmentation based on extracting and knowledge-based merging of stroke bounding boxes. Infrared and Laser Engineering(In Chinese). 2002; 31(1): 23-27. [5] SY Zhao, ZR Chi, PF Shi, H Yan. Two-stage Segmentation of Unconstrained Handwritten Chinese Characters. Pattern Recognition. 2003; 36: 145-156. [6] SY Zhao, ZR Chi, PF Shi, Q Wang. Handwritten Chinese Character Segmentation Using a Twotage Approach. Document Analysis and Recognition. 2001; 6: 179-183. [7] Jiangqing Wang, Wei Cao. Vertical Projection Characters Segmentation Based on Minimum Threshold and Curve-Fitting. Journal of South-Central University for Nationalities (Natural Science Edition. 2011; 30(4): 82-85. [8] Zhao Shuyan, Guo Jie, Shi Pengfei. Segmentation of Connected Handwritten Chinese Characters Based on Stroke Analysis and Background Thinning. Journal of Shanghai Jiaotong University (In Chinese). 2003; 37(9): 1434-1437. [9] Cao Wei. Clearance Segmentation Based on Multi-threshold and Multi-segmentation Strategy. Computer & Digital Engineering (In Chinese). 2011; 39(1): 131-133. [10] Li Xiaoyuan, Yang Fang, Zhang Wangbo. Segmentation of connected handwritten Chinese characters based on structure cluster and strokes analysis. Computer Engineering and Applications (In Chinese). 2008; 44(34): 163-165. [11] Cheng-Lin Liu, Masashi Koga, Hiromichi Fujisawa. Lexicon Driven Segmentation and Recognition of Handwritten Character Strings for Japanese Address Reading. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2002; 11: 1425-1437. [12] Ikeda, Hisashi. A Recognition Method for Touching Japanese Handwritten Characters. International Conference of Document Analysis and Recognition. 1999; 5: 641-644. [13] A Rosenfeld and E Johnston. Angle Detection on Digital Curves. IEEE Trans.Computers. 1973; 22: 875-878.