Fuzzy rule based classification and recognition of handwritten hindi

FUZZY RULE BASED CLASSIFICATION AND RECOGNITION OF HANDWRITTEN HINDI CURVE SCRIPT

Gunjan Singh1, Avinash Pokhriyal1, Sushma Lehri2
1 (Faculty of Management & Computer Application, RBS College, Agra, India)
2 (Professor, IET, Dr. B. R. Ambedkar University, Agra, India)

International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367 (Print), ISSN 0976 – 6375 (Online), Volume 4, Issue 1, January-February (2013), pp. 337-357, © IAEME: www.iaeme.com/ijcet.asp, Journal Impact Factor (2012): 3.9580 (Calculated by GISI), www.jifactor.com

ABSTRACT

This paper presents a novel system for classification and recognition of handwritten Hindi script using a fuzzy rule based approach. Classification and recognition of handwritten Hindi script is a complex task, as characters are cursive in nature and share many similar features. The ability of fuzzy logic to deal with vague and imprecise data makes it appropriate for such problems. In this paper, we focus on two or three letter words without modifiers. Prior to recognition, handwritten words are preprocessed and segmented into individual characters. The performance of an optical character recognition system depends heavily on the procedure used to extract quality features from characters. During the classification stage, characters are classified into seven classes using fuzzy if-then rules based on one of the most important components of Hindi characters: the vertical bar. Features such as curves, lines, junction points and endpoints are used at the recognition stage. A 3x3 mask is used to extract features from the character image. The system was tested on a total of 450 words written by 30 different people. Experimental results show that the proposed method achieves an overall recognition rate of 92.02%. The proposed system has been implemented in the MATLAB 2009 environment.

Keywords: Classification, Fuzzy rule based approach, Handwritten Hindi curve script, Vertical bar, 8-neighbourhood
1. INTRODUCTION

Character recognition is a broad field that studies all types of machine recognition of characters in various application domains. It includes the recognition of machine printed as well as handwritten characters. Machine printed character recognition deals with characters produced by a machine, while handwritten character recognition deals with characters written by a human being, either online or offline. Recognition of machine printed characters is comparatively easy, as the characters have the same size, font and thickness and a proper shape. Handwritten character recognition is difficult because, owing to varied writing styles, characters may differ in size, width and orientation. A comparison of both approaches is given in [1]. In this paper, we present a fuzzy rule based classification and recognition system for handwritten Hindi script.

Hindi is one of the official languages of India. It is the world's third most commonly used language after Chinese and English. Hindi script has 13 vowels ('SWARS') and 33 consonants ('VYANJANS') in its basic character set. All the characters have two common features: (i) their cursive nature and (ii) the presence of a header line ('SHIROREKHA'). The header line is a powerful tool of the Hindi language. These features differentiate the script from English and other Latin scripts. Words are formed by combining characters, half characters and/or modifiers using the header line. Fig. 1 shows the basic character set, a list of modifiers and a few words.

Figure 1. (a) Basic character set, (b) Swars (vowels) and corresponding matras (modifiers) and (c) a few Hindi language words

Nowadays Hindi is used worldwide in many fields such as banking, medicine, science and technology. Many Hindi words are being included in the world's best dictionaries and other vocabulary developing tools. Owing to this increasing popularity, automatic Hindi language recognition systems have become important. Research in this area started in the early 1970s. In 1977, Sethi and Chatterjee [2] presented a constrained recognition system for handwritten Hindi characters. In [3], Sinha and Mahabala presented a syntactic pattern analysis system for the recognition of machine printed and handwritten characters. The first complete OCR system for machine printed characters is presented in [4]. Recognition of handwritten Hindi characters is still difficult for a machine as characters are
cursive in nature and show a lot of similarities, such as the presence of a header line, the presence or absence of a vertical bar, loops and curves. A survey of handwriting recognition was presented by Plamondon and Srihari [5] in 2000. Most of the work is focused on the recognition of individual characters, and little attention has been paid to the recognition of words, sentences or text. Recognition of words is difficult as words must first be segmented into individual characters. In the present paper, we propose a fuzzy rule based classification and recognition system for handwritten Hindi curve script words of two or three letters without modifiers.

Fuzzy logic is an organized method for solving problems that involve vague, ambiguous, imprecise, noisy or missing input data. The concept of fuzzy logic was first given by Dr. Lotfi A. Zadeh in 1965 [13]. According to Dr. Zadeh, fuzzy logic is a mathematical tool for dealing with uncertainty. In contrast to crisp logic, which deals with precise values, it is a form of multi-valued logic that provides a way to deal with approximate reasoning. It therefore gives a machine a better means of simulating human reasoning capabilities. Its ability to deal with approximation makes it appropriate for problems such as handwritten character recognition.

This paper is organized in 5 sections. Section 2 reviews work done in the field of handwritten Hindi character recognition. Section 3 presents the proposed system. Section 4 shows the experimental results. Finally, the conclusion is given in the last section.

2. LITERATURE REVIEW

Hanmandlu et al. [6] presented a fuzzy model based recognition system for handwritten Hindi characters with 90.65% accuracy. The system performs a coarse classification of the preprocessed character image by dividing it into 3x3 windows and then determining the presence and position of the vertical bar. Features are then extracted by applying the box approach. For recognition, an exponential variant of the fuzzy membership function, constructed using the normalized vector distance, is used. Mukherji and Rege [7] presented a shape feature and fuzzy logic based offline handwritten character recognition system for the script with an 86.4% recognition rate. An adaptive thinning algorithm and structural features, such as end points and junction points, are used for segmenting characters into strokes. Crisp and fuzzy features are then extracted for each stroke of the character. Classification is performed in two stages. Pre-classification uses a tree classifier in which characters are classified based upon the presence and position of the vertical line. Final classification and recognition is performed using unordered stroke classification based on mean stroke features. In [8], a handwritten Hindi vowel character recognition system is presented, in which vowels are segmented into five groups using a projection approach. To extract the core character, the header line is removed by applying horizontal projection and modifiers are removed using vertical projection. Feature extraction is done using invariant moments. Holambe and Thool [9] presented a system for the recognition of printed and handwritten Devanagari script using support vector machine and k-nearest neighbour classification techniques. Singh, Mittal and Ghosh [10] evaluated a support vector machine with a radial basis function and a k-nearest neighbour classifier and achieved 93.8% accuracy. Two methods, curvelet transform and character geometry, were used for extracting features.
3. PROPOSED SYSTEM

The proposed system works in six stages: preprocessing, segmentation, normalization, classification, feature extraction and recognition. The flow diagram is shown in Fig. 2.

Figure 2. Flow diagram of the proposed system (Start -> Preprocessing [Scanning, Noise Reduction (Filtering, Dilation, Erosion), Slant Correction, Binarization, Thinning] -> Segmentation -> Normalization -> Classification -> Feature Extraction -> Recognition)

3.1 Preprocessing

During preprocessing, the following operations are performed on the collected data to make it suitable for further processing:
(i) Scanning: Handwritten word samples, collected from various people, are scanned through an optical scanner or camera to convert the data into a gray scale image.
(ii) Noise Reduction: Noise may be introduced in the image during scanning, so the following operations are performed to reduce it:
    (a) Filtering: to reduce noise and false points, a nonlinear spatial filter, the median filter, is applied. A predefined mask is moved over the image and the value of the centre pixel is replaced by the median of the intensity values in the neighbourhood of that pixel [14].
    (b) Dilation: there may be gaps in characters, which are filled by dilation using a structuring element [14].
    (c) Erosion: to eliminate spurious objects from the image, erosion is applied on it.
(iii) Slant Correction: characters in the word may be inclined upwards or downwards, which makes the feature extraction process difficult. Slant correction is therefore done using [12].
(iv) Binarization: In this paper, features are extracted from binary images of characters, so the image must be converted to binary form. Global thresholding is applied for binarization. The method chooses one threshold value for the whole image and sets a pixel to 1 if its value is greater than the threshold and to 0 otherwise.
(v) Thinning: Finally, the binary image is thinned to single pixel width by the method presented in [11].
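The filtering and binarization steps of Section 3.1 can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. The function names, the list-of-lists image representation, the 3x3 window size and the choice to copy border pixels unchanged are all assumptions made for illustration, and the threshold value is passed in by the caller because the paper does not say how it is chosen.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged (an assumption; the paper does not
    describe border handling).
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)
    return out


def binarize(image, threshold):
    """Global thresholding: 1 where the pixel exceeds the threshold, else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


# A single bright speck surrounded by uniform pixels is removed by the filter.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
filtered = median_filter_3x3(noisy)
binary = binarize([[0, 200], [128, 127]], threshold=127)
```

Whether ink ends up as 1 or 0 after thresholding depends on the polarity of the scanned image, so a real pipeline may need to invert the image first.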
3.2 Segmentation

The thinned image of the word is segmented into individual characters using horizontal and vertical projection histograms, as follows:
(i) First, the horizontal histogram is taken to get the upper and lower boundaries of the word.
(ii) Then the vertical histogram is taken to get the region of each character.
(iii) The number of regions may exceed the number of characters in the word. This can happen when the vertical bar of a character is not connected to the rest of the character. In that case, the region of the vertical bar, which has the highest peak value, is considered to be part of the character to its left.

3.3 Normalization

Binary images of individual characters are normalized to 9x9 pixels.

3.4 Classification

All Hindi language characters are made up of mainly three components: header line (SHIROREKHA), vertical bar and curves. In the proposed method, we use the vertical bar component to classify characters. TABLE 1 shows the features (presence or absence, position, length and connectedness of the vertical bar, and number of junction points) on which the character classes are based. A character can belong to one class only.

Table 1: Features used for classification

Feature                                      Symbol   Values
Presence of vertical bar                     VB       P (present), NP (not present)
Position of vertical bar                     POS      M (middle), RE (right end)
Length of vertical bar                       LEN      S (20%-30% of the character width W), L (70%-80% of the character width W)
Connectedness of vertical bar to character   CON      C (connected), NC (not connected)
Number of junction points                    JP       1, 2, 3, 4 or 5

A junction point is a point with 3 or more foreground pixels in its neighbourhood. The methods for extracting these features are given in algorithms VERTICALBAR_INFO and NUM_JUNCTIONPOINTS. A movable 3X3 mask (Fig. 3), which shows the 8-neighbourhood of the pixel P0, is applied on the image.
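Steps (i) and (ii) of the segmentation in Section 3.2 can be sketched in Python as below. This is a hedged illustration, not the authors' code: the function names are hypothetical, the image is assumed to be a list of rows of 0/1 pixels, and columns without any foreground pixel are assumed to separate characters (the paper does not say how the header line, which normally joins the characters of a word, is treated at this stage). The merging rule of step (iii) is not implemented.

```python
def row_bounds(image):
    """Horizontal histogram: first and last row containing foreground pixels.

    Assumes the image contains at least one foreground pixel.
    """
    inked = [r for r, row in enumerate(image) if any(row)]
    return inked[0], inked[-1]


def column_regions(image):
    """Vertical histogram: (start, end) column ranges of consecutive inked columns."""
    profile = [sum(column) for column in zip(*image)]
    regions, start = [], None
    for c, count in enumerate(profile):
        if count and start is None:
            start = c                       # a new region begins
        elif not count and start is not None:
            regions.append((start, c - 1))  # the region ended at the previous column
            start = None
    if start is not None:                   # region touching the right border
        regions.append((start, len(profile) - 1))
    return regions


word = [[0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 1, 0],
        [1, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0]]
top, bottom = row_bounds(word)
regions = column_regions(word)
```

For the toy image, the word spans rows 1 to 2 and splits into two column regions, one per character.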
P8 P1 P2
P7 P0 P3
P6 P5 P4

Figure 3: 3X3 mask

In these algorithms, the following notations are used:
CP -- current pixel
CL -- current location
COUNT_1 -- counter variable to count the number of pixels. Initial value is set to 0.
COUNT_2 -- counter variable to count the number of junction points. Initial value is set to 0.
ROW -- current row number
COL -- current column number

Algorithm VERTICALBAR_INFO
To determine the information about the vertical bar, do the following:
1. Starting from the last column of the first row, i.e. ROW==0 & COL==8, move the mask over the binary image of the character and check:
   (i) IF the pixel is a foreground pixel THEN call it P0.
       IF the number of neighbouring pixels of P0 ≥ 3 and one of them is P5 THEN do the following:
       (a) Set CP = P0.
       (b) Set N = COL.
       (c) Increase COUNT_1 by 1.
   (ii) ELSE move to the next column to the left and repeat step (i) while COL ≥ 4.
2. To identify the presence of the vertical bar, check the value of COUNT_1.
   IF COUNT_1 == 1 THEN VB is P ELSE VB is NP.
3. To identify the position of the vertical bar, check the value of N.
   IF N ≥ 8 THEN POS is RE ELSE POS is M.
4. To identify the length and connectedness of the vertical bar to the character, check POS.
   (i) IF POS == M THEN do the following as long as a foreground pixel P5 is encountered:
       (a) Set P0 = P5.
       (b) Increase COUNT_1 by 1.
   (ii) IF COUNT_1 > 3 THEN LEN is L ELSE LEN is S.
   (iii) IF POS == RE THEN set CP = P0 and check the following as long as a foreground pixel P5 is encountered:
       IF P6 OR P7 OR P8 exists THEN CON is C ELSE CON is NC.
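The mask of Fig. 3 and steps 1 to 3 of VERTICALBAR_INFO can be read as the following Python sketch. It is one possible interpretation of the pseudocode, not the authors' implementation: the names MASK, neighbours and vertical_bar_position are hypothetical, pixels outside the image are assumed to be background, and the scan stops at the first pixel that satisfies the condition of step 1. Step 4 (length and connectedness) is not covered.

```python
# (row, column) offsets of P1..P8 around P0, following the layout of Fig. 3.
MASK = {"P1": (-1, 0), "P2": (-1, 1), "P3": (0, 1), "P4": (1, 1),
        "P5": (1, 0), "P6": (1, -1), "P7": (0, -1), "P8": (-1, -1)}


def neighbours(image, r, c):
    """Return {label: pixel} for the 8-neighbourhood of (r, c).

    Pixels that fall outside the image are treated as background (0).
    """
    rows, cols = len(image), len(image[0])
    result = {}
    for label, (dr, dc) in MASK.items():
        rr, cc = r + dr, c + dc
        result[label] = image[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
    return result


def vertical_bar_position(image):
    """Scan row 0 from the last column leftwards down to column 4.

    Returns (VB, POS): ("P", "RE"), ("P", "M") or ("NP", None).
    """
    last = len(image[0]) - 1          # 8 for the 9x9 normalized image
    for col in range(last, 3, -1):
        if image[0][col]:
            nb = neighbours(image, 0, col)
            if sum(nb.values()) >= 3 and nb["P5"]:
                return ("P", "RE" if col >= last else "M")
    return ("NP", None)


# Header line on row 0 and a vertical bar in column 5 of a 9x9 image.
sample = [[1] * 9] + [[0, 0, 0, 0, 0, 1, 0, 0, 0] for _ in range(8)]
result = vertical_bar_position(sample)
```

Here the pixel at the top of column 5 has the neighbours P7, P3 and P5, so the bar is reported as present in the middle position.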
Algorithm NUM_JUNCTIONPOINTS
To determine the number of junction points, do the following:
1. Starting from the upper left corner pixel, move the mask over the image from left to right.
2. Find the first foreground pixel P0.
   IF the number of neighbouring pixels of P0 ≥ 3 THEN increase COUNT_2 by 1
   ELSE P0 = P3
3. Repeat step 2 until the lower rightmost pixel is reached.
4. Set JP = COUNT_2

Using the above mentioned algorithms, the following fuzzy rules are formed to classify the characters into one of the seven classes. The flow process is shown in Fig. 4.
(i) IF VB == NP THEN character belongs to class A
(ii) IF VB == P AND POS == M AND LEN == L THEN character belongs to class B
(iii) IF VB == P AND POS == M AND LEN == S AND JP < 2 THEN character belongs to class C
(iv) IF VB == P AND POS == M AND LEN == S AND JP ≥ 2 THEN character belongs to class D
(v) IF VB == P AND POS == RE AND CON == NC THEN character belongs to class E
(vi) IF VB == P AND POS == RE AND CON == C AND JP < 4 THEN character belongs to class F
(vii) IF VB == P AND POS == RE AND CON == C AND JP ≥ 4 THEN character belongs to class G
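The junction point count and the seven classification rules can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the authors' code: the function names are hypothetical, the image is a list of rows of 0/1 pixels, every foreground pixel is tested against the "3 or more neighbours" criterion, and the rules are applied exactly as written in the text, as crisp comparisons on the symbols of Table 1.

```python
def count_junction_points(image):
    """Count foreground pixels with 3 or more foreground 8-neighbours."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            # Sum the 3x3 window clipped to the image, minus the pixel itself.
            total = sum(image[rr][cc]
                        for rr in range(max(r - 1, 0), min(r + 2, rows))
                        for cc in range(max(c - 1, 0), min(c + 2, cols)))
            if total - 1 >= 3:
                count += 1
    return count


def classify(vb, pos=None, length=None, con=None, jp=0):
    """Apply rules (i) to (vii) and return the class label 'A'..'G'."""
    if vb == "NP":
        return "A"
    if pos == "M":
        if length == "L":
            return "B"
        return "C" if jp < 2 else "D"
    # Remaining case: POS == "RE"
    if con == "NC":
        return "E"
    return "F" if jp < 4 else "G"


line = [[1, 1, 1, 1]]
t_shape = [[1, 1, 1],
           [0, 1, 0],
           [0, 1, 0]]
jp_line = count_junction_points(line)
jp_t = count_junction_points(t_shape)
label = classify("P", pos="RE", con="C", jp=4)
```

Note that the literal criterion can also count a pixel adjacent to a true branch point, as in the T shape above where two pixels qualify, so the JP thresholds in the rules should be read with that counting convention in mind.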
Figure 4. Flow process of classification (flowchart: read the normalized image of size 9X9; if the vertical bar VB is absent, the character belongs to class A; otherwise read POS; if POS == RE, read CON: CON == NC gives class E, otherwise JP ≥ 4 gives class G and JP < 4 gives class F; if POS is not RE, read LEN: LEN == L gives class B, otherwise JP ≥ 2 gives class D and JP < 2 gives class C. Legend: VB vertical bar of the character, POS position of vertical bar, RE right end, M middle, LEN length of vertical bar, L large, S small, JP junction point, CON connectedness of vertical bar, NC not connected)

3.5 Feature Extraction

Steps for extracting features are given in the following algorithm:

Algorithm FEATURE_REC
1. Remove the header line by applying the following method:
   (i) Apply the 3X3 movable mask on the normalized image and scan the first row from right to left.
   (ii) IF the pixel is a foreground pixel THEN call it P0.
        IF P7 is a foreground pixel OR P0 is an end point OR P0 is a disconnected component THEN SET P0 = 0
        ELSE move to the left pixel.
   The image is scanned from right to left to avoid the deletion of character pixels in certain characters [character images]. These characters, except [character image], may be written in two ways: (a) the header line covers the whole character, or (b) the header line covers only half or a portion of the character. In the first case, this step may delete pixels that are common to the header line and the character, both in the characters mentioned above and in characters such as [character images], and may produce some disconnected components with a small number of pixels.
2. Delete disconnected components as follows:
   (i) Scan the second row of the image from left to right.
   (ii) Find the first foreground pixel P0.
   (iii) IF P3 == 1
             IF no other pixel exists in the 8-neighbourhood of P3 THEN SET P0 = 0 AND P3 = 0
         ELSE IF P5 == 1
             IF no other pixel exists in the 8-neighbourhood of P5 THEN SET P0 = 0 AND P5 = 0
   Fig. 5 shows the process of deleting the header line from a character and its result.

   Figure 5: (a) Character with header line, (b) Character without header line and with disconnected component, (c) Character after removing disconnected component

3. Apply the 3X3 movable mask on the normalized image of the classified character and scan the image row wise from top to bottom. Collect the following information for junction points and end points:
   (i) N1: total number of junction points
   (ii) N2: total number of end points
   (iii) JPi: ith junction point, where i = 1 to N1
   (iv) EPi: ith end point, where i = 1 to N2
   (v) Curve(JPi): curve on ith junction point (Table 2)
   (vi) Curve(EPi): curve on ith end point
   (vii) Line(JPi): line on ith junction point (Table 2)
   (viii) Line(EPi): line on ith end point
   (ix) Loop(JPi): loop on ith junction point
   (x) D1(i): direction of next end point from ith end point
   (xi) D2(i): direction of next junction point from ith junction point

Values and symbols of the different types of curves, lines and loops are given in TABLE 2.

Table 2: Values and symbols for curves, lines and loop

Feature   Value               Symbol
Curve     Left curve          LC
          Upper left curve    ULC
          Lower left curve    LLC
          Right curve         RC
          Upper right curve   URC
          Lower right curve   LRC
          U curve             U
Line      Vertical line       VL
          Horizontal line     HL
          Back slash          BS
Loop      Present             P
          Not present         NP

Different forms of the above mentioned curves, lines and loops are shown in Fig. 6.

In the algorithm CURVE_LINE_LOOP_INFO, the following notations are used:
PS -- starting point
CL -- current location
CP -- current pixel
COUNT -- counter variable. Initial value is set to 0.
Algorithm CURVE_LINE_LOOP_INFO
To determine the nature of the curve, do the following:
Move the mask over the binary image of the classified character row wise from bottom to top. Let P be the first foreground pixel. Call it the current pixel (CP).
1. If CP is a junction point or end point, then check the 8-neighbourhood of CP.
   (a) IF P1 is true THEN (i) Repeat till P1 is encountered (ii) Increase COUNT by 1. ELSE stop.
   (b) IF P3 is true THEN (i) Repeat till P3 is encountered (ii) Increase COUNT by 1. ELSE stop.
   (c) IF P8 is true THEN (i) Repeat till P8 is encountered (ii) Increase COUNT by 1. ELSE stop.
   (d) IF P1 OR P2 is true THEN (i) Repeat till P1 OR P2 is encountered (ii) Increase COUNT by 1. ELSE stop.
   (e) IF P1 OR P8 is true THEN (i) Repeat till P1 OR P8 is encountered (ii) Increase COUNT by 1. ELSE stop.
   (f) IF P2 OR P3 OR P4 is true THEN (i) Repeat till P2 OR P3 OR P4 is encountered (ii) Increase COUNT by 1. ELSE stop.
   (g) IF P4 OR P5 is true THEN (i) Repeat till P4 OR P5 is encountered (ii) Increase COUNT by 1. ELSE stop.
   (h) IF P6 OR P7 OR P8 is true THEN (i) Repeat till P6 OR P7 OR P8 is encountered (ii) Increase COUNT by 1. ELSE stop.
2. Check the following to know the type of curve and line:
   (i) IF step 1(h) is true IF step 1(a) is true IF step 1(f) is true IF COUNT ≥ 3 THEN Curve is LC
   (ii) ELSE IF step 1(e) is true IF step 1(f) is true IF COUNT ≥ 2 THEN Curve is ULC
   (iii) ELSE IF step 1(h) is true IF step 1(e) is true IF COUNT ≥ 2 THEN Curve is LLC
   (iv) ELSE IF step 1(f) is true IF step 1(a) is true IF step 1(h) is true IF COUNT ≥ 3 THEN Curve is RC
   (v) ELSE IF step 1(d) is true IF step 1(h) is true IF COUNT ≥ 2 THEN Curve is URC
   (vi) ELSE IF step 1(f) is true IF step 1(e) is true IF COUNT ≥ 2 THEN Curve is LRC
   (vii) ELSE IF step 1(g) is true IF step 1(h) OR step 1(f) is true IF step 1(d) is true IF COUNT ≥ 3 THEN Curve is U
   (viii) IF step 1(a) is true IF COUNT ≥ 2 THEN Line is VL
   (ix) IF step 1(b) is true IF COUNT ≥ 2 THEN Line is HL
   (x) IF step 1(c) is true IF COUNT ≥ 2 THEN Line is BS
3. If CP is a junction point, then do the following to check the presence of a loop:
   IF step 1(h) is true IF step 1(a) OR step 1(g) is true IF step 1(f) is true IF Pi == CP AND COUNT ≥ 5 THEN Loop is P.
Figure 6: Different types of curves: (a) Left curve (LC), (b) Upper left curve (ULC), (c) Lower left curve (LLC), (d) Right curve (RC), (e) Upper right curve (URC), (f) Lower right curve (LRC), (g) U curve (U), (h) Vertical line (VL), Horizontal line (HL), Backward slash (BS), (i) loop
3.6 Recognition

Fuzzy rules are used for recognition. The class wise rules applied for characters are:

1. IF Class is A
       IF Curve(EP1) == RC THEN character is [character image]
       ELSE IF Curve(JP1) == LRC
           IF N2 == 4 OR D1(3) == P3 THEN character is [character image]
           ELSE character is [character image]
2. IF Class is B
       IF Curve(EP2) == LC THEN character is [character image]
       ELSE IF Curve(EP2) == URC
           IF Curve(JP1) == LC OR Loop(JP1) == P THEN character is [character image]
           ELSE character is [character image]
3. IF Class is C
       IF Curve(EP1) == LC THEN character is [character image]
       ELSE IF Curve(EP1) == RC
           IF N2 == 3 THEN character is [character image]
           ELSE character is [character image]
4. IF Class is D
       IF Curve(EP1) == LC THEN character is [character image]
       ELSE IF Curve(JP1) == LC
           IF N2 < 2 THEN character is [character image]
           ELSE IF N2 == 2 THEN character is [character image]
           ELSE character is [character image]
       ELSE IF Loop(JP1) == P
           IF Curve(JP1) == RC OR URC THEN character is [character image]
           ELSE character is [character image]
5. IF Class is E
       IF Loop(JP1) == P
           IF N1 == 2 THEN character is [character image]
           ELSE character is [character image]
       ELSE IF Curve(EP1) == U THEN character is [character image]
6. IF Class is F
       IF N2 > 3
           IF Curve(EP1) == ULC THEN character is [character image]
           ELSE IF Curve(EP1) == RC OR Curve(EP2) == RC THEN character is [character image]
           ELSE character is [character image]
       ELSE IF N2 == 3
           IF Curve(JP1) == LLC THEN character is [character image]
           ELSE IF Curve(JP1) == U THEN character is [character image]
           ELSE IF Curve(EP1) == ULC THEN character is [character image]
           ELSE character is [character image]
       ELSE IF Curve(JP1) == U THEN character is [character image]
       ELSE IF Curve(JP1) == LLC THEN character is [character image]
       ELSE IF Curve(JP1) == LC OR Loop(JP1) == P THEN character is [character image]
       ELSE character is [character image]
7. IF Class is G
       IF N2 > 4
           IF Curve(EP1) == RC THEN character is [character image]
           ELSE IF Line(EP1) == BS
               IF D2(1) == P3 OR D2(2) == P3 THEN character is [character image]
               ELSE character is [character image]
       ELSE IF N2 == 4
           IF Loop(JP1) == P THEN character is [character image]
           ELSE character is [character image]
       ELSE IF Curve(JP1) == LLC OR U THEN character is [character image]
       ELSE IF Curve(JP1) == LC THEN character is [character image]
       ELSE IF Loop(JP1) == P
           IF Loop(JP3) == P OR Line(EP2) == HL THEN character is [character image]
           ELSE character is [character image]
Table 3: Summary of fuzzy rules for each character (one row per character; the Character column contained glyph images)

Class  N1   N2   Curve(JP)   Curve(EP)  Line(JP)  Line(EP)  Loop(JP)  D1   D2   D3
A      ---  ---  ---         RC         ---       ---       ---       ---  ---  ---
A      ---  ---  LRC         ---        ---       ---       ---       ---  ---  ---
A      ---  4    LRC         ---        ---       ---       ---       P3   ---  ---
B      ---  ---  ---         LC         ---       ---       ---       ---  ---  ---
B      ---  ---  ---         URC        ---       ---       ---       ---  ---  ---
B      ---  ---  LC          URC        ---       ---       P         ---  ---  ---
C      ---  ---  ---         LC         ---       ---       ---       ---  ---  ---
C      ---  ---  ---         RC         ---       ---       ---       ---  ---  ---
C      ---  3    ---         RC         ---       ---       ---       ---  ---  ---
D      ---  ---  ---         LC         ---       ---       ---       ---  ---  ---
D      ---  <2   LC          ---        ---       ---       ---       ---  ---  ---
D      ---  2    LC          ---        ---       ---       ---       ---  ---  ---
D      ---  ---  LC          ---        ---       ---       ---       ---  ---  ---
D      ---  ---  ---         ---        ---       ---       P         ---  ---  ---
D      ---  ---  RC OR URC   ---        ---       ---       P         ---  ---  ---
E      2    ---  ---         ---        ---       ---       P         ---  ---  ---
E      ---  ---  ---         ---        ---       ---       P         ---  ---  ---
E      ---  ---  ---         U          ---       ---       ---       ---  ---  ---
F      ---  >3   ---         ---        ---       ---       ---       ---  ---  ---
F      ---  >3   ---         ULC        ---       ---       ---       ---  ---  ---
F      ---  >3   ---         RC         ---       ---       ---       ---  ---  ---
F      ---  3    ---         ---        ---       ---       ---       ---  ---  ---
F      ---  3    LLC         ---        ---       ---       ---       ---  ---  ---
F      ---  3    U           ---        ---       ---       ---       ---  ---  ---
F      ---  3    ---         ULC        ---       ---       ---       ---  ---  ---
F      ---  <3   ---         ---        ---       ---       ---       ---  ---  ---
F      ---  <3   U           ---        ---       ---       ---       ---  ---  ---
F      ---  <3   LLC         ---        ---       ---       ---       ---  ---  ---
F      ---  <3   LC          ---        ---       ---       P         ---  ---  ---
G      ---  >4   ---         ---        ---       ---       ---       ---  ---  ---
G      ---  >4   ---         RC         ---       ---       ---       ---  ---  ---
G      ---  >4   ---         ---        ---       BS        ---       P3   P3   ---
G      ---  4    ---         ---        ---       ---       ---       ---  ---  ---
G      ---  4    ---         ---        ---       ---       P         ---  ---  ---
G      ---  <4   LLC OR U    ---        ---       ---       ---       ---  ---  ---
G      ---  <4   LC          ---        ---       ---       ---       ---  ---  ---
G      ---  <4   ---         ---        ---       HL        P         ---  ---  ---
G      ---  <4   ---         ---        ---       ---       P         ---  ---  ---
4. EXPERIMENTAL RESULTS

The dataset was created by collecting handwritten word samples from 30 people of different age groups. Each person was asked to write 15 predecided words. A part of the dataset is shown in the following figure.

Figure 7: Word samples taken for experiment
These word samples were scanned using a flat-bed scanner at 300 dpi. Results of the operations performed during the recognition process on the scanned image of a word are shown in the following figure.

Figure 8. Result of operations performed during preprocessing, segmentation and classification on a sample word (panels: original image, filtered image, eroded and dilated image, binarized image, thinned image, segmented image, classification). The three segmented characters of the sample word are classified as follows: VB == P, POS == RE, CON == C, JP ≥ 4 gives class G; VB == NP gives class A; VB == P, POS == RE, CON == C, JP < 4 gives class F.
After classification, the features mentioned in TABLE 2 are extracted for each character by applying algorithm FEATURE_REC, and are then used at the time of recognition. The recognition rate for each word sample and for the proposed method is given in TABLE 4.

Table 4. Average recognition rate of selected words

Sample   Character 1   Character 2   Character 3   Avg. recognition rate
S1       92.15%        94.08%        88.23%        91.48%
S2       94%           90.11%        87.23%        90.44%
S3       90.93%        97.26%        95.06%        94.41%
S4       94.14%        90.17%        90%           91.43%
S5       83.66%        93.96%        92.07%        89.89%
S6       95%           93.48%        84.36%        90.94%
S7       95.22%        92.01%        89.76%        92.33%
S8       96.31%        92.45%        91.19%        93.31%
S9       88.42%        92.31%        94.21%        91.64%
S10      89.75%        83.52%        93.46%        88.91%
S11      90.68%        88.99%        ---           89.83%
S12      96.29%        94.43%        ---           95.36%
S13      88.57%        93.91%        ---           91.24%
S14      96.81%        97.44%        ---           97.12%
S15      87.41%        96.80%        ---           92.10%

Overall Average Recognition Rate: 92.02%
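The averages in Table 4 can be reproduced with a short Python check. The paper does not state how the averages were computed. The sketch below assumes a plain mean over the characters of each word, truncated (not rounded) to two decimals, and an overall rate taken as the truncated mean of the 15 word averages; these assumptions happen to match every figure in the table. Rates are stored as integers in hundredths of a percent to avoid floating point artefacts, and the names are hypothetical.

```python
# Per-character recognition rates from Table 4, in hundredths of a percent.
TABLE_4 = {
    "S1": (9215, 9408, 8823), "S2": (9400, 9011, 8723),
    "S3": (9093, 9726, 9506), "S4": (9414, 9017, 9000),
    "S5": (8366, 9396, 9207), "S6": (9500, 9348, 8436),
    "S7": (9522, 9201, 8976), "S8": (9631, 9245, 9119),
    "S9": (8842, 9231, 9421), "S10": (8975, 8352, 9346),
    "S11": (9068, 8899), "S12": (9629, 9443), "S13": (8857, 9391),
    "S14": (9681, 9744), "S15": (8741, 9680),
}


def truncated_mean(values):
    """Integer mean, truncated towards zero (values are non-negative)."""
    return sum(values) // len(values)


# Average per word, then the average of those word averages.
word_averages = {name: truncated_mean(rates) for name, rates in TABLE_4.items()}
overall = truncated_mean(list(word_averages.values()))

# 30 writers times 15 words gives the 450 word samples mentioned in the abstract.
total_words = 30 * len(TABLE_4)

print(overall / 100)
```

Under these assumptions the script prints 92.02, the overall rate reported in the table.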
Figure 9. Graphical representation of recognition rate of sample words (chart of the average recognition rate, vertical axis from 84 to 98, for samples S1 to S15)

5. CONCLUSION

In this paper, we have presented a novel method for classification and recognition of simple Hindi language two or three letter words without modifiers using a fuzzy rule based approach. Characters are first classified into seven different classes and then recognized class wise. A few misclassification cases arise due to the presence of similarly shaped characters, such as [character image] & [character image] and [character image] & [character image], and of characters which can be written in more than one way, such as [character image] & [character image]. We have extracted features for all the basic characters of the language for the recognition process. The algorithms developed perform well, as the most prominent features, such as vertical bar, curves, loops and lines, are used at the classification and recognition stages. Experimental results verify the significance of the proposed system, with a recognition rate of 92.02%. Fuzzy logic is well suited to this task as it can deal with imprecise, incomplete and vague data efficiently without losing any important information. In future, we will work to improve the recognition rate by placing more emphasis on characters having similar shapes, such as [character images], and on Hindi words with modifiers.

REFERENCES

Journal Papers:
[1] N. Arica and F.T. Yarman-Vural, An overview of character recognition focused on off line hand writing, C99-06-C-203, 2000, IEEE.
[2] I.K. Sethi and B. Chatterjee, Machine recognition of constrained hand printed Devnagari, Pattern Recognition, vol. 9, no. 2, 1977, pp. 69-75.
[3] R.M.K. Sinha and H. Mahabala, Machine recognition of Devnagari script, IEEE Trans. System, Man Cybern. 9, 1979, 435-441.
[4] S. Palit, B.B. Chaudhuri, P.P. Das, B.N. Chatterjee, Pattern Recognition, Image Processing and Computer Vision, Narosa Publishing House, India, 1995, 163-168.
[5] R. Plamondon and S. N. Srihari, "On-line and off-line handwriting recognition: A comprehensive survey", IEEE Trans. Pattern Anal. Mach. Intell., vol. 22(1), 2000, pp. 63-84.
[6] M. Hanmandlu, O.V. R. Murthy and V. K. Madasu, Fuzzy model based recognition of handwritten Hindi characters, 0-7695-3067-2/07, 2007, IEEE.
[7] P. Mukherji and P.P. Rege, Shape Feature and Fuzzy Logic Based Offline Devnagari Handwritten Optical Character Recognition, Journal of Pattern Recognition Research 4, 2009, 52-68.
[8] R.J. Ramteke, Invariant moments based feature extraction for handwritten Devnagari vowel recognition, IJCA (0975-8887), Vol. 1, No. 18, 2010.
[9] A. N. Holambe and R.C. Thool, Printed and handwritten character & number recognition of Devanagari script using SVM and KNN, Int. Journal of Recent Trends in Engineering and Technology, Vol. 3, No. 2, May 2010.
[10] B. Singh, A. Mittal and D. Ghosh, An evaluation of different feature extractors and classifiers for offline handwritten Devnagari character recognition, Journal of Pattern Recognition Research 2, 2011, 269-277.
[11] A. Pokhriyal and S. Lehri, MERIT: Minutiae Extraction Using Rotation Invariant Thinning, International Journal of Engineering Science & Technology, vol. 2(7), 2010, 3225-3235.
[12] Primekumar K.P and Sumam Mary Idicula, "Performance Of On-Line Malayalam Handwritten character Recognition Using HMM and SFAM", International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 115-125, Published by IAEME.

Proceeding Papers:
[12] P. Mukherji, P. P. Rege and L. K. Pradhan, Analytical Verification System for Handwritten Devnagari Script, Proceedings of the Sixth IASTED VIIP, pp. 237-242, Palma De Mallorca, Spain, August 2006.

Books:
[13] S.N. Sivanandam and S. N. Deepa, Principles of Soft Computing (Second Edition, Wiley-India).
[14] R.C. Gonzalez and R.E. Woods, Digital Image Processing (Second Edition, Prentice Hall).