SlideShare a Scribd company logo
1 of 3
Download to read offline
International Journal of Modern Engineering Research (IJMER)
               www.ijmer.com         Vol.2, Issue.6, Nov-Dec. 2012 pp-4091-4093       ISSN: 2249-6645

          Character Recognition System Based On Android Smart Phone
                          Soon-kak Kwon1, Hyun-jun An2, Young-hwan Choi, 3
                         Department of Computer Software Engineering, Dongeui University, Korea

 ABSTRACT : In this paper, we propose the character                II.          PROPOSED CHARACTER RECOGNITION
 recognition method using optical character reader                                         METHOD
 technology for the smart phone application. The camera                      In order to create a character recognition system
 within Android smart phone captures the document and              based on Android smart phone, Tesseract-OCR [2] and
 then the OCR is applied according to language database.           Mezzofanti were used. Tesseract-OCR 2.03 version supports
 As some language is added to the database, the character of       some languages such as English. From the 3.00 version, the
 the various languages can be easily recognized. From              Korean language is supported. It internally utilizes the
 simulation results, we can see the results of tests in English,   image processing library called as leptonica. Tesseract-OCR
 Korean, Japanese, Chinese recognition.                            is used in AOSP (Android Open Source Project) and eyes-
                                                                   free project. Mezzofanti is open-source Android Appication.
Keywords: Character recognition, Smart phone, Picture              It recognizes the characters in the image taken from the
Quality                                                            camera by using the Tesseract library. The app currently is
                                                                   in version 1.0.3 and uses Tesseract 2.03 version, so it has the
                I.    INTRODUCTION                                 disadvantage that does not support many languages
          Current OCR (optical character reader) technology        recognition. Therefore, we implement Mezzofanti as
[1-6] is widely applied to Zip recognition, product                Tesseract 3.0 version. Tesseract 3.0 is build to NDK and
                                                                   then the source code of the packages associated with the
inspection and classification, document recognition, vehicle
                                                                   Tesseract is downloaded. Mezzofanti source should be
number recognition, drawings recognition, slips and checks
automatically entering. OCR technology in the United States        modified. Eclipse or Ant is used to build Mezzofanti.
                                                                             Mezzofanti application and dictionary and pre-
of America was finished in one-stage from starting in the
1950s to early 1970s. In the 1980s, two-stage neural               learning data files have to be installed. Mezzofanti for
network and VLSI design were progressed. OCR technology            Eclipse or Ant is to build, and installed in Android smart
of Japan began to develop Zip automatic recognition device         phone. Mezzofanti and Tesseract can add simply a language
                                                                   that is not supported by default; the database only for that
in the 1960s. Since 1966, Pattern Information Processing
                                                                   language can be easily applied. Proposed system can support
System project became the instrument of development by
                                                                   full mode and line mode. Full mode is to be recognized for
participating many companies.
                                                                   the entire document, and lines mode can be recognized for
          On-lain OCR research to recognize at the same
                                                                   one line of document. The case of the character
time with handwriting was the first attempt in 1959. There
                                                                   misrecognized in results screen can be modified separately;
are many character recognition system such as mail sorter of
                                                                   it can increase the recognition rate.
Germany's Siemens, auto-inspection system of Japan's NEC,
face, fingerprint recognition, document processing system of
USA’s National Institute of Science and Technology, the            Fig.1 shows the screen shots for the proposed recognition
field of artificial intelligence, document pattern recognition /   system based on smart phone.
analysis, automatic processing of checks, number / word /
string recognition of Canada Concordia University.
          Most of the character recognition program will be
recognized through the input image with a scanner or a
digital camera and computer software. There is a problem in
the spatial size of the computer and scanner. If you do not
have a scanner and a digital camera, a hardware problem
occurs. In order to overcome the limitations of computer
occupying a large space, character recognition system based
on smart phone is proposed. Character recognition software
developed by smart phones with an emphasis on mobility
and portability, spatial, hardware, financial limitations can                              (a) Full mode
be solved. But because the performances of smart phone and
computer are different, the speed of massive character
recognition is slow. Hardware speeds up the development of
smart phones; this issue seems to be resolved as soon as
possible. In this paper, the character recognition method is
presented by using OCR technology and smart phone.
          The organization of this paper is as follows.
Section II provides the proposed character recognition
method. Section III shows the simulation results of the
proposed method about recognition rate for various                                         (b) Line mode
languages. In Section IV, we summarize the main results.

                                                      www.ijmer.com                                                  4091 | Page
International Journal of Modern Engineering Research (IJMER)
                www.ijmer.com             Vol.2, Issue.6, Nov-Dec. 2012 pp-4091-4093           ISSN: 2249-6645
Fig 1. Screen shots for the proposed recognition system Table I. Recognition for English
based on smart phone
                                                                                                 Number of      Recognition
                                                                        Document type
      button : Line mode and Full mode can be changed. In                                        characters          rate
line mode only recognizes the letters inside the white area.                 Courier                 20            82.7 %
Recognizing the character of the white areas, then move on               Dark lighting               10            13.1 %
to the results screen.                                                        Tilt                    7            66.4 %
                                                                          Special fonts               5             9.3 %
      button : Recognize the letters of the screen.                      Small font size              5            54.2 %
The recognition result is shown as follows. Fig 2 shows                Wide letter space             14            75.1 %
original document with English language. After being                  Narrow letter space            14            51.8 %
captured by smart phone camera, the data is processed by
binarization for object segmentation. Then we can see the                  Because English is the completion character, this
recognition results on the screen of smart phone as shown in reason seems to affect recognition rates. Result of testing in
Fig. 4.                                                           a dark place is not good. The following test was tilted letters.
                                                                  If we exceed 30 ° inclined angle of characters, the
                                                                  recognition success rate was too low. So, the characters of
                                                                  10 ° ~ 30 ° were tested. The case of a sentence of less than
                                                                  30 ° was not recognized satisfactorily but recognized in
                                                                  some degree. Spatial fonts are required according to the
                                                                  shape in the font database. A relatively simple form of the
                                                                  character in the simulation of the character size was
                                                                  accurately recognized, but the case of 'Q, R, G, M' seemed
                                                                  to be difficult to recognize. A broad statement of the
     Fig. 2. Example of original document for recognition
                                                                  characters interval showed good result instead of relatively
                                                                  narrow letter spacing.
                                                                  2) Korean
                                                                  Table II shows the recognition rate for Korean language.

                                                                  Table II. Recognition for Korean

                                                                                                  Number of       Recognition
                                                                         Document type
                                                                                                  characters         rate
            Fig. 3. Binarization of captured data
                                                                            Courier                  20             65.4 %
                                                                         Dark lighting               10             5.2 %
                                                                              Tilt                    7             43.1 %
                                                                          Special fonts               5             4.8 %
                                                                         Small font size              5             44.2 %
                                                                        Wide letter space            14             64.7 %
                                                                       Narrow letter space           14             39.8 %

                                                                            Test results showed good recognition results.
                                                                  Character recognition rate was high for relatively simple
                                                                  characters consisting of consonants and vowels. Character
   Fig. 4. Screen shot of an example of recognition result        combination of Consonant + vowel + consonant into 3
                                                                  pieces showed less recognition rate, depending on the form
           III.     SIMULATION RESULTS                            of the characters. According to the intensity of the light, the
         For performance of recognition, we simulate the
                                                                  results were similar to the English test. Similarly, the slope
proposed character recognition system to various characters
                                                                  of the characters beyond 30 ° was difficult to obtain good
such as English, Korean, Japanese, and Chinese.
                                                                  recognition rate. Some credibility to test unusual fonts could
We calculate the recognition rate (R) as performance              fall. Unusual font standards were vague. Recognition
criterion of recognition;                                         success rate was low. The end of characters are made up of
                                                                  straight lines gave satisfactory recognition results but special
        Number of correctly recognized character                  fonts were difficult to be recognized. Fonts and characters in
   R
             Total number of character                            order to recognize should be added to the database for each
                                                                  font. The sentence of a wide letter spacing was recognized
1) English                                                        without difficulty, but for a sentence of between narrow
English language was recognized relatively high. We can           characters, depending on the spacing between the characters,
see the recognition rate from the Table I.                        the result was much different. Test showed slightly different

                                                     www.ijmer.com                                                   4092 | Page
International Journal of Modern Engineering Research (IJMER)
                www.ijmer.com            Vol.2, Issue.6, Nov-Dec. 2012 pp-4091-4093          ISSN: 2249-6645
results, depending on the characteristics of each character. test were progressed in Chinese Simplified character based
The character having the white space to the right of the on such as Times New Roman, Courier fonts. Tesseract
letters such as “ㅏ, ㅑ” showed some results, but vice versa database was used for Chinese Simplified testing. Chinese
was somewhat different.                                          character forms of Korean similarly, but the number of
3) Japanese                                                      combinations of letters surpasses number of characters for
Table III shows the recognition rate for Japanese language.      each of Korean consonants, vowels. Documents of Courier,
                                                                 wide margins between characters, and simple form were
Table II. Recognition for Japanese                               similarly recognized relatively good. Because Kanji is
                                                                 complex, even if successful, the recognition rate was not
                               Number of       Recognition       high.
        Document type
                                characters          rate
           Courier                 20             58.8 %                          IV.    CONCLUSION
                                                                          In this paper, character recognition system was
         Dark lighting             10              6.2 %
                                                                 implemented by using the Android smart phones. The
             Tilt                   7             31.5 %         implementation process of the system was described to
         Special fonts              5             12.1 %         recognize the characters in the document using the camera
        Small font size             5             20.3 %         screen.
                                                                          Photo data taken by a smart phone can be
       Wide letter space           14             49.2 %
                                                                 compared with the database of the system, then the
      Narrow letter space          14             35.9 %         characters can be recognized, the recognized character can
                                                                 be created to a text file to take advantage of the as
          The case of Japanese character is also tested under applications of the Internet and pre-retrieval and various
the same conditions such as Korean, English. Testing fonts strategies Smart phones character recognition system does
are Times New Roman, Courier. Japanese forms a not need hardware such as a computer or a scanner.
completion character as Hiragana and Katakana or a Therefore, there are the advantages that the recognition
combination character as Kanji. So the recognition rate cannot be spatially restricted and simple character
depends on the combination of recognized characters. recognition is possible.
Completion characters such as Hiragana and Katakana
showed a high recognition rate. However, Chinese                             V.     ACKNOWLEDGEMENTS
characters that are diverse in form had a problem to be This work was supported by Dongeui University Foundation
recognized. Except in the case of the simple Kanji, Kanji Grant (2012AA180). Corresponding author: Soon-kak
gets complicated higher, recognition rate dropped Kwon
accordingly. The number of Japanese Kanji was much
smaller compared to the comparison of Chinese Kanji.                                   REFERENCES
Nevertheless, similar to the appearance of a lot of Kanji, this [1] J.L. Blue, G.T. Candela, P.J. Grother, R. Chellappa,
problem seems to have no other choice. Recognition results           C.L. Wilson, Evaluation of pattern classifiers for
showed the same results that are similar to the previous             fingerprint and OCR applications, Pattern Recognition,
simulation of Korean, English. Chronic problem was still at          27(4), 1994, 485–501.
dark lighting, tilted characters. Specially it was difficult [2] R. Smith, An overview of the Tesseract OCR engine,
recognize the Kanji. Simulation according to the font size           Proc. Int. Conf. Document Analysis and Recognition
was also dropped quite a bit from the Kanji recognition. The         (ICDAR 2007), 2007, 629-633.
spacing of letters has yielded good results in relatively, it [3] M. Seeger, Binarising camera images for OCR, Proc.
still seems to recognition success rate varies depending on          Int. Conf. Document Analysis and Recognition), 2001,
the form of the Kanji.                                               54-58.
4) Chinese                                                       [4] K. Wang, J. A. kangas, Character location in scene
Table IV shows the recognition rate for Chinese language.            images from digital camera, Pattern Recognition.,
                                                                     36(10), 2003, 2287-2299.
Table II. Recognition for Chinese                                [5] C. C. Chang, S. M. Hwang, D. J. Buehrer, A shape
                                                                     recognition scheme based on relative distances of
                               Number of       Recognition           feature points from the centroid. Pattern Recognition,
        Document type
                                characters          rate             24(11), 1991. 1053-1063.
           Courier                 20             41.4 %         [6] T. Bernier, J.-A. Landry, A new method for
         Dark lighting             10              3.8 %             representing and matching shapes of natural objects.
                                                                     Pattern Recognition, 36(8), 2003. 1711-1723.
             Tilt                   7             25.2 %
         Special fonts              5                 -
        Small font size             5             11.3 %
       Wide letter space           14             40.0 %
    Narrow letter space           14           17.9 %

       Recognition Chinese Kanji was tested under the
same conditions for other languages. Chinese recognition

                                                  www.ijmer.com                                                4093 | Page

More Related Content

Viewers also liked

презентация уко
презентация укопрезентация уко
презентация уко
uca51
 
C04010 02 1519
C04010 02 1519C04010 02 1519
C04010 02 1519
IJMER
 
Marketing tips
Marketing tipsMarketing tips
Marketing tips
denise2228
 
What is twitter
What is twitterWhat is twitter
What is twitter
denise2228
 
Cu31420423
Cu31420423Cu31420423
Cu31420423
IJMER
 
Writing for pr #3
Writing for pr #3Writing for pr #3
Writing for pr #3
Dicky Ahmad
 
Gi2429352937
Gi2429352937Gi2429352937
Gi2429352937
IJMER
 
Digital foot print
Digital foot printDigital foot print
Digital foot print
eringolden24
 
At32715720
At32715720At32715720
At32715720
IJMER
 
Vision of Daniel 8 - Part 2
Vision of Daniel 8 - Part 2Vision of Daniel 8 - Part 2
Vision of Daniel 8 - Part 2
Robert Taylor
 
Bo31240249
Bo31240249Bo31240249
Bo31240249
IJMER
 
Dv31587594
Dv31587594Dv31587594
Dv31587594
IJMER
 

Viewers also liked (16)

презентация уко
презентация укопрезентация уко
презентация уко
 
C04010 02 1519
C04010 02 1519C04010 02 1519
C04010 02 1519
 
Evaluation of Compressive Strength and Water Absorption of Styrene Butadiene...
Evaluation of Compressive Strength and Water Absorption of  Styrene Butadiene...Evaluation of Compressive Strength and Water Absorption of  Styrene Butadiene...
Evaluation of Compressive Strength and Water Absorption of Styrene Butadiene...
 
Marketing tips
Marketing tipsMarketing tips
Marketing tips
 
Karu.form
Karu.formKaru.form
Karu.form
 
What is twitter
What is twitterWhat is twitter
What is twitter
 
Cu31420423
Cu31420423Cu31420423
Cu31420423
 
Writing for pr #3
Writing for pr #3Writing for pr #3
Writing for pr #3
 
Experimental Study of the Fatigue Strength of Glass fiber epoxy and Chapstan ...
Experimental Study of the Fatigue Strength of Glass fiber epoxy and Chapstan ...Experimental Study of the Fatigue Strength of Glass fiber epoxy and Chapstan ...
Experimental Study of the Fatigue Strength of Glass fiber epoxy and Chapstan ...
 
Gi2429352937
Gi2429352937Gi2429352937
Gi2429352937
 
Digital foot print
Digital foot printDigital foot print
Digital foot print
 
An Efficient System for Traffic Control in Networks Using Virtual Routing Top...
An Efficient System for Traffic Control in Networks Using Virtual Routing Top...An Efficient System for Traffic Control in Networks Using Virtual Routing Top...
An Efficient System for Traffic Control in Networks Using Virtual Routing Top...
 
At32715720
At32715720At32715720
At32715720
 
Vision of Daniel 8 - Part 2
Vision of Daniel 8 - Part 2Vision of Daniel 8 - Part 2
Vision of Daniel 8 - Part 2
 
Bo31240249
Bo31240249Bo31240249
Bo31240249
 
Dv31587594
Dv31587594Dv31587594
Dv31587594
 

Similar to Character Recognition System Based On Android Smart Phone

Intelligent speech based sms system
Intelligent speech based sms systemIntelligent speech based sms system
Intelligent speech based sms system
Kamal Spring
 
Ask mom updated submitted april 2nd
Ask mom updated submitted april 2ndAsk mom updated submitted april 2nd
Ask mom updated submitted april 2nd
Nadia Millington
 
Ask mom updated submitted april 2nd
Ask mom updated submitted april 2ndAsk mom updated submitted april 2nd
Ask mom updated submitted april 2nd
Nadia Millington
 
Mobile camera based text detection and translation
Mobile camera based text detection and translationMobile camera based text detection and translation
Mobile camera based text detection and translation
Vivek Bharadwaj
 
Word Detection & Translation from image on an android device
Word Detection & Translation from image on an android deviceWord Detection & Translation from image on an android device
Word Detection & Translation from image on an android device
Ritwik Kumar
 

Similar to Character Recognition System Based On Android Smart Phone (20)

Signboard Text Translator: A Guide to Tourist
Signboard Text Translator: A Guide to TouristSignboard Text Translator: A Guide to Tourist
Signboard Text Translator: A Guide to Tourist
 
O45018291
O45018291O45018291
O45018291
 
A Translation Device for the Vision Based Sign Language
A Translation Device for the Vision Based Sign LanguageA Translation Device for the Vision Based Sign Language
A Translation Device for the Vision Based Sign Language
 
Intelligent speech based sms system
Intelligent speech based sms systemIntelligent speech based sms system
Intelligent speech based sms system
 
A SMART LANGUAGE TRANSLATION TECHNIQUE USING OCR
A SMART LANGUAGE TRANSLATION TECHNIQUE USING OCRA SMART LANGUAGE TRANSLATION TECHNIQUE USING OCR
A SMART LANGUAGE TRANSLATION TECHNIQUE USING OCR
 
Ask mom updated submitted april 2nd
Ask mom updated submitted april 2ndAsk mom updated submitted april 2nd
Ask mom updated submitted april 2nd
 
Ask mom updated submitted april 2nd
Ask mom updated submitted april 2ndAsk mom updated submitted april 2nd
Ask mom updated submitted april 2nd
 
Online Hand Written Character Recognition
Online Hand Written Character RecognitionOnline Hand Written Character Recognition
Online Hand Written Character Recognition
 
Gyropen report
Gyropen reportGyropen report
Gyropen report
 
IRJET- Text Extraction from Text Based Image using Android
IRJET- Text Extraction from Text Based Image using AndroidIRJET- Text Extraction from Text Based Image using Android
IRJET- Text Extraction from Text Based Image using Android
 
Mobile camera based text detection and translation
Mobile camera based text detection and translationMobile camera based text detection and translation
Mobile camera based text detection and translation
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Mob ocr
Mob ocrMob ocr
Mob ocr
 
Handwriting recognition on Livescribe smartpen
Handwriting recognition on Livescribe smartpenHandwriting recognition on Livescribe smartpen
Handwriting recognition on Livescribe smartpen
 
Handwriting Recognition
Handwriting RecognitionHandwriting Recognition
Handwriting Recognition
 
BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...
BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...
BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...
 
Sign Language Recognition using Mediapipe
Sign Language Recognition using MediapipeSign Language Recognition using Mediapipe
Sign Language Recognition using Mediapipe
 
IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...
IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...
IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...
 
Word Detection & Translation from image on an android device
Word Detection & Translation from image on an android deviceWord Detection & Translation from image on an android device
Word Detection & Translation from image on an android device
 
Real Time Sign Language Detection
Real Time Sign Language DetectionReal Time Sign Language Detection
Real Time Sign Language Detection
 

More from IJMER

Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
IJMER
 
Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...
IJMER
 
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded CylindersMaterial Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
IJMER
 

More from IJMER (20)

A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...
 
Developing Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed DelintingDeveloping Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed Delinting
 
Study & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja FibreStudy & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja Fibre
 
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
 
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
 
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
 
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
 
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
 
Static Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works SimulationStatic Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
 
High Speed Effortless Bicycle
High Speed Effortless BicycleHigh Speed Effortless Bicycle
High Speed Effortless Bicycle
 
Integration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise ApplicationsIntegration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise Applications
 
Microcontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation SystemMicrocontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation System
 
On some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological SpacesOn some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological Spaces
 
Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...
 
Natural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine LearningNatural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine Learning
 
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcessEvolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcess
 
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded CylindersMaterial Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
 
Studies On Energy Conservation And Audit
Studies On Energy Conservation And AuditStudies On Energy Conservation And Audit
Studies On Energy Conservation And Audit
 
An Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDLAn Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDL
 
Discrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One PreyDiscrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One Prey
 

Character Recognition System Based On Android Smart Phone

  • 1. International Journal of Modern Engineering Research (IJMER) www.ijmer.com Vol.2, Issue.6, Nov-Dec. 2012 pp-4091-4093 ISSN: 2249-6645 Character Recognition System Based On Android Smart Phone Soon-kak Kwon1, Hyun-jun An2, Young-hwan Choi, 3 Department of Computer Software Engineering, Dongeui University, Korea ABSTRACT : In this paper, we propose the character II. PROPOSED CHARACTER RECOGNITION recognition method using optical character reader METHOD technology for the smart phone application. The camera In order to create a character recognition system within Android smart phone captures the document and based on Android smart phone, Tesseract-OCR [2] and then the OCR is applied according to language database. Mezzofanti were used. Tesseract-OCR 2.03 version supports As some language is added to the database, the character of some languages such as English. From the 3.00 version, the the various languages can be easily recognized. From Korean language is supported. It internally utilizes the simulation results, we can see the results of tests in English, image processing library called as leptonica. Tesseract-OCR Korean, Japanese, Chinese recognition. is used in AOSP (Android Open Source Project) and eyes- free project. Mezzofanti is open-source Android Appication. Keywords: Character recognition, Smart phone, Picture It recognizes the characters in the image taken from the Quality camera by using the Tesseract library. The app currently is in version 1.0.3 and uses Tesseract 2.03 version, so it has the I. INTRODUCTION disadvantage that does not support many languages Current OCR (optical character reader) technology recognition. Therefore, we implement Mezzofanti as [1-6] is widely applied to Zip recognition, product Tesseract 3.0 version. Tesseract 3.0 is build to NDK and then the source code of the packages associated with the inspection and classification, document recognition, vehicle Tesseract is downloaded. Mezzofanti source should be number recognition, drawings recognition, slips and checks automatically entering. OCR technology in the United States modified. Eclipse or Ant is used to build Mezzofanti. Mezzofanti application and dictionary and pre- of America was finished in one-stage from starting in the 1950s to early 1970s. In the 1980s, two-stage neural learning data files have to be installed. Mezzofanti for network and VLSI design were progressed. OCR technology Eclipse or Ant is to build, and installed in Android smart of Japan began to develop Zip automatic recognition device phone. Mezzofanti and Tesseract can add simply a language that is not supported by default; the database only for that in the 1960s. Since 1966, Pattern Information Processing language can be easily applied. Proposed system can support System project became the instrument of development by full mode and line mode. Full mode is to be recognized for participating many companies. the entire document, and lines mode can be recognized for On-lain OCR research to recognize at the same one line of document. The case of the character time with handwriting was the first attempt in 1959. There misrecognized in results screen can be modified separately; are many character recognition system such as mail sorter of it can increase the recognition rate. Germany's Siemens, auto-inspection system of Japan's NEC, face, fingerprint recognition, document processing system of USA’s National Institute of Science and Technology, the Fig.1 shows the screen shots for the proposed recognition field of artificial intelligence, document pattern recognition / system based on smart phone. analysis, automatic processing of checks, number / word / string recognition of Canada Concordia University. Most of the character recognition program will be recognized through the input image with a scanner or a digital camera and computer software. There is a problem in the spatial size of the computer and scanner. If you do not have a scanner and a digital camera, a hardware problem occurs. In order to overcome the limitations of computer occupying a large space, character recognition system based on smart phone is proposed. Character recognition software developed by smart phones with an emphasis on mobility and portability, spatial, hardware, financial limitations can (a) Full mode be solved. But because the performances of smart phone and computer are different, the speed of massive character recognition is slow. Hardware speeds up the development of smart phones; this issue seems to be resolved as soon as possible. In this paper, the character recognition method is presented by using OCR technology and smart phone. The organization of this paper is as follows. Section II provides the proposed character recognition method. Section III shows the simulation results of the proposed method about recognition rate for various (b) Line mode languages. In Section IV, we summarize the main results. www.ijmer.com 4091 | Page
  • 2. International Journal of Modern Engineering Research (IJMER) www.ijmer.com Vol.2, Issue.6, Nov-Dec. 2012 pp-4091-4093 ISSN: 2249-6645 Fig 1. Screen shots for the proposed recognition system Table I. Recognition for English based on smart phone Number of Recognition Document type button : Line mode and Full mode can be changed. In characters rate line mode only recognizes the letters inside the white area. Courier 20 82.7 % Recognizing the character of the white areas, then move on Dark lighting 10 13.1 % to the results screen. Tilt 7 66.4 % Special fonts 5 9.3 % button : Recognize the letters of the screen. Small font size 5 54.2 % The recognition result is shown as follows. Fig 2 shows Wide letter space 14 75.1 % original document with English language. After being Narrow letter space 14 51.8 % captured by smart phone camera, the data is processed by binarization for object segmentation. Then we can see the Because English is the completion character, this recognition results on the screen of smart phone as shown in reason seems to affect recognition rates. Result of testing in Fig. 4. a dark place is not good. The following test was tilted letters. If we exceed 30 ° inclined angle of characters, the recognition success rate was too low. So, the characters of 10 ° ~ 30 ° were tested. The case of a sentence of less than 30 ° was not recognized satisfactorily but recognized in some degree. Spatial fonts are required according to the shape in the font database. A relatively simple form of the character in the simulation of the character size was accurately recognized, but the case of 'Q, R, G, M' seemed to be difficult to recognize. A broad statement of the Fig. 2. Example of original document for recognition characters interval showed good result instead of relatively narrow letter spacing. 2) Korean Table II shows the recognition rate for Korean language. Table II. Recognition for Korean Number of Recognition Document type characters rate Fig. 3. Binarization of captured data Courier 20 65.4 % Dark lighting 10 5.2 % Tilt 7 43.1 % Special fonts 5 4.8 % Small font size 5 44.2 % Wide letter space 14 64.7 % Narrow letter space 14 39.8 % Test results showed good recognition results. Character recognition rate was high for relatively simple characters consisting of consonants and vowels. Character Fig. 4. Screen shot of an example of recognition result combination of Consonant + vowel + consonant into 3 pieces showed less recognition rate, depending on the form III. SIMULATION RESULTS of the characters. According to the intensity of the light, the For performance of recognition, we simulate the results were similar to the English test. Similarly, the slope proposed character recognition system to various characters of the characters beyond 30 ° was difficult to obtain good such as English, Korean, Japanese, and Chinese. recognition rate. Some credibility to test unusual fonts could We calculate the recognition rate (R) as performance fall. Unusual font standards were vague. Recognition criterion of recognition; success rate was low. The end of characters are made up of straight lines gave satisfactory recognition results but special Number of correctly recognized character fonts were difficult to be recognized. Fonts and characters in R Total number of character order to recognize should be added to the database for each font. The sentence of a wide letter spacing was recognized 1) English without difficulty, but for a sentence of between narrow English language was recognized relatively high. We can characters, depending on the spacing between the characters, see the recognition rate from the Table I. the result was much different. Test showed slightly different www.ijmer.com 4092 | Page
  • 3. International Journal of Modern Engineering Research (IJMER) www.ijmer.com Vol.2, Issue.6, Nov-Dec. 2012 pp-4091-4093 ISSN: 2249-6645 results, depending on the characteristics of each character. test were progressed in Chinese Simplified character based The character having the white space to the right of the on such as Times New Roman, Courier fonts. Tesseract letters such as “ㅏ, ㅑ” showed some results, but vice versa database was used for Chinese Simplified testing. Chinese was somewhat different. character forms of Korean similarly, but the number of 3) Japanese combinations of letters surpasses number of characters for Table III shows the recognition rate for Japanese language. each of Korean consonants, vowels. Documents of Courier, wide margins between characters, and simple form were Table II. Recognition for Japanese similarly recognized relatively good. Because Kanji is complex, even if successful, the recognition rate was not Number of Recognition high. Document type characters rate Courier 20 58.8 % IV. CONCLUSION In this paper, character recognition system was Dark lighting 10 6.2 % implemented by using the Android smart phones. The Tilt 7 31.5 % implementation process of the system was described to Special fonts 5 12.1 % recognize the characters in the document using the camera Small font size 5 20.3 % screen. Photo data taken by a smart phone can be Wide letter space 14 49.2 % compared with the database of the system, then the Narrow letter space 14 35.9 % characters can be recognized, the recognized character can be created to a text file to take advantage of the as The case of Japanese character is also tested under applications of the Internet and pre-retrieval and various the same conditions such as Korean, English. Testing fonts strategies Smart phones character recognition system does are Times New Roman, Courier. Japanese forms a not need hardware such as a computer or a scanner. completion character as Hiragana and Katakana or a Therefore, there are the advantages that the recognition combination character as Kanji. So the recognition rate cannot be spatially restricted and simple character depends on the combination of recognized characters. recognition is possible. Completion characters such as Hiragana and Katakana showed a high recognition rate. However, Chinese V. ACKNOWLEDGEMENTS characters that are diverse in form had a problem to be This work was supported by Dongeui University Foundation recognized. Except in the case of the simple Kanji, Kanji Grant (2012AA180). Corresponding author: Soon-kak gets complicated higher, recognition rate dropped Kwon accordingly. The number of Japanese Kanji was much smaller compared to the comparison of Chinese Kanji. REFERENCES Nevertheless, similar to the appearance of a lot of Kanji, this [1] J.L. Blue, G.T. Candela, P.J. Grother, R. Chellappa, problem seems to have no other choice. Recognition results C.L. Wilson, Evaluation of pattern classifiers for showed the same results that are similar to the previous fingerprint and OCR applications, Pattern Recognition, simulation of Korean, English. Chronic problem was still at 27(4), 1994, 485–501. dark lighting, tilted characters. Specially it was difficult [2] R. Smith, An overview of the Tesseract OCR engine, recognize the Kanji. Simulation according to the font size Proc. Int. Conf. Document Analysis and Recognition was also dropped quite a bit from the Kanji recognition. The (ICDAR 2007), 2007, 629-633. spacing of letters has yielded good results in relatively, it [3] M. Seeger, Binarising camera images for OCR, Proc. still seems to recognition success rate varies depending on Int. Conf. Document Analysis and Recognition), 2001, the form of the Kanji. 54-58. 4) Chinese [4] K. Wang, J. A. kangas, Character location in scene Table IV shows the recognition rate for Chinese language. images from digital camera, Pattern Recognition., 36(10), 2003, 2287-2299. Table II. Recognition for Chinese [5] C. C. Chang, S. M. Hwang, D. J. Buehrer, A shape recognition scheme based on relative distances of Number of Recognition feature points from the centroid. Pattern Recognition, Document type characters rate 24(11), 1991. 1053-1063. Courier 20 41.4 % [6] T. Bernier, J.-A. Landry, A new method for Dark lighting 10 3.8 % representing and matching shapes of natural objects. Pattern Recognition, 36(8), 2003. 1711-1723. Tilt 7 25.2 % Special fonts 5 - Small font size 5 11.3 % Wide letter space 14 40.0 % Narrow letter space 14 17.9 % Recognition Chinese Kanji was tested under the same conditions for other languages. Chinese recognition www.ijmer.com 4093 | Page