SlideShare a Scribd company logo
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 02 | Feb 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 516
Image to Text Conversion Using Tesseract
Nisha Pawar1, Zainab Shaikh2, Poonam Shinde3, Prof. Y.P. Warke4
1,2,3,4Dept. of Computer Engineering, Marathwada Mitra Mandal’s Institute of Technology, Maharashtra, India
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - In the current world, there isagreatincreaseinthe
utilization of digital technology and various methods are
available for the people to capture images. Such images may
contain important textual data that the user may need to edit
or store digitally. This can be done using Optical Character
Recognition with the help of Tesseract OCR Engine. OCR is a
branch of artificial intelligence that is used in applications to
recognize text from scanned documents or images. The
recognized text can also be converted to audio format to help
visually impaired people hear the information that they wish
to know.
Key Words: Artificial Intelligence, Optical Character
Recognition, Tesseract, text-to-speech
1. INTRODUCTION
Textual information is available in many resources such as
documents, newspapers, faxes, printed information, written
notes, etc. Many people simply scan the document to store
the data in the computers. When a documentisscannedwith
a scanner, it is stored in the form of images. Buttheseimages
are not editable and it is very difficult to find what the user
requires as they will have to go through the whole image,
reading each line and word to determine if it is relevant to
their need. Images also take up morespacethanwordfilesin
the computer. It is essential to be able to store this
information in such a way so that it becomeseasiertosearch
and edit the data. There is a growing demand for
applications that can recognize characters from scanned
documents or captured images and make them editable and
easily accessible.
Artificial intelligence is an area of computer science where a
machine is trained to think and behave like intelligent
human beings. Optical Character Recognition (OCR) is a
branch of artificial intelligence. It is used to detect and
extract characters from scanned documents or images and
convert them to editable form. Earlier methods of OCR used
convolutional neural networks but theyarecomplicatedand
usually best suitable for single characters. These methods
also had a higher error rate. Tesseract OCR Engine makes
use of Long Short Term Memory (LSTM) which is a part of
Recurrent Neural Networks. It is open-source and is more
suitable for handwritten texts. It is also suitable at
recognizing larger portion of text data instead of single
characters. Tesseract OCR Engine significantly reduces
errors created in the process of character recognition.
Tesseract assumes that the input image is a binary image
and processing takes place step-by-step. The first step is to
recognize connected components. Outlines are nested into
blobs. These blobs are organized into text lines. Text lines
are broken according to the pitching. If there is a fixed pitch
between the characters then recognition of text takes place
which is a two-pass process.
An adaptive classifier is used here. Words that are
recognized in the first pass are given to the classifier so that
it can learn from the data and use that information for the
second pass to recognize the words that were left out in the
previous pass. Words that are joined arechoppedand words
that are broken are recognized with the help of A* algorithm
that maintains a priority queue which contains the best
suitable characters. Then, the usercanstorethisinformation
in their computers by saving them in word documents or
notepads that can be edited any time they want.
It is difficult for visually impaired people to read textual
information. Blind people havetomakeuseofBrailletoread.
It would be easier for them to simply listen to theaudioform
of the data. This application can be used to convert textual
data to audio format so that it is easier for people to hear the
information. Google Text-To-Speech API is used to convert
the text information into audio form.
2. EXISTING SYSTEM
There are various methods for OCR. Some of them are:
2.1 Connected components based method
It is a well known method used for text detection from
images. The connected components are extracted with help
of algorithm. The resulting components are thenpartitioned
into clusters. This approach detects pixel differences
between the text and the backgroundofthetextimage.It can
extract and recognize the characters too.
2.2 Sliding window based method
This method is also known as text binarisation process. It
classifies individual pixels as text or background in the
textual images. The method acts as bridge between
localization and recognization by OCR.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 02 | Feb 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 517
Fig -1: Sliding window based method
2.3 Hybrid method
Hybrid method is used for text classification. This approach
detects and recognizes texts in CAPTCHA images. The
strength of CAPTCHAcanbe checked.Thismethod efficiently
detects and recognizes the text with a low false positive.
Fig -2: Captcha
2.4 Edge based method
This method is also known as image processing technique,
which finds boundaries of the images or any other objects
within the images. It works by detecting discontinuities in
brightness. This approach is also used for image
segmentation and data extraction in areas such as image
processing, machine vision and computer vision.
Fig -3: Edge-based method
2.5 Color based method
Color based approach is used for clustering. It consists of
two phases: text detection phase and text extraction phase.
In text detection phase, two features are considered –
homogeneous color and sharp edges, and color based
clustering is used to decompose color edge map of image to
several edge maps, which makes text detection more
accurate. In text extractionphase,thedifference betweenthe
text and background in image is considered.
2.6 Texture based method
It is another approach used for detecting texts in images.
This approach uses Support Vector Machine (SVM) to
analyze the textural properties of texts. This method also
uses continuously adaptive mean shift algorithm
(CAMSHIFT) that results in texture analysis. It combines
both SVM and CAMSHIFT to provide robustand efficienttext
detection.
2.7 Corner based method
Corner based method is used for text extraction method. It
has three stages - a) Computing corner response in multi-
scale space and thresholding it to get the candidate region of
text; b)verifying candidate region by combining color and
size range features and; c) locating the text line using
bounding box. It is two-dimensional feature point whichhas
high curvature in region boundary.
2.8 Stroke based method
This approach is used to detect and recognize text from the
video. It uses text confidence using an edge orientation
variance and oppositeedgepairfeature.Thecomponents are
extracted and grouped into text lines based on text
confidence maps. It can detect multilingual texts in video
with high accuracy.
Fig -4: Stroke-based method
2.9 Semi automatic ground truth generation method
This approach is also used for detecting and recognizingtext
from the videos. It can detect English and Chinese text of
different orientations. It has attributes like: line index, word
index, script type, area, content, type of text and many more.
It is most efficient method to detect texts from the videos.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 02 | Feb 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 518
3. LITERATURE REVIEW
Text detection from images is useful in many real world
applications. The data that is stored in text is huge and there
is need to store this data in such a manner that can be
searched easily whenever required. Eliminationoftheuseof
paper is one of the steps to progress towards a world of
electronics. Also, data that can be converted to audio form is
a way to ease the lives of visually impaired people.
In [2007] Ray Smith published anoverviewofTesseractOCR
Engine. It stated that Tesseract started as a PhD project
sponsored by HP in 1984. In 1987, a second person was
assigned to help build it better. In 1988, HPLabs joint with
Scanner Division project. In 1990, the scanner product was
cancelled and fouryearsaheadHPLabsprojectwascancelled
too. From the year 1995 till the year 2005, Tesseract was in
its dark ages. But in the year 2005, it was open sourced by
HP. In 2006, Google took over it. In 2008, Tesseract
expanded to support six languages. By the year 2016, it was
developed further to makes use of LSTM for the purpose of
OCR.
In [2015] Pratik Madhukar Manwatkar and Dr. Kavita R.
Singh published a technical review on text recognition from
images. It emphasizes on the growing demand of OCR
applications as it necessary in today’s world to store
information digitally so that it can be edited whenever
required. This information can later be searched easily as it
is in digital format. The system takes image as an input,
processes on the image and the output is in the form of
textual data.
In [2016] Akhilesh A. Panchal, Shrugal Varde and M.S. Panse
proposed a character detection and recognition system for
visually impaired people. Theyfocusedon the needof people
that are visually impaired as it is difficult for them to read
text data. This system can be used to extract text data from
shop boards or direction boardsandconveythisinformation
to the user in audio form. The main challenges are the
different fonts of the texts on the natural scene images.
In [2017] Nada Farhani, Naim Terbeh and Mounir Zrigui
published a paper that stated the conversion of different
modalities. Human beings have different modalities such as
gesture, sound, touch and images. It is vital that they can
convert the information between these modalities. The
paper focuses on conversion of image to text and also on
text-to-speech so that the user can hear the information
whenever required.
In [2017] Azmi Can Özgen, Mandana Fasounaki and Hazım
Kemal Ekenel published a paper that stated how text data
can be extracted from both natural scene images and
computer generated images. They make use of Maximally
Stable Extermal Regions for the purpose of text detection
and recognition. This method eliminatedthenon-textpartof
the image so that OCR can be done more efficiently.
In [2018] Sandeep Musale and Vikram Ghiye proposed a
system that is a smart reader for visually impaired people.
Using this system, they can convert the text information to
audio format. This system has an audio interface that the
people with visual problems can use easily. It uses a
combination of OTSU and Canny algorithms for the purpose
for character recognition.
In [2018] Christian Reul, Uwe Springmann, Christoph Wick
and Frank Puppe proposed a method to reduce the errors
generated during OCR process. It includescrossfoldtraining
and voting to recognize the words more accurately.AsLSTM
is introduced now, it is easier to recognize words of old
printed books, handwritten words, blurry or uneven words
with high accuracy. A combination of ground truths and
confidence values are used in this method for optimal
recognition of the characters.
4. CONCLUSION AND FUTURE SCOPE
In this age of technology, there is a huge amount of data and
it keeps on increasing day by day. Even though much of the
data is digital, people still prefer to make use of written
transcripts. However, it is necessary to store this data in
digital format in computers so that it can be accessed and
edited easily by the user. This system can be used for
character recognition from scanned documents so that data
can be digitalized. Also, the data can be converted to audio
form so as to help visually impaired people obtain the data
easily.
In the future, we can expand the system to that is can
recognize more languages, different fonts and also
handwritten notes. Various accents can also be added for
audio data.
REFERENCES
[1] Ray Smith, “An overview of the tesseract OCR engine,”
2005.
[2] Pratik Madhukar Manwatkar, Dr. Kavita R. Singh, “A
technical review on text recognition from images,” IEEE
Sponsored 9th International Conference on Intelligent
Systems and Control (ISCO), 2015.
[3] Akhilesh A. Panchal, Shrugal Varde, M.S. Panse,
“Character detection and recognitionsystemforvisually
impaired people,” IEEE International Conference On
Recent Trends In Electronics Information
Communication Technology, May 20-21, 2016.
[4] Nada Farhani, Naim Terbeh, Mounir Zrigui, “Image to
text conversion: state of the art and extended work,”
IEEE/ACS 14th International Conference on Computer
Systems and Applications, 2017.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 02 | Feb 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 519
[5] Azmi Can Özgen, Mandana Fasounaki, Hazım Kemal
Ekenel, “Text detection in natural and computer-
generated images,” 2017.
[6] Sandeep Musale, Vikram Ghiye, “Smart reader for
visually impaired,” Proceedings of the Second
International Conference on Inventive Systems and
Control (ICISC 2018) IEEE Xplore Compliant - Part
Number:CFP18J06-ART,ISBN:978-1-5386-0807-4;DVD
Part Number:CFP18J06DVD, ISBN:978-1-5386-0806-7.
[7] Christian Reul, Uwe Springmann, ChristophWick,Frank
Puppe, “Improving OCR accuracy onearlyprintedbooks
by utilizing cross fold training and voting,” 13th IAPR
International Workshopon DocumentAnalysisSystems,
2018.
[8] U. Springmann and A. Ludeling, “OCR of historical ¨
printings with an application to building diachronic
corpora: A case study using the RIDGES herbal corpus,”
Digital Humanities Quarterly, vol. 11, no. 2, 2017.
[9] J. C. Handley, “Improving OCR accuracy through
combination: A survey,” in Systems, Man, and
Cybernetics, IEEE, 1998.
[10] F. Boschetti, M. Romanello, A. Babeu, D. Bamman, and G.
Crane, “Improving OCR accuracy for classical critical
editions,” ResearchandAdvancedTechnologyforDigital
Libraries, pp. 156–167, 2009.
[11] Vinyals, Oriol, Toshev, Alexander, Bengio, Samy, and
Erhan, Dumitru, ”Show and tell: A neural image caption
generator”, In CVPR, 2015.
[12] Rafeal C. Ginzalez and Richard E. Woods, “Digital Image
Processing”, Pearson Education, Second Edtition, 2005.

More Related Content

What's hot

Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks Chiranjeevi Adi
 
OCR speech using Labview
OCR speech using LabviewOCR speech using Labview
OCR speech using LabviewBharat Thakur
 
Automatic handwriting recognition
Automatic handwriting recognitionAutomatic handwriting recognition
Automatic handwriting recognitionBIJIT GHOSH
 
IRJET- Wearable AI Device for Blind
IRJET- Wearable AI Device for BlindIRJET- Wearable AI Device for Blind
IRJET- Wearable AI Device for BlindIRJET Journal
 
Optical Character Recognition (OCR) based Retrieval
Optical Character Recognition (OCR) based RetrievalOptical Character Recognition (OCR) based Retrieval
Optical Character Recognition (OCR) based RetrievalBiniam Asnake
 
Optical character recognition IEEE Paper Study
Optical character recognition IEEE Paper StudyOptical character recognition IEEE Paper Study
Optical character recognition IEEE Paper StudyEr. Ashish Pandey
 
Optical Character Recognition (OCR) System
Optical Character Recognition (OCR) SystemOptical Character Recognition (OCR) System
Optical Character Recognition (OCR) Systemiosrjce
 
A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUES
A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUESA STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUES
A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUESijcsitcejournal
 
Handwriting Recognition Using Deep Learning and Computer Version
Handwriting Recognition Using Deep Learning and Computer VersionHandwriting Recognition Using Deep Learning and Computer Version
Handwriting Recognition Using Deep Learning and Computer VersionNaiyan Noor
 
Handwritten character recognition in
Handwritten character recognition inHandwritten character recognition in
Handwritten character recognition inijaia
 
Final Report on Optical Character Recognition
Final Report on Optical Character Recognition Final Report on Optical Character Recognition
Final Report on Optical Character Recognition Vidyut Singhania
 
Optical Character Recognition
Optical Character RecognitionOptical Character Recognition
Optical Character RecognitionRahul Mallik
 
Handwriting Recognition
Handwriting RecognitionHandwriting Recognition
Handwriting RecognitionBindu Karki
 
A SURVEY ON DEEP LEARNING METHOD USED FOR CHARACTER RECOGNITION
A SURVEY ON DEEP LEARNING METHOD USED FOR CHARACTER RECOGNITIONA SURVEY ON DEEP LEARNING METHOD USED FOR CHARACTER RECOGNITION
A SURVEY ON DEEP LEARNING METHOD USED FOR CHARACTER RECOGNITIONIJCIRAS Journal
 
Rule based algorithm for handwritten characters recognition
Rule based algorithm for handwritten characters recognitionRule based algorithm for handwritten characters recognition
Rule based algorithm for handwritten characters recognitionRanda Elanwar
 
OCR (Optical Character Recognition)
OCR (Optical Character Recognition) OCR (Optical Character Recognition)
OCR (Optical Character Recognition) IstiaqueBinIslam
 
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...iosrjce
 
A Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition SystemA Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition Systemiosrjce
 

What's hot (20)

Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks
 
OCR speech using Labview
OCR speech using LabviewOCR speech using Labview
OCR speech using Labview
 
Automatic handwriting recognition
Automatic handwriting recognitionAutomatic handwriting recognition
Automatic handwriting recognition
 
OCR Text Extraction
OCR Text ExtractionOCR Text Extraction
OCR Text Extraction
 
IRJET- Wearable AI Device for Blind
IRJET- Wearable AI Device for BlindIRJET- Wearable AI Device for Blind
IRJET- Wearable AI Device for Blind
 
Optical Character Recognition (OCR) based Retrieval
Optical Character Recognition (OCR) based RetrievalOptical Character Recognition (OCR) based Retrieval
Optical Character Recognition (OCR) based Retrieval
 
Optical character recognition IEEE Paper Study
Optical character recognition IEEE Paper StudyOptical character recognition IEEE Paper Study
Optical character recognition IEEE Paper Study
 
Optical Character Recognition (OCR) System
Optical Character Recognition (OCR) SystemOptical Character Recognition (OCR) System
Optical Character Recognition (OCR) System
 
A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUES
A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUESA STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUES
A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUES
 
Handwriting Recognition Using Deep Learning and Computer Version
Handwriting Recognition Using Deep Learning and Computer VersionHandwriting Recognition Using Deep Learning and Computer Version
Handwriting Recognition Using Deep Learning and Computer Version
 
Handwritten character recognition in
Handwritten character recognition inHandwritten character recognition in
Handwritten character recognition in
 
Final Report on Optical Character Recognition
Final Report on Optical Character Recognition Final Report on Optical Character Recognition
Final Report on Optical Character Recognition
 
Optical Character Recognition
Optical Character RecognitionOptical Character Recognition
Optical Character Recognition
 
Handwriting Recognition
Handwriting RecognitionHandwriting Recognition
Handwriting Recognition
 
A SURVEY ON DEEP LEARNING METHOD USED FOR CHARACTER RECOGNITION
A SURVEY ON DEEP LEARNING METHOD USED FOR CHARACTER RECOGNITIONA SURVEY ON DEEP LEARNING METHOD USED FOR CHARACTER RECOGNITION
A SURVEY ON DEEP LEARNING METHOD USED FOR CHARACTER RECOGNITION
 
Handwritten Character Recognition
Handwritten Character RecognitionHandwritten Character Recognition
Handwritten Character Recognition
 
Rule based algorithm for handwritten characters recognition
Rule based algorithm for handwritten characters recognitionRule based algorithm for handwritten characters recognition
Rule based algorithm for handwritten characters recognition
 
OCR (Optical Character Recognition)
OCR (Optical Character Recognition) OCR (Optical Character Recognition)
OCR (Optical Character Recognition)
 
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
 
A Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition SystemA Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition System
 

Similar to IRJET- Image to Text Conversion using Tesseract

Product Label Reading System for visually challenged people
Product Label Reading System for visually challenged peopleProduct Label Reading System for visually challenged people
Product Label Reading System for visually challenged peopleIRJET Journal
 
OPTICAL CHARACTER RECOGNITION IN HEALTHCARE
OPTICAL CHARACTER RECOGNITION IN HEALTHCAREOPTICAL CHARACTER RECOGNITION IN HEALTHCARE
OPTICAL CHARACTER RECOGNITION IN HEALTHCAREIRJET Journal
 
IRJET- Text Extraction from Text Based Image using Android
IRJET- Text Extraction from Text Based Image using AndroidIRJET- Text Extraction from Text Based Image using Android
IRJET- Text Extraction from Text Based Image using AndroidIRJET Journal
 
Smart Assistant for Blind Humans using Rashberry PI
Smart Assistant for Blind Humans using Rashberry PISmart Assistant for Blind Humans using Rashberry PI
Smart Assistant for Blind Humans using Rashberry PIijtsrd
 
IRJET- Detection and Recognition of Hypertexts in Imagery using Text Reco...
IRJET-  	  Detection and Recognition of Hypertexts in Imagery using Text Reco...IRJET-  	  Detection and Recognition of Hypertexts in Imagery using Text Reco...
IRJET- Detection and Recognition of Hypertexts in Imagery using Text Reco...IRJET Journal
 
IRJET- Text Reading for Visually Impaired Person using Raspberry Pi
IRJET- Text Reading for Visually Impaired Person using Raspberry PiIRJET- Text Reading for Visually Impaired Person using Raspberry Pi
IRJET- Text Reading for Visually Impaired Person using Raspberry PiIRJET Journal
 
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUI
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUIIRJET- Text Line Detection in Camera Caputerd Images using Matlab GUI
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUIIRJET Journal
 
­­­­Cursive Handwriting Recognition System using Feature Extraction and Artif...
­­­­Cursive Handwriting Recognition System using Feature Extraction and Artif...­­­­Cursive Handwriting Recognition System using Feature Extraction and Artif...
­­­­Cursive Handwriting Recognition System using Feature Extraction and Artif...IRJET Journal
 
Handwritten Digit Recognition Using CNN
Handwritten Digit Recognition Using CNNHandwritten Digit Recognition Using CNN
Handwritten Digit Recognition Using CNNIRJET Journal
 
IRJET-MText Extraction from Images using Convolutional Neural Network
IRJET-MText Extraction from Images using Convolutional Neural NetworkIRJET-MText Extraction from Images using Convolutional Neural Network
IRJET-MText Extraction from Images using Convolutional Neural NetworkIRJET Journal
 
IRJET- Survey Paper: Image Reader for Blind Person
IRJET- Survey Paper: Image Reader for Blind PersonIRJET- Survey Paper: Image Reader for Blind Person
IRJET- Survey Paper: Image Reader for Blind PersonIRJET Journal
 
Design and Description of Feature Extraction Algorithm for Old English Font
Design and Description of Feature Extraction Algorithm for Old English FontDesign and Description of Feature Extraction Algorithm for Old English Font
Design and Description of Feature Extraction Algorithm for Old English FontIRJET Journal
 
IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING
IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNINGIMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING
IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNINGIRJET Journal
 
IRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten CharactersIRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten CharactersIRJET Journal
 
Text Recognition Using Tesseract OCR Facilitating Multilingualism: A Review
Text Recognition Using Tesseract OCR Facilitating Multilingualism: A ReviewText Recognition Using Tesseract OCR Facilitating Multilingualism: A Review
Text Recognition Using Tesseract OCR Facilitating Multilingualism: A ReviewIRJET Journal
 
CONTENT RECOVERY AND IMAGE RETRIVAL IN IMAGE DATABASE CONTENT RETRIVING IN TE...
CONTENT RECOVERY AND IMAGE RETRIVAL IN IMAGE DATABASE CONTENT RETRIVING IN TE...CONTENT RECOVERY AND IMAGE RETRIVAL IN IMAGE DATABASE CONTENT RETRIVING IN TE...
CONTENT RECOVERY AND IMAGE RETRIVAL IN IMAGE DATABASE CONTENT RETRIVING IN TE...Editor IJMTER
 
A Deep Learning Approach to Recognize Cursive Handwriting
A Deep Learning Approach to Recognize Cursive HandwritingA Deep Learning Approach to Recognize Cursive Handwriting
A Deep Learning Approach to Recognize Cursive HandwritingIRJET Journal
 
IRJET- Offline Transcription using AI
IRJET-  	  Offline Transcription using AIIRJET-  	  Offline Transcription using AI
IRJET- Offline Transcription using AIIRJET Journal
 
IRJET- Sign Language Interpreter
IRJET- Sign Language InterpreterIRJET- Sign Language Interpreter
IRJET- Sign Language InterpreterIRJET Journal
 
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...IRJET- Neural Network based Script Recognition using Wavelet Features: An App...
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...IRJET Journal
 

Similar to IRJET- Image to Text Conversion using Tesseract (20)

Product Label Reading System for visually challenged people
Product Label Reading System for visually challenged peopleProduct Label Reading System for visually challenged people
Product Label Reading System for visually challenged people
 
OPTICAL CHARACTER RECOGNITION IN HEALTHCARE
OPTICAL CHARACTER RECOGNITION IN HEALTHCAREOPTICAL CHARACTER RECOGNITION IN HEALTHCARE
OPTICAL CHARACTER RECOGNITION IN HEALTHCARE
 
IRJET- Text Extraction from Text Based Image using Android
IRJET- Text Extraction from Text Based Image using AndroidIRJET- Text Extraction from Text Based Image using Android
IRJET- Text Extraction from Text Based Image using Android
 
Smart Assistant for Blind Humans using Rashberry PI
Smart Assistant for Blind Humans using Rashberry PISmart Assistant for Blind Humans using Rashberry PI
Smart Assistant for Blind Humans using Rashberry PI
 
IRJET- Detection and Recognition of Hypertexts in Imagery using Text Reco...
IRJET-  	  Detection and Recognition of Hypertexts in Imagery using Text Reco...IRJET-  	  Detection and Recognition of Hypertexts in Imagery using Text Reco...
IRJET- Detection and Recognition of Hypertexts in Imagery using Text Reco...
 
IRJET- Text Reading for Visually Impaired Person using Raspberry Pi
IRJET- Text Reading for Visually Impaired Person using Raspberry PiIRJET- Text Reading for Visually Impaired Person using Raspberry Pi
IRJET- Text Reading for Visually Impaired Person using Raspberry Pi
 
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUI
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUIIRJET- Text Line Detection in Camera Caputerd Images using Matlab GUI
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUI
 
­­­­Cursive Handwriting Recognition System using Feature Extraction and Artif...
­­­­Cursive Handwriting Recognition System using Feature Extraction and Artif...­­­­Cursive Handwriting Recognition System using Feature Extraction and Artif...
­­­­Cursive Handwriting Recognition System using Feature Extraction and Artif...
 
Handwritten Digit Recognition Using CNN
Handwritten Digit Recognition Using CNNHandwritten Digit Recognition Using CNN
Handwritten Digit Recognition Using CNN
 
IRJET-MText Extraction from Images using Convolutional Neural Network
IRJET-MText Extraction from Images using Convolutional Neural NetworkIRJET-MText Extraction from Images using Convolutional Neural Network
IRJET-MText Extraction from Images using Convolutional Neural Network
 
IRJET- Survey Paper: Image Reader for Blind Person
IRJET- Survey Paper: Image Reader for Blind PersonIRJET- Survey Paper: Image Reader for Blind Person
IRJET- Survey Paper: Image Reader for Blind Person
 
Design and Description of Feature Extraction Algorithm for Old English Font
Design and Description of Feature Extraction Algorithm for Old English FontDesign and Description of Feature Extraction Algorithm for Old English Font
Design and Description of Feature Extraction Algorithm for Old English Font
 
IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING
IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNINGIMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING
IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING
 
IRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten CharactersIRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten Characters
 
Text Recognition Using Tesseract OCR Facilitating Multilingualism: A Review
Text Recognition Using Tesseract OCR Facilitating Multilingualism: A ReviewText Recognition Using Tesseract OCR Facilitating Multilingualism: A Review
Text Recognition Using Tesseract OCR Facilitating Multilingualism: A Review
 
CONTENT RECOVERY AND IMAGE RETRIVAL IN IMAGE DATABASE CONTENT RETRIVING IN TE...
CONTENT RECOVERY AND IMAGE RETRIVAL IN IMAGE DATABASE CONTENT RETRIVING IN TE...CONTENT RECOVERY AND IMAGE RETRIVAL IN IMAGE DATABASE CONTENT RETRIVING IN TE...
CONTENT RECOVERY AND IMAGE RETRIVAL IN IMAGE DATABASE CONTENT RETRIVING IN TE...
 
A Deep Learning Approach to Recognize Cursive Handwriting
A Deep Learning Approach to Recognize Cursive HandwritingA Deep Learning Approach to Recognize Cursive Handwriting
A Deep Learning Approach to Recognize Cursive Handwriting
 
IRJET- Offline Transcription using AI
IRJET-  	  Offline Transcription using AIIRJET-  	  Offline Transcription using AI
IRJET- Offline Transcription using AI
 
IRJET- Sign Language Interpreter
IRJET- Sign Language InterpreterIRJET- Sign Language Interpreter
IRJET- Sign Language Interpreter
 
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...IRJET- Neural Network based Script Recognition using Wavelet Features: An App...
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

shape functions of 1D and 2 D rectangular elements.pptx
shape functions of 1D and 2 D rectangular elements.pptxshape functions of 1D and 2 D rectangular elements.pptx
shape functions of 1D and 2 D rectangular elements.pptxVishalDeshpande27
 
ENERGY STORAGE DEVICES INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES  INTRODUCTION UNIT-IENERGY STORAGE DEVICES  INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES INTRODUCTION UNIT-IVigneshvaranMech
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfPipe Restoration Solutions
 
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWING
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWINGBRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWING
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWINGKOUSTAV SARKAR
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
 
Introduction to Casting Processes in Manufacturing
Introduction to Casting Processes in ManufacturingIntroduction to Casting Processes in Manufacturing
Introduction to Casting Processes in Manufacturingssuser0811ec
 
retail automation billing system ppt.pptx
retail automation billing system ppt.pptxretail automation billing system ppt.pptx
retail automation billing system ppt.pptxfaamieahmd
 
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and Clustering
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and ClusteringKIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and Clustering
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and ClusteringDr. Radhey Shyam
 
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...Amil baba
 
Electrostatic field in a coaxial transmission line
Electrostatic field in a coaxial transmission lineElectrostatic field in a coaxial transmission line
Electrostatic field in a coaxial transmission lineJulioCesarSalazarHer1
 
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxCloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxMd. Shahidul Islam Prodhan
 
Construction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptxConstruction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptxwendy cai
 
Democratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek AryaDemocratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek Aryaabh.arya
 
Halogenation process of chemical process industries
Halogenation process of chemical process industriesHalogenation process of chemical process industries
Halogenation process of chemical process industriesMuhammadTufail242431
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdfAhmedHussein950959
 
ONLINE CAR SERVICING SYSTEM PROJECT REPORT.pdf
ONLINE CAR SERVICING SYSTEM PROJECT REPORT.pdfONLINE CAR SERVICING SYSTEM PROJECT REPORT.pdf
ONLINE CAR SERVICING SYSTEM PROJECT REPORT.pdfKamal Acharya
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.PrashantGoswami42
 
Furniture showroom management system project.pdf
Furniture showroom management system project.pdfFurniture showroom management system project.pdf
Furniture showroom management system project.pdfKamal Acharya
 
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdfONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdfKamal Acharya
 

Recently uploaded (20)

shape functions of 1D and 2 D rectangular elements.pptx
shape functions of 1D and 2 D rectangular elements.pptxshape functions of 1D and 2 D rectangular elements.pptx
shape functions of 1D and 2 D rectangular elements.pptx
 
ENERGY STORAGE DEVICES INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES  INTRODUCTION UNIT-IENERGY STORAGE DEVICES  INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES INTRODUCTION UNIT-I
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
 
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWING
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWINGBRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWING
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWING
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Introduction to Casting Processes in Manufacturing
Introduction to Casting Processes in ManufacturingIntroduction to Casting Processes in Manufacturing
Introduction to Casting Processes in Manufacturing
 
retail automation billing system ppt.pptx
retail automation billing system ppt.pptxretail automation billing system ppt.pptx
retail automation billing system ppt.pptx
 
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and Clustering
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and ClusteringKIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and Clustering
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and Clustering
 
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
 
Electrostatic field in a coaxial transmission line
Electrostatic field in a coaxial transmission lineElectrostatic field in a coaxial transmission line
Electrostatic field in a coaxial transmission line
 
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxCloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
 
Construction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptxConstruction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptx
 
Democratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek AryaDemocratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek Arya
 
Halogenation process of chemical process industries
Halogenation process of chemical process industriesHalogenation process of chemical process industries
Halogenation process of chemical process industries
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
ONLINE CAR SERVICING SYSTEM PROJECT REPORT.pdf
ONLINE CAR SERVICING SYSTEM PROJECT REPORT.pdfONLINE CAR SERVICING SYSTEM PROJECT REPORT.pdf
ONLINE CAR SERVICING SYSTEM PROJECT REPORT.pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
 
Furniture showroom management system project.pdf
Furniture showroom management system project.pdfFurniture showroom management system project.pdf
Furniture showroom management system project.pdf
 
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdfONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
 

IRJET- Image to Text Conversion using Tesseract

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 02 | Feb 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 516 Image to Text Conversion Using Tesseract Nisha Pawar1, Zainab Shaikh2, Poonam Shinde3, Prof. Y.P. Warke4 1,2,3,4Dept. of Computer Engineering, Marathwada Mitra Mandal’s Institute of Technology, Maharashtra, India ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract - In the current world, there isagreatincreaseinthe utilization of digital technology and various methods are available for the people to capture images. Such images may contain important textual data that the user may need to edit or store digitally. This can be done using Optical Character Recognition with the help of Tesseract OCR Engine. OCR is a branch of artificial intelligence that is used in applications to recognize text from scanned documents or images. The recognized text can also be converted to audio format to help visually impaired people hear the information that they wish to know. Key Words: Artificial Intelligence, Optical Character Recognition, Tesseract, text-to-speech 1. INTRODUCTION Textual information is available in many resources such as documents, newspapers, faxes, printed information, written notes, etc. Many people simply scan the document to store the data in the computers. When a documentisscannedwith a scanner, it is stored in the form of images. Buttheseimages are not editable and it is very difficult to find what the user requires as they will have to go through the whole image, reading each line and word to determine if it is relevant to their need. Images also take up morespacethanwordfilesin the computer. It is essential to be able to store this information in such a way so that it becomeseasiertosearch and edit the data. There is a growing demand for applications that can recognize characters from scanned documents or captured images and make them editable and easily accessible. Artificial intelligence is an area of computer science where a machine is trained to think and behave like intelligent human beings. Optical Character Recognition (OCR) is a branch of artificial intelligence. It is used to detect and extract characters from scanned documents or images and convert them to editable form. Earlier methods of OCR used convolutional neural networks but theyarecomplicatedand usually best suitable for single characters. These methods also had a higher error rate. Tesseract OCR Engine makes use of Long Short Term Memory (LSTM) which is a part of Recurrent Neural Networks. It is open-source and is more suitable for handwritten texts. It is also suitable at recognizing larger portion of text data instead of single characters. Tesseract OCR Engine significantly reduces errors created in the process of character recognition. Tesseract assumes that the input image is a binary image and processing takes place step-by-step. The first step is to recognize connected components. Outlines are nested into blobs. These blobs are organized into text lines. Text lines are broken according to the pitching. If there is a fixed pitch between the characters then recognition of text takes place which is a two-pass process. An adaptive classifier is used here. Words that are recognized in the first pass are given to the classifier so that it can learn from the data and use that information for the second pass to recognize the words that were left out in the previous pass. Words that are joined arechoppedand words that are broken are recognized with the help of A* algorithm that maintains a priority queue which contains the best suitable characters. Then, the usercanstorethisinformation in their computers by saving them in word documents or notepads that can be edited any time they want. It is difficult for visually impaired people to read textual information. Blind people havetomakeuseofBrailletoread. It would be easier for them to simply listen to theaudioform of the data. This application can be used to convert textual data to audio format so that it is easier for people to hear the information. Google Text-To-Speech API is used to convert the text information into audio form. 2. EXISTING SYSTEM There are various methods for OCR. Some of them are: 2.1 Connected components based method It is a well known method used for text detection from images. The connected components are extracted with help of algorithm. The resulting components are thenpartitioned into clusters. This approach detects pixel differences between the text and the backgroundofthetextimage.It can extract and recognize the characters too. 2.2 Sliding window based method This method is also known as text binarisation process. It classifies individual pixels as text or background in the textual images. The method acts as bridge between localization and recognization by OCR.
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 02 | Feb 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 517 Fig -1: Sliding window based method 2.3 Hybrid method Hybrid method is used for text classification. This approach detects and recognizes texts in CAPTCHA images. The strength of CAPTCHAcanbe checked.Thismethod efficiently detects and recognizes the text with a low false positive. Fig -2: Captcha 2.4 Edge based method This method is also known as image processing technique, which finds boundaries of the images or any other objects within the images. It works by detecting discontinuities in brightness. This approach is also used for image segmentation and data extraction in areas such as image processing, machine vision and computer vision. Fig -3: Edge-based method 2.5 Color based method Color based approach is used for clustering. It consists of two phases: text detection phase and text extraction phase. In text detection phase, two features are considered – homogeneous color and sharp edges, and color based clustering is used to decompose color edge map of image to several edge maps, which makes text detection more accurate. In text extractionphase,thedifference betweenthe text and background in image is considered. 2.6 Texture based method It is another approach used for detecting texts in images. This approach uses Support Vector Machine (SVM) to analyze the textural properties of texts. This method also uses continuously adaptive mean shift algorithm (CAMSHIFT) that results in texture analysis. It combines both SVM and CAMSHIFT to provide robustand efficienttext detection. 2.7 Corner based method Corner based method is used for text extraction method. It has three stages - a) Computing corner response in multi- scale space and thresholding it to get the candidate region of text; b)verifying candidate region by combining color and size range features and; c) locating the text line using bounding box. It is two-dimensional feature point whichhas high curvature in region boundary. 2.8 Stroke based method This approach is used to detect and recognize text from the video. It uses text confidence using an edge orientation variance and oppositeedgepairfeature.Thecomponents are extracted and grouped into text lines based on text confidence maps. It can detect multilingual texts in video with high accuracy. Fig -4: Stroke-based method 2.9 Semi automatic ground truth generation method This approach is also used for detecting and recognizingtext from the videos. It can detect English and Chinese text of different orientations. It has attributes like: line index, word index, script type, area, content, type of text and many more. It is most efficient method to detect texts from the videos.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 02 | Feb 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 518 3. LITERATURE REVIEW Text detection from images is useful in many real world applications. The data that is stored in text is huge and there is need to store this data in such a manner that can be searched easily whenever required. Eliminationoftheuseof paper is one of the steps to progress towards a world of electronics. Also, data that can be converted to audio form is a way to ease the lives of visually impaired people. In [2007] Ray Smith published anoverviewofTesseractOCR Engine. It stated that Tesseract started as a PhD project sponsored by HP in 1984. In 1987, a second person was assigned to help build it better. In 1988, HPLabs joint with Scanner Division project. In 1990, the scanner product was cancelled and fouryearsaheadHPLabsprojectwascancelled too. From the year 1995 till the year 2005, Tesseract was in its dark ages. But in the year 2005, it was open sourced by HP. In 2006, Google took over it. In 2008, Tesseract expanded to support six languages. By the year 2016, it was developed further to makes use of LSTM for the purpose of OCR. In [2015] Pratik Madhukar Manwatkar and Dr. Kavita R. Singh published a technical review on text recognition from images. It emphasizes on the growing demand of OCR applications as it necessary in today’s world to store information digitally so that it can be edited whenever required. This information can later be searched easily as it is in digital format. The system takes image as an input, processes on the image and the output is in the form of textual data. In [2016] Akhilesh A. Panchal, Shrugal Varde and M.S. Panse proposed a character detection and recognition system for visually impaired people. Theyfocusedon the needof people that are visually impaired as it is difficult for them to read text data. This system can be used to extract text data from shop boards or direction boardsandconveythisinformation to the user in audio form. The main challenges are the different fonts of the texts on the natural scene images. In [2017] Nada Farhani, Naim Terbeh and Mounir Zrigui published a paper that stated the conversion of different modalities. Human beings have different modalities such as gesture, sound, touch and images. It is vital that they can convert the information between these modalities. The paper focuses on conversion of image to text and also on text-to-speech so that the user can hear the information whenever required. In [2017] Azmi Can Özgen, Mandana Fasounaki and Hazım Kemal Ekenel published a paper that stated how text data can be extracted from both natural scene images and computer generated images. They make use of Maximally Stable Extermal Regions for the purpose of text detection and recognition. This method eliminatedthenon-textpartof the image so that OCR can be done more efficiently. In [2018] Sandeep Musale and Vikram Ghiye proposed a system that is a smart reader for visually impaired people. Using this system, they can convert the text information to audio format. This system has an audio interface that the people with visual problems can use easily. It uses a combination of OTSU and Canny algorithms for the purpose for character recognition. In [2018] Christian Reul, Uwe Springmann, Christoph Wick and Frank Puppe proposed a method to reduce the errors generated during OCR process. It includescrossfoldtraining and voting to recognize the words more accurately.AsLSTM is introduced now, it is easier to recognize words of old printed books, handwritten words, blurry or uneven words with high accuracy. A combination of ground truths and confidence values are used in this method for optimal recognition of the characters. 4. CONCLUSION AND FUTURE SCOPE In this age of technology, there is a huge amount of data and it keeps on increasing day by day. Even though much of the data is digital, people still prefer to make use of written transcripts. However, it is necessary to store this data in digital format in computers so that it can be accessed and edited easily by the user. This system can be used for character recognition from scanned documents so that data can be digitalized. Also, the data can be converted to audio form so as to help visually impaired people obtain the data easily. In the future, we can expand the system to that is can recognize more languages, different fonts and also handwritten notes. Various accents can also be added for audio data. REFERENCES [1] Ray Smith, “An overview of the tesseract OCR engine,” 2005. [2] Pratik Madhukar Manwatkar, Dr. Kavita R. Singh, “A technical review on text recognition from images,” IEEE Sponsored 9th International Conference on Intelligent Systems and Control (ISCO), 2015. [3] Akhilesh A. Panchal, Shrugal Varde, M.S. Panse, “Character detection and recognitionsystemforvisually impaired people,” IEEE International Conference On Recent Trends In Electronics Information Communication Technology, May 20-21, 2016. [4] Nada Farhani, Naim Terbeh, Mounir Zrigui, “Image to text conversion: state of the art and extended work,” IEEE/ACS 14th International Conference on Computer Systems and Applications, 2017.
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 02 | Feb 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 519 [5] Azmi Can Özgen, Mandana Fasounaki, Hazım Kemal Ekenel, “Text detection in natural and computer- generated images,” 2017. [6] Sandeep Musale, Vikram Ghiye, “Smart reader for visually impaired,” Proceedings of the Second International Conference on Inventive Systems and Control (ICISC 2018) IEEE Xplore Compliant - Part Number:CFP18J06-ART,ISBN:978-1-5386-0807-4;DVD Part Number:CFP18J06DVD, ISBN:978-1-5386-0806-7. [7] Christian Reul, Uwe Springmann, ChristophWick,Frank Puppe, “Improving OCR accuracy onearlyprintedbooks by utilizing cross fold training and voting,” 13th IAPR International Workshopon DocumentAnalysisSystems, 2018. [8] U. Springmann and A. Ludeling, “OCR of historical ¨ printings with an application to building diachronic corpora: A case study using the RIDGES herbal corpus,” Digital Humanities Quarterly, vol. 11, no. 2, 2017. [9] J. C. Handley, “Improving OCR accuracy through combination: A survey,” in Systems, Man, and Cybernetics, IEEE, 1998. [10] F. Boschetti, M. Romanello, A. Babeu, D. Bamman, and G. Crane, “Improving OCR accuracy for classical critical editions,” ResearchandAdvancedTechnologyforDigital Libraries, pp. 156–167, 2009. [11] Vinyals, Oriol, Toshev, Alexander, Bengio, Samy, and Erhan, Dumitru, ”Show and tell: A neural image caption generator”, In CVPR, 2015. [12] Rafeal C. Ginzalez and Richard E. Woods, “Digital Image Processing”, Pearson Education, Second Edtition, 2005.