This document summarizes a student project on OCR-based image text-to-speech conversion. It introduces the topic and describes using OCR to extract text from images, which is then converted to speech. It includes an abstract and details of related works, and concludes that the proposed system allows visually impaired users to hear text extracted from images efficiently using Tesseract OCR and text-to-speech conversion.
1. OCR BASED IMAGE TEXT TO SPEECH CONVERSION
Presentation on
of
Bachelor Of Engineering
in
Computer Science and Engineering(Data Science)
By
N. Niranjan Reddy P. Indraja
M. Bharadwaja T. Raju Reddy
Under the esteemed guidance
of
T. Anusha, Assistant Professor
TKR COLLEGE OF ENGINEERING & TECHNOLOGY (AUTONOMOUS)
Meerpet, near LB Nagar, Hyderabad, Telangana, India
2022-2023
2. Introduction
๏ Visual text in natural or man-made scenes often carries important and useful information.
Researchers have therefore begun to digitize such images, extract and interpret the text using
dedicated techniques, and then perform text-to-speech (TTS) synthesis.
๏ Optical character recognition (OCR) is employed to recognize and extract the words, and the
extracted text is then converted to speech using a text-to-speech synthesizer.
๏ Text-to-speech (TTS) conversion is the process of converting written text into spoken words using
computer software.
๏ This technology is used to create synthetic voices that can read text aloud, making it accessible to
individuals who are visually impaired, have reading difficulties, or prefer to listen to information
rather than read it.
๏ The process involves analyzing the text and applying natural language processing algorithms to
determine pronunciation, intonation, and emphasis.
๏ The resulting audio output can be customized by adjusting the speed, pitch, and other parameters to
match the user's preferences.
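The customization described above can be sketched with pyttsx3, the library named in the conclusion. This is a minimal sketch, not the project's actual code: the helper names `build_settings` and `speak` and the default values are assumptions made for illustration. Only speaking rate and volume are set here, because pitch control in pyttsx3 depends on the speech driver installed on the system.

```python
def build_settings(rate=150, volume=0.9):
    """Validate speech settings and return them as a dict (hypothetical helper)."""
    if rate <= 0:
        raise ValueError("rate must be a positive number of words per minute")
    # Clamp volume into the 0.0-1.0 range that pyttsx3 expects.
    volume = max(0.0, min(1.0, float(volume)))
    return {"rate": int(rate), "volume": volume}


def speak(text, settings):
    """Read text aloud with the given settings (requires pyttsx3)."""
    import pyttsx3  # imported lazily so build_settings works without it

    engine = pyttsx3.init()
    engine.setProperty("rate", settings["rate"])      # words per minute
    engine.setProperty("volume", settings["volume"])  # 0.0 (mute) to 1.0 (full)
    engine.say(text)
    engine.runAndWait()  # blocks until the speech has finished
```

Calling `speak("Text extracted from the image.", build_settings(rate=130))` would read the sentence at a slower pace than the default.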
3. ABSTRACT
๏ The use of digital technology has increased greatly, and many methods are now available for
people to capture images.
๏ Such images may contain important textual data that the user may need to edit or store digitally.
๏ Manual data entry is time-consuming and error-prone.
๏ Millions of people in the world are blind or visually impaired.
๏ The inability to read has a large impact on the lives of visually impaired people.
๏ The proposed system is cost-efficient and helps visually impaired people hear the text.
๏ The main idea of this project is optical character recognition, which is used to extract the text characters that are
then converted into an audio signal.
4. BASE PAPER
TITLE : OCR Based Image Text to Speech Conversion Using MATLAB
DETAILS : Received June 14, 2018, accepted June 15, 2018, date of publication March 10, 2019.
๏ The main idea of this project is optical character recognition, which is used to convert text characters into an audio
signal.
๏ The text is preprocessed and then recognized by segmenting each character.
๏ Segmentation is followed by extraction of each letter and resizing of the file containing the text.
๏ This text file is then converted into an audio signal.
๏ LINK: OCR Based Image Text to Speech Conversion Using MATLAB | IEEE Conference Publication | IEEE
Xplore
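The segmentation step described above can be illustrated with a vertical projection profile, a common way to split a binarized line of text into characters. This is a sketch under stated assumptions and not the base paper's MATLAB implementation: the function names `segment_columns` and `crop` are hypothetical, and the image is assumed to be already binarized (1 for ink, 0 for background).

```python
def segment_columns(image):
    """Return (start, end) column ranges that contain ink; end is exclusive.

    `image` is a list of rows; each pixel is 1 for ink and 0 for background.
    """
    if not image:
        return []
    width = len(image[0])
    # Vertical projection: count the ink pixels in every column.
    profile = [sum(row[x] for row in image) for x in range(width)]

    segments = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                    # a character begins
        elif count == 0 and start is not None:
            segments.append((start, x))  # a blank column ends the character
            start = None
    if start is not None:
        segments.append((start, width))  # character touching the right edge
    return segments


def crop(image, start, end):
    """Cut the columns start..end-1 out of every row."""
    return [row[start:end] for row in image]


line = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
print(segment_columns(line))  # [(0, 2), (4, 5), (6, 8)]
```

Each cropped segment could then be resized to a fixed size before recognition. Characters that touch each other would need a more careful method than blank-column splitting.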
5. Reference - I
TITLE : Image text to speech conversion in the desired language by translating with Raspberry
Pi
DETAILS : Received December 15, 2016, accepted December 16, 2016, date of publication May 8, 2017, date of
current version May 19, 2017.
๏ This paper is based on a prototype that helps the user hear the contents of text images in the desired language.
๏ It involves extracting text from the image and converting the text to translated speech in the user's desired language.
๏ This is done with a Raspberry Pi and a camera module, using the Tesseract OCR [optical character
recognition] engine, the Google Speech API [application program interface] as the text-to-speech engine, and the
Microsoft translator.
๏ LINK : Image text to speech conversion in the desired language by translating with Raspberry Pi | IEEE Conference
Publication | IEEE Xplore
6. Reference - II
TITLE : Image to Text Conversion Using Tesseract
DETAILS : Received 19 Feb 2019, accepted 23 Feb 2019, date of publication 03 March 2019, date of
current version 18 March 2019.
๏ Textual information is available in many resources such as documents, newspapers, faxes, printed information, written
notes, etc.
๏ Many people simply scan the document to store the data in the computers.
๏ When a document is scanned with a scanner, it is stored in the form of images.
๏ These images are not editable, however, and it is very difficult to find what the user requires, because they have to
go through the whole image, reading each line and word to determine whether it is relevant to their need.
๏ LINK : IRJET-V6I299.pdf
7. Reference - III
TITLE : Detecting text based image with optical character recognition for English
translation and speech using Android
DETAILS : Received 15 December 2023, accepted 16 December 2023, date of publication 30 April 2016,
date of current version 7 May 2016.
๏ In this study, an Android application is developed by integrating the Tesseract OCR engine, Bing translator and the
phone's built-in speech output technology.
๏ The final deliverable was tested by various types of target end users from different language backgrounds, and the
study concluded that the application benefits many users.
๏ LINK :Detecting text based image with optical character recognition for English translation and
speech using Android | IEEE Conference Publication | IEEE Xplore
8. CONCLUSION
๏ This project extracts text from an image or video using Tesseract OCR. With Tesseract OCR, users can
extract text from images using the OCR engine's efficient line and character pattern recognition. The
extracted text is displayed to the user in an editable format.
๏ pyttsx3 is then used to convert the text to audio, with different accents and male and female
voices. Tkinter, a Python module, is used to create a graphical user interface for the project that is
fast, efficient and easy to use.
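The pipeline summarized above can be sketched as follows. This is a minimal sketch, not the project's actual code. It assumes the pytesseract wrapper and Pillow are used to call Tesseract (the slides name only Tesseract OCR), and the function names `clean_ocr_text`, `image_to_text` and `read_aloud` are hypothetical. The Tkinter interface is left out.

```python
def clean_ocr_text(raw):
    """Collapse the line breaks and repeated spaces that OCR output often contains."""
    return " ".join(raw.split())


def image_to_text(image_path):
    """Extract text from an image file.

    Assumes Pillow, pytesseract and the Tesseract binary are installed.
    """
    from PIL import Image
    import pytesseract

    return clean_ocr_text(pytesseract.image_to_string(Image.open(image_path)))


def read_aloud(text, voice_index=0):
    """Speak text with one of the voices installed on the system (requires pyttsx3)."""
    import pyttsx3

    engine = pyttsx3.init()
    voices = engine.getProperty("voices")
    if voices:
        # Which index is a male or female voice depends on the operating system.
        engine.setProperty("voice", voices[voice_index % len(voices)].id)
    engine.say(text)
    engine.runAndWait()
```

A call such as `read_aloud(image_to_text("page.png"))` would read the recognized text aloud, where `page.png` is a placeholder file name. Collapsing whitespace drops paragraph breaks, which is acceptable for speech but not for the editable text view.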