Proceedings of the International Conference on Emerging Trends in Engineering and Management (ICETEM14)
30 – 31 December 2014, Ernakulam, India
2. DESCRIPTION OF THE PROPOSED SYSTEM
Fig. 1 shows the basic block diagram of the device that converts text to speech. The image of the text is captured by
the camera in the image acquisition stage. Contrast is adjusted using an image enhancement technique, and filtering is
applied for noise reduction. The edges in the image are determined with edge detection methods, which gives the
boundaries of the text region, and the image is cropped to them. The text in the image is segmented into separate
letters, and each extracted letter is compared with the letters previously stored in the system for character
recognition. We use a correlation matching technique for this purpose. The corresponding letter is played. The letters
obtained are then separated into words. We set a threshold value for a space: if the value obtained is greater than the
threshold it is considered a letter, otherwise a space, and thus the words are separated. The text-to-speech (TTS)
synthesizer starts with the words in the text, converts each word one by one into speech, and concatenates the results.
Thus the voice is produced from the text.
Figure 1: Block diagram for text to speech production
Fig. 2 shows the proposed block diagram of the device that converts text to Braille script. The image of the text is
captured by the camera in the image acquisition stage. Contrast is adjusted using an image enhancement technique, and
filtering is applied for noise reduction. The edges in the image are determined with edge detection methods, which gives
the boundaries of the text region, and the image is cropped to them. The text in the image is segmented into separate
letters, and each extracted letter is compared with the letters previously stored in the system for character
recognition. We use a correlation matching technique for this purpose. The corresponding letter is played. The letters
obtained are then separated into words. We set a threshold value for a space: if the value obtained is greater than the
threshold it is considered a letter, otherwise a space, and thus the words are separated. The characters are sent to the
Graphical User Interface (GUI) on the PC. The American Standard Code for Information Interchange (ASCII) value of the
character to be read can be sent wirelessly from the PC to the microcontroller using the CC2500 radio-frequency (RF)
transceiver module. The ASCII value received from the PC can be converted to the corresponding Braille code using a
conversion algorithm. This conversion program can be written in Embedded C and stored in the microcontroller. The output
of the microcontroller can be taken from the general-purpose input/output pins of the development board in the form of
voltages, either 0 V or 5 V.
A six-bit number in binary/hexadecimal form, corresponding to the Braille code of the character, can be obtained from
the output of the microcontroller. The output from the six input/output pins can then be given to a tactile display made
of six solenoids that represent the Braille character; the device will have only a single Braille cell. A touchpad can
be interfaced to the device so that the user can navigate through textbooks using gestures such as forward strokes,
backward strokes, and up or down movements.
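The ASCII-to-Braille step can be illustrated with a minimal sketch. It is written in Python for readability rather than in the Embedded C the paper proposes, and it covers only the letters a to z and the space. The bit assignment (bit n-1 drives dot n) and the function names are assumptions made for this example, not the authors' firmware.

```python
# Dot numbers (1-6) for the first ten letters of standard six-dot Braille.
BASE_DOTS = {
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
    "f": (1, 2, 4), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4), "j": (2, 4, 5),
}


def build_table():
    """Build the letter-to-dots table for a-z plus the space."""
    table = dict(BASE_DOTS)
    # k-t repeat a-j with dot 3 added.
    for base, letter in zip("abcdefghij", "klmnopqrst"):
        table[letter] = BASE_DOTS[base] + (3,)
    # u, v, x, y, z repeat a-e with dots 3 and 6 added; w is irregular.
    for base, letter in zip("abcde", "uvxyz"):
        table[letter] = BASE_DOTS[base] + (3, 6)
    table["w"] = (2, 4, 5, 6)
    table[" "] = ()  # blank cell
    return table


BRAILLE_DOTS = build_table()


def ascii_to_braille_bits(code):
    """Map an ASCII code to a six-bit pattern (assumed wiring: bit n-1 = dot n)."""
    dots = BRAILLE_DOTS[chr(code).lower()]
    pattern = 0
    for dot in dots:
        pattern |= 1 << (dot - 1)
    return pattern


def pin_levels(pattern):
    """Return the voltages (0 or 5) on the six output pins, for dots 1 to 6."""
    return [5 if pattern & (1 << i) else 0 for i in range(6)]
```

For example, `ascii_to_braille_bits(ord("c"))` gives 9 (dots 1 and 4), and `pin_levels(9)` gives `[5, 0, 0, 5, 0, 0]`, that is, the two solenoids for dots 1 and 4 are energized.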
[Figure 1 blocks: Camera, Image Acquisition, Image Enhancement, Filtering, Edge Detection, Character Segmentation,
Character Recognition, Separation of Words, Text to Speech Conversion, Speaker]
Figure 2: Block diagram for text to Braille script
2.1. Camera
The camera used here is an ordinary low-cost webcam. The advantage of a webcam is that it can be interfaced very
easily and can take pictures in real time. However, a camera of higher resolution is preferred for better results.
2.2. Image acquisition
MATLAB provides the Image Acquisition Toolbox for acquiring image signals from a video device. For image capture, the
configured device must have a supporting adaptor and must be compatible with the system's resolution and colour formats.
A video object is initialized, and the images are captured at the desired intervals after the required parameters are set.
2.3. Image Enhancement
Image enhancement improves the quality of the digital image. Contrast adjustment is performed by histogram
equalization, using the MATLAB command histeq. This step works only on grayscale images.
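The idea behind histogram equalization can be sketched in a few lines. The example below is plain Python on a grayscale image stored as a list of rows; it implements the textbook cumulative-histogram mapping and is not claimed to reproduce the exact output of MATLAB's histeq. The function name is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as a list of rows of ints."""
    pixels = [p for row in image for p in row]

    # Count how often each grey level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative histogram: number of pixels at or below each level.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Only one grey level present: nothing to stretch.
        return [row[:] for row in image]

    # Map each level so that the cumulative histogram becomes roughly linear.
    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[p] for p in row] for row in image]
```

A low-contrast image whose grey levels lie between 50 and 150 is stretched so that its darkest level maps to 0 and its brightest to 255.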
2.4. Filtering
Median filtering is used. A median filter slides a window over the image and replaces each pixel with the median
intensity in the window. It is an example of non-linear filtering and is often used to remove noise. Median filtering
is very widely used in digital image processing because, under certain conditions, it preserves edges while removing
noise.
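A minimal sketch of a square-window median filter follows, again in plain Python on a list-of-rows grayscale image. Border pixels are handled by repeating the nearest edge pixel, which is an assumption of this example; the paper does not state how borders are treated.

```python
def median_filter(image, size=3):
    """Apply a size x size median filter to a grayscale image (list of rows)."""
    half = size // 2
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            window = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    # Clamp coordinates so border pixels reuse the nearest edge value.
                    rr = min(max(r + dr, 0), height - 1)
                    cc = min(max(c + dc, 0), width - 1)
                    window.append(image[rr][cc])
            window.sort()
            result[r][c] = window[len(window) // 2]
    return result
```

A single bright speck in a uniform region is removed, while a sharp step between a dark and a bright area stays in place, which is the edge-preserving behaviour described above.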
2.5. Edge Detection
This is the image processing step in MATLAB. First, the edges in the image are determined with edge detection
methods, which gives the boundaries of the text. The image is cropped to these boundaries, contrast enhancement is
performed if needed, and the image is then resized.
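The paper does not name the edge detector. As an illustration only, the sketch below uses the Sobel operator (an assumed choice) to build a binary edge map and then crops the image to the bounding box of the detected edges. The threshold value and function names are hypothetical.

```python
def sobel_edges(image, threshold=100):
    """Return a binary edge map (1 = edge) using the Sobel operator."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    # The outermost rows and columns are skipped because the 3x3 kernel needs neighbours.
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = (image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r][c - 1] - image[r + 1][c - 1])
            gy = (image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r - 1][c] - image[r - 1][c + 1])
            # |gx| + |gy| is a common cheap approximation of the gradient magnitude.
            if abs(gx) + abs(gy) > threshold:
                edges[r][c] = 1
    return edges


def crop_to_edges(image, edges):
    """Crop the image to the bounding box of all edge pixels."""
    rows = [r for r, row in enumerate(edges) if any(row)]
    cols = [c for c in range(len(edges[0])) if any(row[c] for row in edges)]
    if not rows:
        return image  # no edges found: leave the image unchanged
    return [row[cols[0]:cols[-1] + 1] for row in image[rows[0]:rows[-1] + 1]]
```

Calling `crop_to_edges(image, sobel_edges(image))` on a page with a white margin discards the margin and keeps the region in which intensity changes were found.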
[Figure 2 blocks: Camera, Image Acquisition, Image Enhancement, Filtering, Edge Detection, Character Segmentation,
Character Recognition, Separation of Words, GUI on PC, CC 2500 Transceiver Module, ASCII to Braille Conversion
Algorithm, Microcontroller, Solenoids, Touchpad]
2.6. Character Segmentation
Segmentation partitions the image into several components. It is an important part of practically any automated
image recognition system, because it is at this stage that the objects of interest are extracted for further processing
such as description or recognition. In practice, segmenting an image means classifying each pixel as belonging to one
of the image parts.
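The paper does not specify how characters are cut out. One common approach, assumed here purely for illustration, is a vertical projection: in a binarized text line (1 = ink, 0 = background), every run of columns that contains ink is taken as one character. The function names are hypothetical.

```python
def segment_columns(binary):
    """Return (start, end) column spans of characters; end is exclusive."""
    width = len(binary[0])
    # A column belongs to a character if any pixel in it is ink.
    has_ink = [any(row[c] for row in binary) for c in range(width)]

    spans = []
    start = None
    for c, ink in enumerate(has_ink):
        if ink and start is None:
            start = c                      # a character begins
        elif not ink and start is not None:
            spans.append((start, c))       # a character ends at a blank column
            start = None
    if start is not None:
        spans.append((start, width))       # character touching the right border
    return spans


def extract_glyphs(binary, spans):
    """Cut the sub-image of each character out of the text line."""
    return [[row[a:b] for row in binary] for a, b in spans]
```

Each extracted sub-image can then be resized to the template size and passed on to the recognition stage.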
2.7. Character Recognition
The extracted character image is compared with the images previously stored in the system for character
recognition. We use a correlation matching technique for this purpose. The corresponding letter is played.
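Correlation matching can be sketched as follows: the two-dimensional correlation coefficient is computed between the extracted character and every stored template of the same size, and the template with the highest coefficient is chosen. The tiny 3x3 templates in the usage note are made up for the example and are not the authors' template set.

```python
from math import sqrt


def correlation(a, b):
    """Two-dimensional correlation coefficient of two equal-sized images."""
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sqrt(sum((x - mean_x) ** 2 for x in xs)
                       * sum((y - mean_y) ** 2 for y in ys))
    # A constant image has zero variance; treat it as uncorrelated.
    return numerator / denominator if denominator else 0.0


def recognize(glyph, templates):
    """Return the name of the template that correlates best with the glyph."""
    return max(templates, key=lambda name: correlation(glyph, templates[name]))
```

With hypothetical templates `{"I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]], "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]]}`, a slightly damaged glyph `[[1, 0, 0], [1, 0, 0], [1, 1, 0]]` is still recognized as "L", because its correlation with the "L" template is positive while its correlation with "I" is negative.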
2.8. Separation of words
Here the letters obtained are separated into words. We set a threshold value for a space: if the value obtained is
greater than the threshold it is considered a letter, otherwise a space, and thus the words are separated.
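The thresholding rule can be sketched as below. The paper does not say which quantity is compared with the threshold; this example assumes it is a per-cell measure such as the number of ink pixels, so that a nearly empty cell counts as a space. The data layout and the function name are hypothetical.

```python
def split_words(cells, threshold):
    """Group recognized letters into words.

    cells is a list of (label, value) pairs in reading order, where value is
    the measured quantity for that cell (assumed here: its ink-pixel count).
    A value above the threshold means a letter; anything else is a space.
    """
    words = []
    current = []
    for label, value in cells:
        if value > threshold:
            current.append(label)          # letter: extend the current word
        elif current:
            words.append("".join(current))  # space: close the current word
            current = []
    if current:
        words.append("".join(current))
    return words
```

For instance, `split_words([("h", 40), ("i", 22), ("", 0), ("y", 35), ("o", 30), ("u", 28)], threshold=5)` returns `["hi", "you"]`.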
2.9. Text to Speech Conversion
A text-to-speech (TTS) synthesizer starts with the words in the text, converts each word one by one into speech,
and concatenates the results. The task of a TTS system is thus a complex one that involves mimicking what human
readers do. The Windows Speech Application Programming Interface (SAPI) is used here.
3. SOFTWARE IMPLEMENTATION
The whole system is implemented in the MATLAB environment. The image quality must be reasonably good to obtain
good output. Speech is produced through the Windows Speech Application Programming Interface (SAPI), an API developed
by Microsoft to allow the use of speech recognition and speech synthesis within Windows applications. Third-party
companies can produce their own speech recognition and text-to-speech engines, or adapt existing engines, to work with
SAPI. Here we use the default sampling frequency of 16000 Hz. The speaking speed can be set between -10 and +10; the
normal speed is zero. Thus the text is converted to speech. The proposed conversion of text to Braille script can be
carried out through the GUI.
3.1. Simulation Windows
Figure 3: Window to select the mode
Figure 4: Window to preview the image
Figure 5: Window to capture the image
Figure 6: Window to process the image
Here the captured image is processed, and the text is converted to speech by the TTS synthesizer.
4. CONCLUSION
The device is a considerable improvement over currently available text-to-speech devices. In particular, it is
easy to use, with little or no training required in most situations. The speech rate can be set so that all users can
hear the output clearly. Trainers can easily teach blind and deaf people, who can thus pursue their studies more
easily. The conversion of text to Braille script can be implemented using solenoids. With slight modification, the
system can be used by speech-impaired people to communicate over the telephone.