OCR Based Speech Synthesis
Bharat Thakur
Electrical & Electronics Engineering
Panjab University, Chandigarh
Bharat.puchd@gmail.com
Introduction
• Speech is a more efficient and effective mode of communication than text. This project
work discusses an OCR Based Speech Synthesis System built using LabVIEW 2013.
• The OCR application is developed with IMAQ Vision for LabVIEW, a software development tool,
and it uses the digital camera of any Android phone as the image acquisition device.
• The whole project can be divided into 2 parts:
1. Optical character recognition
2. Text to speech conversion
Components of OCR System
• The identity of each symbol is found by comparing the extracted features with descriptions of
the symbol classes obtained through a previous learning phase.
• Finally, contextual information is used to reconstruct the words and numbers of the original
text.
Fig 20. Components of OCR system
System Analysis
An OCR based Speech Synthesis System is a computer-based system that reads text and gives
voice output when the text is scanned and submitted to an Optical Character Recognition
(OCR) system.
Hardware Required:
1. Camera
2. Laptop
3. Speaker
Software Platform:
1. NI LabVIEW
2. NI Vision Assistant
Image Acquisition
• The image is captured using the digital camera of a Redmi Note 3 Android phone.
• The images are transmitted wirelessly to the processor by an Android app named “IP Webcam”,
which streams them over Internet Protocol. The stream is accessed using the IP address shown inside the app.
Fig 21. Block Diagram & Front Panel for Image Acquisition
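The acquisition step can be sketched outside LabVIEW as a plain HTTP request to the phone. This is only an illustration, not the block diagram shown in the figure: the port 8080, the `/shot.jpg` path, and the example address are assumptions about the app's default configuration, and the function names are hypothetical.

```python
from urllib.request import urlopen


def snapshot_url(host, port=8080):
    """Build the URL of a single still frame.

    Assumption: the app serves one JPEG frame at /shot.jpg on port 8080.
    Check the address displayed inside the app before relying on this.
    """
    return f"http://{host}:{port}/shot.jpg"


def fetch_snapshot(host, port=8080, timeout=5.0):
    """Download one frame from the phone and return the raw JPEG bytes."""
    with urlopen(snapshot_url(host, port), timeout=timeout) as response:
        return response.read()


# Hypothetical address; use the one shown on the phone's screen.
print(snapshot_url("192.168.1.5"))
```

Calling `fetch_snapshot` requires the phone and the laptop to be on the same network.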
Binarization
Binarization is the process of converting a grayscale image (pixel values 0 to 255) into a binary image
(pixel values 0 or 1) using a threshold value of 175. Pixels lighter than the threshold are turned white and the
remainder are turned black.
Fig 22. Binarization
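A minimal sketch of this thresholding rule in plain Python, assuming the image is given as a list of rows of grayscale values. The function name `binarize` and the sample image are illustrative. The project itself performs this step inside NI Vision Assistant.

```python
THRESHOLD = 175  # threshold value stated above


def binarize(gray, threshold=THRESHOLD):
    """Map each grayscale pixel (0-255) to 1 (white) if it is lighter
    than the threshold, otherwise to 0 (black)."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in gray]


# Tiny made-up image: the top row is dark, the bottom row is light.
image = [
    [0, 120, 175],
    [176, 200, 255],
]
binary = binarize(image)
print(binary)  # [[0, 0, 0], [1, 1, 1]]
```

A pixel exactly equal to 175 is not lighter than the threshold, so it falls into the "remainder" and becomes black.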
Template Matching
In template matching, the written words in the image are segmented into characters, which are then
compared against a character set file with the extension .abc.
This character set file is created using NI Vision Assistant itself.
After the character-by-character segmentation, we store each character image in a structure.
Each character then has to be identified against the pre-defined character set.
Fig 23. Template Matching
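The comparison idea can be illustrated with a nearest-template classifier over small binary bitmaps. This is a swapped-in technique for illustration only: it uses a simple pixel-difference (Hamming) distance and does not reproduce how NI Vision Assistant matches against the .abc file. The 3x3 templates and all names are made up.

```python
def hamming(a, b):
    """Count the pixels at which two equally sized binary bitmaps differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def classify(glyph, templates):
    """Return the label of the template that differs least from the glyph."""
    return min(templates, key=lambda label: hamming(glyph, templates[label]))


# Hypothetical 3x3 character set standing in for the trained character file.
TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}

# A segmented "L" with one pixel lost to noise.
noisy_l = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]
print(classify(noisy_l, TEMPLATES))  # L
```

The noisy glyph differs from the "L" template in one pixel and from the other two templates in five, so it is still identified as "L".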
Recognition
Fig 24. Recognition
Text to Speech Synthesis
• Speech synthesis is the artificial production of human speech.
• A computer system used for this purpose is called a speech computer or speech synthesizer.
• In the text to speech module, the text recognized by the OCR system is the input to the speech synthesis
system, which converts it into speech that can be heard through an earphone connected to
the laptop or through the built-in speakers.
• ActiveX is the general name for a set of Microsoft technologies that allows you to reuse code.
Assemblies are implemented in 3 steps:
1. Constructor
2. Property Node
3. Invoke Node
Text to Speech Code
Text to Speak
The input given to the invoke node “Speak” in the last step is the text that gets converted to speech
and is available as output from the speakers of the laptop.
Image → OCR → Text to Speech → Speakers
Fig 25. Text to Speech Code
Final Code
Fig 26. Final Code
Fig 27. Steps inside the Vision Assistant
Results and Discussions
• Experiments suggest that the system detects text with a fairly high degree of
accuracy (75-80%). However, the performance of the system depends heavily on the font size,
which is under investigation.
Fig 28. Front Panel for final code
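The slides do not state how the accuracy figure was measured. One plausible way to compute a character-level accuracy, shown here purely as an assumption, is to count how many characters of the reference text are matched in order by the recognized text. The function name and sample strings are illustrative.

```python
from difflib import SequenceMatcher


def char_accuracy(reference, recognized):
    """Fraction of reference characters that are matched, in order,
    by the recognized text. Returns 0.0 for an empty reference."""
    if not reference:
        return 0.0
    matcher = SequenceMatcher(None, reference, recognized, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(reference)


# Made-up example: the letter O is misread as the digit 0.
print(char_accuracy("HELLO", "HELL0"))  # 0.8
```

Running this over a set of test images with different font sizes would give one way to quantify the font-size dependence mentioned above.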
Future Prospects
The OCR based Speech Synthesis system using LabVIEW is an efficient program that gives good results for specific fonts
and font sizes, but there is room for improvement.
Future Prospects:
• Multi-lingual support
• Educational purposes
• Translator
• Volume options
• Omni-font recognition
• Font sizes
THANK YOU
