This project covers the design and implementation of a smart hybrid system for street sign board recognition, with text and speech conversion through character extraction and symbol matching. The default language used to pronounce signs on street boards is English. We propose a novel method to convert the identified characters or symbols into multiple languages such as Hindi, Marathi, and Urdu. The project is helpful to many groups, from the visually impaired to tourists, illiterate users, and travellers in general. The system provides speech pronunciation in different languages along with an on-screen display. It takes a multidisciplinary approach spanning computer vision, speech processing, and the Google Cloud Platform: computer vision is used for character and symbol extraction from sign boards, speech processing for text-to-speech conversion, and GCP for translating the extracted text into multiple languages. Further programming handles real-time pronunciation and display of the desired output.
IRJET- Voice Assisted Text Reading and Google Home Smart Socket Control Syste...IRJET Journal
This document describes a system that aims to help visually impaired persons by allowing them to read text using voice assistance and control smart home devices with voice commands using a Raspberry Pi. The system uses optical character recognition (Tesseract OCR) to extract text from images, which is then converted to speech using e-Speak. It also enables control of smart sockets using Google Home voice recognition on the Raspberry Pi. The goal is to help blind people read text independently and control electrical devices safely using only their voice.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Design of a Communication System using Sign Language aid for Differently Able...IRJET Journal
This document describes a proposed system to design a communication system using sign language to aid differently abled people. The system aims to use image processing and artificial intelligence techniques to recognize characters in sign language from video input and convert them to text and speech output. It discusses technologies like blob detection, skin color recognition and template matching that would be used for sign recognition. The system is intended to help deaf and mute people communicate by translating their sign language to a format understandable by others.
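The template-matching technique mentioned above can be illustrated with a minimal, pure-Python sketch (the binary image, template, and `match_template` helper below are illustrative assumptions, not the paper's implementation): a small 0/1 template is slid over a 0/1 image, and the offset with the highest pixel agreement wins.

```python
# Minimal sketch of binary template matching: slide a small 0/1 template
# over a 0/1 image and report the offset with the highest pixel agreement.
def match_template(image, template):
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_score, best_pos = -1, None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            # Count how many template cells agree with the image window.
            score = sum(
                1
                for dy in range(th)
                for dx in range(tw)
                if image[y + dy][x + dx] == template[dy][dx]
            )
            if score > best_score:
                best_score, best_pos = score, (y, x)
    # Return the best offset and the fraction of agreeing cells.
    return best_pos, best_score / (th * tw)

image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
template = [
    [1, 1],
    [1, 0],
]
pos, confidence = match_template(image, template)
print(pos, confidence)  # (1, 1) 1.0 - a perfect match at row 1, column 1
```

Real systems typically use normalized correlation over grayscale windows, but the sliding-window idea is the same.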
IRJET - Sign Language Recognition using Neural NetworkIRJET Journal
This document presents a system for sign language recognition using neural networks. The system aims to recognize hand gestures in real-time and translate them into English words or sentences. It uses a convolutional neural network (CNN) algorithm to extract features from captured images of hand gestures and classify the gestures. The system was able to accurately recognize gestures with a classification rate of 92.4%. The system could help mute individuals communicate through translating sign language into text that could be read or understood by others. It may also assist blind individuals by allowing communication through speech recognition of the translated text.
Speech Automated Examination for Visually Impaired Studentsvivatechijri
We know that education is not a luxury but a necessity in today's life. It is a human right, and every individual, including blind and visually impaired people, should receive it. Managing their education and learning process is very difficult for them. According to a recent study, there are over 200 million blind and visually impaired people, and the number is growing. We have therefore developed a learning application in which a blind or visually impaired student can study on their own, without involving a third party. The student can operate the application independently: study various subjects of their choice, take their own notes, and take tests on each subject. The prime goal of this application is to solve the problems blind students face in their education and study habits. The student can move the cursor anywhere on the desktop, and the text under it is dictated in a human-like synthesized AI voice using text-to-speech (TTS) conversion. Notes can be made simply by dictating points to the system, which converts them into text format; the notes can also be played back as audio. Once the learning part is done and the student needs to revise a particular topic, he or she can take a test on that topic using hand gestures. The system thus helps visually impaired and blind students learn and operate a computer independently, at whatever time suits them. The system is developed to free students from dependence on third parties in their studies and to increase their interest in studying. Not only blind people but also people with other disabilities will be able to use this application.
HCI BASED APPLICATION FOR PLAYING COMPUTER GAMES | J4RV4I1014Journal For Research
This paper describes a command interface for games based on hand gestures and voice commands defined by posture, movement, and location. The system uses computer vision and requires no sensors or markers on the user. For voice commands, a speech recognizer interprets the user's input, stores it, and passes the command to the game, where the corresponding action takes place. We propose a simple architecture for real-time colour detection and motion tracking using a webcam: the motion of the specified colours is tracked, and the resulting actions are given as input commands to the system. Blue is used for motion tracking and green for the mouse pointer. Speech recognition is the process of automatically recognizing words spoken by a particular speaker based on individual information contained in the speech waves. This application reduces hardware requirements and can also be implemented in other electronic devices.
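The colour-detection step can be sketched with the standard library's `colorsys` module (the `colour_mask` helper and the hue band below are illustrative assumptions, not the paper's code): each RGB pixel is converted to HSV, and pixels whose hue falls within a band around the target colour are kept, e.g. blue at hue around 2/3.

```python
import colorsys

# Hedged sketch of colour-based detection: mark pixels whose HSV hue falls
# inside a target band (here, a band around pure blue, hue ~ 2/3) and
# whose saturation is high enough to exclude washed-out colours.
def colour_mask(pixels, hue_lo, hue_hi, min_sat=0.5):
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            mask_row.append(hue_lo <= h <= hue_hi and s >= min_sat)
        mask.append(mask_row)
    return mask

frame = [
    [(255, 0, 0), (0, 0, 255)],    # red, pure blue
    [(0, 255, 0), (10, 10, 250)],  # green, bluish
]
mask = colour_mask(frame, hue_lo=0.55, hue_hi=0.75)
print(mask)  # [[False, True], [False, True]] - only the blue pixels pass
```

A tracker would then compute the centroid of the `True` pixels in each frame and follow it over time.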
Designing the Workflow of a Language Interpretation Device Using Artificial I...IOSR Journals
This document describes a proposed design for a language interpretation device that can automatically interpret between multiple languages in real time. It discusses two stages: a training stage, where the device's central processing unit is trained on speech recognition and language patterns, and a working stage, where it learns from interactions and improves over time. The key components described are a CPU, memory, voice synthesizer, display, microphone, and speaker. An algorithm is proposed for the workflow, including recognizing the speaker's voice, applying speech recognition, matching against stored language patterns, and generating the translated output. Future work could involve implementing human-robot interaction capabilities to make the device more user-friendly. The goal is to develop a device that can interpret between languages simultaneously rather than sequentially.
1) The document discusses a sign language recognition system that uses a webcam to capture images of hand gestures.
2) The captured images are processed to extract features and detect edges and peaks, which are then used as input for a classification algorithm to recognize the gestures.
3) The goal is to design an accurate system that allows deaf and mute people to communicate with others without an interpreter by generating speech or text from their hand signs.
This document provides information about Braille, including its history and key inventor, Louis Braille. It describes the Braille system and cells, and various technologies that have been developed to help blind and visually impaired people read and write using Braille, including Braille displays, electronic notetakers, keyboards, software, printers, and other aids. It also discusses organizations in the UAE that support the blind such as Emirates Association for the Blind, Tamkeen Training Centre, UAE National Committee for Eye Health, and Foresight. National Braille Week is also summarized.
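The six-dot Braille cell described above maps directly onto the Unicode Braille Patterns block (U+2800 to U+28FF), with one bit per dot. The small letter table below uses the standard dot patterns for the letters a through e and is only a sketch:

```python
# Sketch of how a Braille cell maps to Unicode: dots 1-6 of a cell
# correspond to bits 0x01, 0x02, 0x04, 0x08, 0x10, 0x20 added to U+2800.
DOT_BITS = {1: 0x01, 2: 0x02, 3: 0x04, 4: 0x08, 5: 0x10, 6: 0x20}

def braille_cell(dots):
    """Return the Unicode Braille character for a set of raised dots."""
    code = 0x2800
    for dot in dots:
        code |= DOT_BITS[dot]
    return chr(code)

# Standard dot patterns for the first few letters of the alphabet.
letters = {"a": [1], "b": [1, 2], "c": [1, 4], "d": [1, 4, 5], "e": [1, 5]}
for letter, dots in letters.items():
    print(letter, braille_cell(dots))
# a ⠁   b ⠃   c ⠉   d ⠙   e ⠑
```

This bit layout is what refreshable Braille displays and Braille translation software use to exchange cell data.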
Implementation of humanoid robot with using the concept of synthetic braineSAT Journals
Abstract: This paper elaborates a model of a humanoid robot that interacts with human beings and performs various operations according to the commands it is given. A humanoid robot with a synthetic brain is able to interact, communicate, detect objects, acquire information about any object, respond to voice commands, and chat logically with human beings. Object detection is performed using an image-processing concept (the Haar technique). To make the system intelligent, so that whenever it interacts, communicates, or chats with a human it gives proper responses and answers, it integrates artificial intelligence, DFA/NFA automata, and Prolog language concepts for answering logically over complex and relevant strings or data. Keywords: Humanoid Robotics, Artificial Intelligence, Image Processing, Audio Filtering.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching, and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars, and students of related fields of Engineering and Technology.
This document discusses sign language recognition and various technologies used for it. It introduces image processing, sign language, and hand gesture recognition. It then discusses popular deep learning frameworks like TensorFlow and OpenCV. Literature on existing sign language recognition systems is reviewed along with the proposed system architecture. The organization of the thesis is outlined covering introduction, literature survey, methodology, design, experiments, and conclusion.
Speech Based Search Engine System Control and User InteractionIOSR Journals
The document discusses a proposed speech-based search engine system to help visually impaired users interact with computers without keyboards or mice. The system would allow users to control the computer and search for information using only voice commands through speech recognition and synthesis technologies. It aims to make computers more accessible for visually impaired people and others who have difficulty using keyboards and mice. The proposed system could provide educational benefits and allow independent computer use through spoken interactions. A feasibility analysis found the system would be economically and technically feasible to implement using existing hardware, software and open-source technologies.
This document discusses developments in voice recognition technology. It begins by introducing voice recognition software and its goal of allowing users to efficiently control computers through speech. It then outlines the objectives of the paper, which are to explain the importance of voice recognition, detail research and development, discuss existing problems and solutions, and analyze the impact on engineering and society. The document proceeds to describe how voice recognition systems work, including acoustic and language models. It discusses applications and importance in learning, consumer, corporate, and government uses. Finally, it outlines current flaws in voice recognition software and discusses improvement solutions being developed.
A Translation Device for the Vision Based Sign Languageijsrd.com
Sign language is very important for people with hearing and speaking deficiencies, generally called deaf and mute. It is their only mode of communication for conveying messages, so it is important for others to understand their language. This paper proposes a method, and the corresponding algorithm, for an application that helps recognize the different signs of Indian Sign Language. The images are of the palm side of the right and left hand and are loaded at runtime. The method has been developed for a single user. Real-time images are captured and stored in a directory, and feature extraction is performed on the most recently captured image to identify which sign the user articulated, using the SIFT (Scale-Invariant Feature Transform) algorithm. Comparisons are then performed, and the result is produced from the key points of the input image matched against the image stored for a specific letter in the directory or database; the outputs can be seen in the sections below. There are 26 signs in Indian Sign Language, one for each alphabet letter, of which the proposed algorithm produced 95% accurate results for 9 letters, with images captured at every possible angle and distance.
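The key-point comparison step can be sketched with Lowe's nearest-neighbour ratio test, which is commonly paired with SIFT descriptors (the toy 2-D "descriptors" and the `match_descriptors` helper below are illustrative assumptions, not the paper's code):

```python
import math

# Sketch of descriptor matching as used with SIFT key points: for each query
# descriptor, find its two nearest stored descriptors and accept the match
# only if the nearest is clearly closer than the second nearest (ratio test).
def match_descriptors(query, stored, ratio=0.75):
    matches = []
    for qi, q in enumerate(query):
        # Sort stored descriptors by Euclidean distance to this query.
        dists = sorted(
            (math.dist(q, s), si) for si, s in enumerate(stored)
        )
        (d1, s1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:  # unambiguous nearest neighbour -> keep it
            matches.append((qi, s1))
    return matches

stored = [(0.0, 0.0), (10.0, 10.0), (0.1, 0.2)]  # toy 2-D "descriptors"
query = [(0.05, 0.0), (5.0, 5.0)]
print(match_descriptors(query, stored))  # [(0, 0)] - the ambiguous point is rejected
```

Real SIFT descriptors are 128-dimensional, but the ratio test over them works exactly the same way; the recognized letter is then taken from whichever stored image accumulates the most accepted matches.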
IRJET - A Robust Sign Language and Hand Gesture Recognition System using Conv...IRJET Journal
This document presents a robust sign language and hand gesture recognition system using convolutional neural networks. The system captures image frames and processes them through various neural network layers to classify hand gestures into letters, numbers, or other symbols. It segments the hand from images using color thresholds and edge detection. The images then undergo preprocessing like resizing before being classified by the CNN. The CNN is trained on a large dataset to accurately recognize gestures in sign language and provide text output to bridge communication between deaf and non-signing individuals. The system achieved good results classifying several alphabet letters but could be expanded to recognize word combinations.
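The edge-detection part of the segmentation stage can be sketched with a horizontal Sobel filter in pure Python (the `sobel_x` helper and the toy image below are illustrative, not the paper's pipeline):

```python
# Sketch of the edge-detection step: horizontal Sobel gradient on a
# grayscale image stored as a list of rows (border pixels left at 0).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

def sobel_x(image):
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Convolve the 3x3 Sobel kernel with the neighbourhood.
            out[y][x] = sum(
                SOBEL_X[dy + 1][dx + 1] * image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
    return out

# A vertical edge: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
gx = sobel_x(image)
print(gx[1])  # [0, 4, 4, 0, 0] - the gradient peaks at the edge
```

A vertical-kernel pass and a magnitude combine complete a full edge map, which the segmentation stage then uses to outline the hand.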
Speech Recognition: Transcription and transformation of human speechSubmissionResearchpa
Speech recognition can be regarded as a subfield of computational linguistics and computer science. As an interdisciplinary concept it has produced new technologies and methodologies aimed at recognizing and translating spoken words. In recent times the field has received strong positive feedback thanks to deep learning for voice recognition, and this is evidence of growing market demand for voice-recognition applications. The deployment of speech recognition systems, and the analysis methods behind them, is helping to shape every individual's future. The computer plays an important role in this process, since all translated words can also be acknowledged as text. Vishal Dineshkumar Soni, 2019. Speech Recognition: Transcription and transformation of human speech. International Journal on Integrated Education. 2, 6 (Dec. 2019), 257-262. DOI: https://doi.org/10.31149/ijie.v2i6.497. PDF URL: https://journals.researchparks.org/index.php/IJIE/article/view/497/478. Paper URL: https://journals.researchparks.org/index.php/IJIE/article/view/497
IRJET- Communication System for Blind, Deaf and Dumb People using Internet of...IRJET Journal
This document describes a communication system for blind, deaf, and mute people using Internet of Things technologies. The system addresses the issues of each group (blind, deaf, mute) with different techniques in a single system. It can convert images to text and speech for blind people. For deaf people, it converts speech to text. And for mute people, it converts gestures to text and speech. The system was implemented using Python and utilizes technologies like OpenCV, Tesseract OCR, and text-to-speech to enable communication across the groups. It achieved the goals of providing an independent lifestyle and a way to communicate for people with these disabilities.
The role of speech technology in biometrics, forensics and man-machine interfaceIJECEIAES
Optimism is growing day by day that in the near future our society will witness man-machine interfaces (MMI) based on voice technology. Computer manufacturers are building voice-recognition subsystems into their new product lines. Although speech-technology-based MMI techniques have been widely used before, they require deep knowledge of spoken language and of performance during machine-based interaction. Biometric recognition refers to a system that can identify individuals based on their behavioral and biological characteristics. Following the success of fingerprints in forensic science and law enforcement, and with growing concerns over border control, banking access fraud, machine access control, and IT security, there has been great interest in using fingerprints and other biological traits for automatic recognition. It is not surprising that biometric systems now play an important role in all areas of our society, with applications including smartphone security, mobile payment, international borders, national citizen registers, and reserved facilities. MMI through speech technology, which includes automated speech and speaker recognition and natural language processing, has a significant impact on all existing businesses based on personal-computer applications. With the help of powerful, affordable microprocessors and artificial-intelligence algorithms, a human being can talk to the machine to drive and control all computer-based applications. Today's applications show a small preview of a rich future for voice-based MMI, which will ultimately replace the keyboard and mouse with the microphone for easy access and make the machine more intelligent.
IRJET- Hand Gesture based Recognition using CNN MethodologyIRJET Journal
This document summarizes a research paper on hand gesture recognition using convolutional neural networks (CNN). The paper aims to develop a system to recognize American Sign Language (ASL) to help facilitate communication for deaf individuals. The system would capture hand gestures via video and translate them into text. The researchers conducted a literature review on previous work using CNNs and 3D convolutional models for sign language recognition. They intend to implement a 3D CNN model on ASL data and analyze the results to improve recognition accuracy for communicating via sign language.
Optical character recognition for Ge'ez charactershadmac
This document is a project proposal for an OCR algorithm for Ge'ez characters. It aims to build a system to identify and digitize Ethiopian documents to preserve intellectual property and make it accessible digitally. The proposed system would use OpenCV and neural networks to recognize Ge'ez fonts in scanned documents and convert them to editable text files. It would train on Ge'ez, Amharic and Tigrigna words to recognize and correct errors in characters. The methodology involves preprocessing scans, segmenting text from images, extracting character features, recognizing words, and outputting final text files. The schedule plans for data collection, requirements, design, implementation and results over 3-4 months.
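The text-segmentation step of such an OCR pipeline is often done with a horizontal projection profile; the sketch below (the `segment_lines` helper and the toy page are illustrative assumptions, not the proposal's code) counts ink pixels per row and splits the page at empty-row gaps:

```python
# Sketch of line segmentation by horizontal projection profile: count the
# ink pixels in each row of a binary page and split at empty-row gaps.
def segment_lines(binary_page):
    profile = [sum(row) for row in binary_page]
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                      # a text line begins
        elif ink == 0 and start is not None:
            lines.append((start, y - 1))   # the line ended on the row above
            start = None
    if start is not None:                  # page ends mid-line
        lines.append((start, len(profile) - 1))
    return lines

page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],  # first text line
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],  # second text line
]
print(segment_lines(page))  # [(1, 2), (4, 4)]
```

The same idea applied column-wise within each line band separates individual characters, which are then passed to the feature extractor.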
Paper Design and Simulation of Communication Aid For Disabled based on region...Syambabu Choda
This document describes the design and simulation of a communication aid for disabled people based on the Telugu regional language. It discusses how sign language is an important form of communication for disabled individuals like the deaf, dumb, and blind. The proposed system uses image processing and artificial neural networks to capture Telugu sign language gestures and convert them into text that can be displayed or spoken aloud. This allows disabled individuals to communicate both with each other using regional sign language and with non-signers by translating gestures into a readable or audible format in real-time. The system aims to help overcome communication barriers faced by the disabled.
An embedded module as “virtual tongue”ijistjournal
There are several human disabilities; among them, speech-impaired people find it difficult to communicate with others, yet it is very important for them to convey their messages without speech. In this paper, to make them self-reliant and independent, an embedded handheld icon-based assistive device, a "Virtual Tongue" for the voiceless, is proposed using embedded-systems technology. It speaks for severely speech-disordered people when they press the appropriate icons to express their needs. The proposed module comprises a microcontroller-based player for voice messages, a Secure Digital (SD) card reader, a Universal Serial Bus (USB) port, an icon-based remote keypad, an audio amplifier, and a speaker, and offers benefits such as portability, reliability, user-friendliness, affordable cost, low power consumption, and speech in a regional language with good clarity. The proposed system is designed to produce speech of any length, audible to those nearby, on request from the user pressing the icons, and thereby serves inarticulate people. An extended version with a text-to-voice feature, in which any text fed through a keyboard can be converted into speech by adding a circuit, is also discussed.
This document provides an introduction to character recognition and optical character recognition (OCR). It discusses the purpose and history of OCR, including early technologies from the 1910s-1930s. It also covers the scope, technology used, and how to use OCR software. Finally, it discusses the feasibility study for an OCR project, including technical, operational, and economic feasibility. The overall purpose is to develop an efficient OCR software system to convert paper documents to electronic format for improved document processing and searchability.
This document summarizes a research paper that presents an optical character recognition system for isolated Arabic characters using a DSP-based hardware implementation. The system uses a fuzzy ART neural network for character recognition after extracting features from segmented characters. Testing achieved a 95% recognition rate on 700 sample characters from various fonts and sizes. The system was implemented on a TI TMS320C6416T digital signal processor for high-speed performance with small size and low power consumption. Future work could expand the system to recognize connected characters and directly read images rather than relying on pre-processing in MATLAB.
Speech recognition: history of speech recognition, what speech recognition is, voice recognition software, advantages and disadvantages of speech recognition, voice recognition in operating systems, and types of speech recognition.
Handwritten Text Recognition and Digital Text Conversionijtsrd
It is often extremely difficult to secure handwritten documents in the real world: documents can be misplaced, cannot be accessed from anywhere, and are subject to physical damage. To keep the information secure and address these problems, we convert it into digital format. The main aim of our application is to recognize handwritten text and display it as digital text. Image processing is a very significant process for data analysis these days: visible text from the real world, taken as input, must be processed precisely to produce the same information as output with accuracy, which requires the system to recognize the text in the image correctly. The proposed system aims at achieving these results. The process works as follows: an image containing handwritten text is fed to the system and passed into a neural network, which recognizes the handwritten text and displays it in digital-text form. This can be used for many purposes, such as copying the digital text for use elsewhere, producing formal documents, and serving as input for further data processing. In this way, information can be stored securely, accessed from anywhere at any time, and protected from physical damage, since it is in digital format. Mr. B. Ravinder Reddy | J. Nandini | P. Sowmya | Y. Sathwik, "Handwritten Text Recognition and Digital Text Conversion", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3, Issue-3, April 2019, URL: https://www.ijtsrd.com/papers/ijtsrd23508.pdf
Paper URL: https://www.ijtsrd.com/computer-science/data-processing/23508/handwritten-text-recognition-and-digital-text-conversion/mr-b-ravinder-reddy
Voice input and speech recognition system in tourism/social media (cidroypaes)
Voice input devices allow users to input data or commands using speech instead of other input methods like keyboards. Some voice input devices recognize words from a predefined vocabulary while others need to be trained for a specific speaker. When a word is spoken, the matching input is displayed on screen for verification.
Speech recognition is the process of converting spoken language to text using computer programs. It draws from linguistics, computer science, and electrical engineering. Applications include voice assistants, dictation software, call routing, and more. Accuracy depends on factors like vocabulary size, presence of similar sounding words, whether the system is designed for one speaker or many, and whether speech is isolated, connected or continuous.
The document describes a proposed voice-based email system for blind users. The system would use speech recognition to allow users to compose and send emails solely through voice commands. It would also use text-to-speech to read incoming emails aloud. The system aims to make email more accessible for blind and visually impaired users by eliminating the need to use keyboards. It could also help illiterate users. The document outlines the objectives, modules, algorithms, and technologies used in the proposed system, such as speech-to-text, text-to-speech, and interactive voice response.
Text Detection and Recognition with Speech Output for Visually Challenged Per... (IJERA Editor)
The document reviews existing systems that aim to assist visually impaired persons by detecting and recognizing text from images and converting it to speech. It discusses how optical character recognition and text-to-speech technologies have been used to develop applications like newspaper reading systems, signage recognition systems, and camera-based text reading systems. The document also summarizes various text detection and recognition methods that have been used, such as gradient feature-based, color segmentation-based, texture feature-based, and layout analysis-based approaches.
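Among the text detection methods the review lists, the gradient feature-based approach is the simplest to illustrate: pixels with strong intensity gradients are flagged as text candidates. The sketch below is a minimal, hypothetical version using NumPy only; the threshold value is an assumed tuning parameter, not one taken from the reviewed systems.

```python
import numpy as np

def gradient_text_mask(gray, thresh=50.0):
    """Flag pixels with strong intensity gradients as text candidates.

    gray: 2-D float array of pixel intensities (0-255).
    thresh: gradient-magnitude cutoff (an assumed tuning value).
    """
    gy, gx = np.gradient(gray.astype(float))   # per-axis intensity gradients
    mag = np.hypot(gx, gy)                     # gradient magnitude
    return mag > thresh                        # boolean candidate mask

# Toy example: a flat background with one sharp vertical stroke at column 4.
img = np.zeros((8, 8))
img[:, 4] = 255.0
mask = gradient_text_mask(img)
```

On this toy image, the mask lights up on the columns adjacent to the stroke, where the central-difference gradient is largest. A real pipeline would follow this with morphological grouping of candidate pixels into text regions.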
This document provides information about Braille, including its history and key inventor, Louis Braille. It describes the Braille system and cells, and various technologies that have been developed to help blind and visually impaired people read and write using Braille, including Braille displays, electronic notetakers, keyboards, software, printers, and other aids. It also discusses organizations in the UAE that support the blind such as Emirates Association for the Blind, Tamkeen Training Centre, UAE National Committee for Eye Health, and Foresight. National Braille Week is also summarized.
Implementation of humanoid robot with using the concept of synthetic brain (eSAT Journals)
Abstract: This paper elaborates a model of a humanoid robot that interacts with human beings and performs various operations as per the commands given to it. A humanoid robot with a synthetic brain is able to carry out interaction, communication, object detection, information acquisition about any object, response to voice commands, and logical chatting with human beings. Object detection is performed by the robot using an image-processing concept (the HAAR technique). To make the system intelligent, so that whenever it interacts, communicates, or chats with a human it gives proper responses and question/answer exchanges, it integrates artificial intelligence, DFA/NFA automata, and Prolog-language concepts for answering logically over complex and relevant strings or data. Keywords: Humanoid Robotics, Artificial Intelligence, Image Processing, Audio Filtering.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
This document discusses sign language recognition and various technologies used for it. It introduces image processing, sign language, and hand gesture recognition. It then discusses popular deep learning frameworks like TensorFlow and OpenCV. Literature on existing sign language recognition systems is reviewed along with the proposed system architecture. The organization of the thesis is outlined covering introduction, literature survey, methodology, design, experiments, and conclusion.
Speech Based Search Engine System Control and User Interaction (IOSR Journals)
The document discusses a proposed speech-based search engine system to help visually impaired users interact with computers without keyboards or mice. The system would allow users to control the computer and search for information using only voice commands through speech recognition and synthesis technologies. It aims to make computers more accessible for visually impaired people and others who have difficulty using keyboards and mice. The proposed system could provide educational benefits and allow independent computer use through spoken interactions. A feasibility analysis found the system would be economically and technically feasible to implement using existing hardware, software and open-source technologies.
This document discusses developments in voice recognition technology. It begins by introducing voice recognition software and its goal of allowing users to efficiently control computers through speech. It then outlines the objectives of the paper, which are to explain the importance of voice recognition, detail research and development, discuss existing problems and solutions, and analyze the impact on engineering and society. The document proceeds to describe how voice recognition systems work, including acoustic and language models. It discusses applications and importance in learning, consumer, corporate, and government uses. Finally, it outlines current flaws in voice recognition software and discusses improvement solutions being developed.
A Translation Device for the Vision Based Sign Language (ijsrd.com)
Sign language is very important for people who have hearing and speaking deficiencies, generally called deaf and mute. It is the only mode of communication for such people to convey their messages, and it is therefore very important for others to understand their language. This paper proposes a method, or algorithm, for an application which would help in recognizing the different signs of Indian Sign Language. The images are of the palm side of the right and left hand and are loaded at runtime. The method has been developed with respect to a single user. Real-time images are captured first and stored in a directory; feature extraction then takes place on the most recently captured image to identify which sign the user has articulated, using the SIFT (scale-invariant feature transform) algorithm. The comparisons are then performed, and the result is produced according to the matched key points between the input image and the image stored for a specific letter in the directory or database; the outputs can be seen in the sections below. There are 26 signs in Indian Sign Language, one for each alphabet letter, of which the proposed algorithm gave 95% accurate results for 9 letters, with their images captured at every possible angle and distance.
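The key-point comparison step above amounts to nearest-neighbour matching of feature descriptors. A minimal sketch, assuming generic descriptor vectors rather than actual SIFT output, is Lowe's ratio test in NumPy: a query descriptor is matched only when its closest reference descriptor is clearly closer than the second closest.

```python
import numpy as np

def match_descriptors(query, reference, ratio=0.75):
    """Match query descriptors to reference descriptors.

    A match is kept only when the nearest reference descriptor is
    clearly closer than the second nearest (Lowe's ratio test).
    Returns a list of (query_index, reference_index) pairs.
    """
    matches = []
    for qi, q in enumerate(query):
        dists = np.linalg.norm(reference - q, axis=1)  # distance to each reference
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:        # keep unambiguous matches only
            matches.append((qi, int(best)))
    return matches

# Toy 2-D "descriptors": the first query vector has one clear match,
# the second sits halfway between two references and is rejected.
ref = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
qry = np.array([[0.1, 0.0], [15.0, 5.0]])
print(match_descriptors(qry, ref))  # → [(0, 0)]
```

A sign would then be declared recognized when the number of surviving matches against a stored letter image exceeds some threshold.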
IRJET - A Robust Sign Language and Hand Gesture Recognition System using Conv... (IRJET Journal)
This document presents a robust sign language and hand gesture recognition system using convolutional neural networks. The system captures image frames and processes them through various neural network layers to classify hand gestures into letters, numbers, or other symbols. It segments the hand from images using color thresholds and edge detection. The images then undergo preprocessing like resizing before being classified by the CNN. The CNN is trained on a large dataset to accurately recognize gestures in sign language and provide text output to bridge communication between deaf and non-signing individuals. The system achieved good results classifying several alphabet letters but could be expanded to recognize word combinations.
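The hand-segmentation step described above relies on colour thresholds. A minimal sketch of that idea, with illustrative HSV-style ranges that are assumptions rather than the paper's tuned values, looks like this in NumPy:

```python
import numpy as np

def skin_mask(hsv, h_range=(0, 20), s_min=48, v_min=80):
    """Return a boolean mask of likely skin pixels.

    hsv: array of shape (H, W, 3) with hue/saturation/value channels.
    The threshold ranges are illustrative assumptions, not tuned values.
    """
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    return (
        (h >= h_range[0]) & (h <= h_range[1])
        & (s >= s_min) & (v >= v_min)
    )

# Toy 1x2 image: one skin-like pixel, one background pixel.
img = np.array([[[10, 120, 200], [100, 30, 40]]])
print(skin_mask(img).tolist())  # → [[True, False]]
```

The resulting mask isolates the hand region, which is then resized and passed to the CNN classifier.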
Speech Recognition: Transcription and transformation of human speech (SubmissionResearchpa)
Speech recognition can be said to be a subfield of computational linguistics and computer science, an interdisciplinary area that develops new technologies and methodologies for recognizing and translating words that are spoken. More precisely, in recent times the field has gained momentum through deep-learning approaches to voice recognition, and this progress is reflected in growing market demand for applications of voice recognition. The deployment of speech recognition systems, and the analysis methods behind them, can help shape each individual's future interaction with technology. The computer plays the central role in this process, since it is what allows all translated words to be acknowledged as text as well. Vishal Dineshkumar Soni, 2019. Speech Recognition: Transcription and transformation of human speech. International Journal on Integrated Education, 2, 6 (Dec. 2019), 257-262. DOI: https://doi.org/10.31149/ijie.v2i6.497. PDF URL: https://journals.researchparks.org/index.php/IJIE/article/view/497/478. Paper URL: https://journals.researchparks.org/index.php/IJIE/article/view/497
IRJET- Communication System for Blind, Deaf and Dumb People using Internet of... (IRJET Journal)
This document describes a communication system for blind, deaf, and mute people using Internet of Things technologies. The system addresses the issues of each group (blind, deaf, mute) with different techniques in a single system. It can convert images to text and speech for blind people. For deaf people, it converts speech to text. And for mute people, it converts gestures to text and speech. The system was implemented using Python and utilizes technologies like OpenCV, Tesseract OCR, and text-to-speech to enable communication across the groups. It achieved the goals of providing an independent lifestyle and a way to communicate for people with these disabilities.
The role of speech technology in biometrics, forensics and man-machine interface (IJECEIAES)
Day by day, optimism is growing that in the near future our society will witness man-machine interfaces (MMI) based on voice technology. Computer manufacturers are building voice recognition sub-systems into their new product lines. Although speech-technology-based MMI techniques have been widely used before, they still need to gather and apply deep knowledge of spoken language and its performance during electronic machine-based interaction. Biometric recognition refers to a system that is able to identify individuals based on their own behavior and biological characteristics. Following the success of fingerprints in forensic science and law enforcement applications, and with growing concerns relating to border control, banking access fraud, machine access control and IT security, there has been great interest in the use of fingerprints and other biological traits for automatic recognition. It is not surprising to see biometric systems playing an important role in all areas of our society. Biometric applications include smartphone security, mobile payment, international border control, national citizen registers and reserved facilities. The use of MMI through speech technology, which includes automated speech/speaker recognition and natural language processing, has a significant impact on all existing businesses based on personal-computer applications. With the help of powerful and affordable microprocessors and artificial-intelligence algorithms, a human being can talk to the machine to drive and control all computer-based applications. Today's applications show a small preview of a rich future for MMI based on voice technology, which will ultimately replace the keyboard and mouse with the microphone for easy access and make the machine more intelligent.
IRJET- Hand Gesture based Recognition using CNN Methodology (IRJET Journal)
This document summarizes a research paper on hand gesture recognition using convolutional neural networks (CNN). The paper aims to develop a system to recognize American Sign Language (ASL) to help facilitate communication for deaf individuals. The system would capture hand gestures via video and translate them into text. The researchers conducted a literature review on previous work using CNNs and 3D convolutional models for sign language recognition. They intend to implement a 3D CNN model on ASL data and analyze the results to improve recognition accuracy for communicating via sign language.
Optical character recognition for Ge'ez characters (hadmac)
This document is a project proposal for an OCR algorithm for Ge'ez characters. It aims to build a system to identify and digitize Ethiopian documents to preserve intellectual property and make it accessible digitally. The proposed system would use OpenCV and neural networks to recognize Ge'ez fonts in scanned documents and convert them to editable text files. It would train on Ge'ez, Amharic and Tigrigna words to recognize and correct errors in characters. The methodology involves preprocessing scans, segmenting text from images, extracting character features, recognizing words, and outputting final text files. The schedule plans for data collection, requirements, design, implementation and results over 3-4 months.
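The "segmenting text from images" step in the methodology above is often done with projection profiles: summing ink pixels per row splits a page into text lines. Here is a minimal, hypothetical sketch of that step in NumPy, not code from the proposal itself:

```python
import numpy as np

def segment_lines(binary):
    """Split a binary page image (1 = ink) into text-line row ranges.

    Uses a horizontal projection profile: rows containing any ink
    belong to a line; runs of blank rows separate lines.
    Returns a list of (start_row, end_row) pairs, end exclusive.
    """
    profile = binary.sum(axis=1)          # ink pixels per row
    lines, start = [], None
    for r, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = r                     # a new line begins
        elif ink == 0 and start is not None:
            lines.append((start, r))      # the line just ended
            start = None
    if start is not None:
        lines.append((start, len(profile)))
    return lines

# Two "lines" of ink separated by a blank row.
page = np.array([[1, 1], [1, 0], [0, 0], [0, 1]])
print(segment_lines(page))  # → [(0, 2), (3, 4)]
```

The same idea applied column-wise within each line band separates individual Ge'ez characters before feature extraction.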
Paper Design and Simulation of Communication Aid For Disabled based on region... (Syambabu Choda)
This document describes the design and simulation of a communication aid for disabled people based on the Telugu regional language. It discusses how sign language is an important form of communication for disabled individuals like the deaf, dumb, and blind. The proposed system uses image processing and artificial neural networks to capture Telugu sign language gestures and convert them into text that can be displayed or spoken aloud. This allows disabled individuals to communicate both with each other using regional sign language and with non-signers by translating gestures into a readable or audible format in real-time. The system aims to help overcome communication barriers faced by the disabled.
An embedded module as “virtual tongue” (ijistjournal)
There are several human disabilities in nature, of which speech-impaired people find difficulty in communicating with others, and it is very important for them to convey their messages without speech. In this paper, to make them self-reliant and independent with the advent of embedded-systems technology, an embedded handheld icon-based assistive device, a “Virtual Tongue” for the voiceless, is proposed; it speaks for severely speech-disordered people when they simply press the appropriate icons to fulfill their needs. The proposed module comprises a microcontroller-based player to play voice messages, a Secure Digital (SD) card reader, a Universal Serial Bus (USB) port, an icon-based remote keypad, an audio amplifier and a speaker, along with benefits such as portability, reliability, user-friendliness, affordable cost, low power consumption and, of course, speech in a regional language with good clarity. The proposed system is designed to produce speech of any length, audible to those nearby, on request from the user pressing the icons, and thereby serves inarticulate people. An extended version with a feature for converting text into voice by adding a circuit, with which any text fed through a keyboard can be converted into speech, is also discussed.
This document provides an introduction to character recognition and optical character recognition (OCR). It discusses the purpose and history of OCR, including early technologies from the 1910s-1930s. It also covers the scope, technology used, and how to use OCR software. Finally, it discusses the feasibility study for an OCR project, including technical, operational, and economic feasibility. The overall purpose is to develop an efficient OCR software system to convert paper documents to electronic format for improved document processing and searchability.
This document summarizes a research paper that presents an optical character recognition system for isolated Arabic characters using a DSP-based hardware implementation. The system uses a fuzzy ART neural network for character recognition after extracting features from segmented characters. Testing achieved a 95% recognition rate on 700 sample characters from various fonts and sizes. The system was implemented on a TI TMS320C6416T digital signal processor for high-speed performance with small size and low power consumption. Future work could expand the system to recognize connected characters and directly read images rather than relying on pre-processing in MATLAB.
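The fuzzy ART network used for classification above can be sketched compactly. The version below is a minimal fast-learning variant (it creates a new category on the first vigilance failure instead of re-searching the remaining categories, which a full implementation would do); parameter values are illustrative, not the paper's.

```python
import numpy as np

def fuzzy_art(samples, rho=0.8, alpha=0.001):
    """Cluster inputs in [0, 1]^d with a minimal fast-learning fuzzy ART.

    Complement-codes each input, picks the category maximising the
    choice function, accepts it if it passes the vigilance test (rho),
    otherwise creates a new category. Returns one label per sample.
    """
    weights, labels = [], []
    for x in samples:
        i = np.concatenate([x, 1.0 - x])            # complement coding
        if not weights:
            weights.append(i.copy())
            labels.append(0)
            continue
        scores = [np.minimum(i, w).sum() / (alpha + w.sum()) for w in weights]
        j = int(np.argmax(scores))
        match = np.minimum(i, weights[j]).sum() / i.sum()
        if match >= rho:                             # vigilance test passed
            weights[j] = np.minimum(i, weights[j])   # fast learning (beta = 1)
            labels.append(j)
        else:                                        # mismatch: new category
            weights.append(i.copy())
            labels.append(len(weights) - 1)
    return labels

# Two distant points form two categories; a third near the first joins it.
pts = np.array([[0.9, 0.9], [0.1, 0.1], [0.85, 0.95]])
print(fuzzy_art(pts))  # → [0, 1, 0]
```

In the paper's setting the inputs would be normalized character feature vectors rather than these toy 2-D points.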
IRJET - Language Linguist using Image Processing on Intelligent Transport Sys... (IRJET Journal)
This document summarizes a research paper that proposes a system to detect text in images of traffic signs, extract the text, and translate it to English. The system uses convolutional neural networks (CNNs) to detect text areas and recurrent neural networks (RNNs) to translate the extracted text. The goal is to help travelers understand traffic signs written in unfamiliar languages like Spanish or French by automatically translating the text in images to English. The system performs three steps: 1) detect text areas in images of signs, 2) extract the words from the detected text regions, and 3) translate the extracted text to English for the user.
This document describes a project to create virtual vision glasses to help blind people. The glasses will use optical character recognition, computer vision techniques, text-to-speech, and translation to assist users with daily tasks like reading text, navigating surroundings, and understanding foreign languages. The proposed system will be built using a Raspberry Pi single board computer with a camera, and will include applications for text recognition, translation, and assistance from Google Assistant. It aims to make an affordable assistive device for the blind and help with issues like reading signs, books, and instructions in different languages.
Assistive Examination System for Visually Impaired (Editor IJCATR)
This paper presents a design of a voice-enabled examination system which can be used by visually challenged students. The system uses Text-to-Speech (TTS) and Speech-to-Text (STT) technology. The text-to-speech and speech-to-text web-based academic testing software would provide an interaction for blind students to enhance their educational experiences by providing them with a tool to give exams. This system will aid the differently-abled to appear for online tests and enable them to come at par with the other students. It can also be used by students with learning disabilities or by people who wish to take an examination in a combined auditory and visual way.
Pakistan sign language to Urdu translator using Kinect (CSITiaesprime)
The lack of a standardized sign language, and the inability to communicate with the hearing community through sign language, are the two major issues confronting Pakistan's deaf and dumb society. In this research, we have proposed an approach to help eradicate one of the issues. Now, using the proposed framework, the deaf community can communicate with normal people. The purpose of this work is to reduce the struggles of hearing-impaired people in Pakistan. A Kinect-based Pakistan sign language (PSL) to Urdu language translator is being developed to accomplish this. The system’s dynamic sign language segment works in three phases: acquiring key points from the dataset, training a long short-term memory (LSTM) model, and making real-time predictions using sequences through openCV integrated with the Kinect device. The system’s static sign language segment works in three phases: acquiring an image-based dataset, training a model garden, and making real-time predictions using openCV integrated with the Kinect device. It also allows the hearing user to input Urdu audio to the Kinect microphone. The proposed sign language translator can detect and predict the PSL performed in front of the Kinect device and produce translations in Urdu.
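The dynamic-sign phase above feeds sequences of per-frame keypoints to an LSTM. A small, hypothetical sketch of the data-shaping step (sliding fixed-length windows over a keypoint stream, window length assumed) can be written with NumPy alone:

```python
import numpy as np

def make_sequences(frames, seq_len=30):
    """Stack per-frame keypoint vectors into fixed-length sequences.

    frames: array of shape (n_frames, n_keypoint_values).
    Returns an array of shape (n_sequences, seq_len, n_keypoint_values),
    one sliding window per possible start frame, ready to feed an LSTM.
    """
    n = len(frames) - seq_len + 1
    return np.stack([frames[i:i + seq_len] for i in range(n)])

# Toy stream: 5 frames of 2 keypoint values, windows of length 3.
stream = np.arange(10).reshape(5, 2)
seqs = make_sequences(stream, seq_len=3)
print(seqs.shape)  # → (3, 3, 2)
```

In the described system each frame vector would hold the Kinect/openCV keypoint coordinates, and each window would carry one sign gesture to be classified.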
IRJET- Text Reading for Visually Impaired Person using Raspberry Pi (IRJET Journal)
1) The document presents a system to help visually impaired people read text using optical character recognition and text-to-speech conversion on a Raspberry Pi.
2) The system uses a camera mounted on glasses to take images of text, which are then processed using OCR and converted to audio using e-Speak for the user to hear.
3) It aims to allow visually impaired people to independently read text on product labels, documents and other materials by carrying a portable device that recognizes text in images and converts it to audio in real time.
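On a Raspberry Pi, the OCR-to-audio cycle described above is often wired together from the tesseract and espeak command-line tools. The sketch below shows one plausible way to do that; the file name and command flags are illustrative assumptions, not taken from the paper.

```python
import shutil
import subprocess

def build_pipeline(image_path):
    """Build the OCR command for one capture-to-speech cycle.

    Assumes the tesseract CLI is installed; "stdout" makes it print
    the recognised text instead of writing a file.
    """
    return ["tesseract", image_path, "stdout"]

def read_aloud(image_path):
    """Run OCR on the image and speak the recognised text via e-Speak."""
    text = subprocess.run(
        build_pipeline(image_path), capture_output=True, text=True
    ).stdout.strip()
    if text and shutil.which("espeak"):        # speak only if espeak exists
        subprocess.run(["espeak", text])
    return text

print(build_pipeline("label.png"))  # → ['tesseract', 'label.png', 'stdout']
```

A real-time variant would loop over camera captures, calling read_aloud on each saved frame.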
A survey on Enhancements in Speech Recognition (IRJET Journal)
This document discusses enhancements in speech recognition and provides an overview of the history and basic model of speech recognition. It summarizes key enhancements researchers have made to improve speech recognition, especially in noisy environments. The basic model of speech recognition involves speech input, preprocessing using techniques like MFCCs, classification models like RNNs and HMMs, and output of a transcript. Researchers are working to develop robust speech recognition that can understand speech in any environment.
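The preprocessing stage of the basic model above (the front end before MFCC computation) typically starts with pre-emphasis and framing. A minimal NumPy sketch of those two steps, with standard but assumed parameter values (25 ms frames, 10 ms hop at 16 kHz), is:

```python
import numpy as np

def preemphasize(signal, coeff=0.97):
    """Boost high frequencies: y[t] = x[t] - coeff * x[t-1]."""
    return np.append(signal[0], signal[1:] - coeff * signal[:-1])

def frame_signal(signal, frame_len=400, hop=160):
    """Slice the signal into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    n = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop:i * hop + frame_len] for i in range(n)])

wave = np.random.randn(16000)            # one second of fake 16 kHz audio
frames = frame_signal(preemphasize(wave))
print(frames.shape)  # → (98, 400)
```

Each frame would then be windowed and passed through an FFT and mel filterbank to produce the MFCC features fed to the RNN or HMM classifier.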
IRJET - Mutecom using Tensorflow-Keras Model (IRJET Journal)
This document describes a system called MuteCom that uses machine learning and computer vision to enable communication between deaf or mute people and people who do not understand sign language. The system takes hand gestures as input from a webcam and uses a TensorFlow-Keras model trained on American Sign Language databases to recognize the gestures and output the predicted meaning in text and/or speech. The goal is to bridge the communication gap by allowing deaf/mute people to express themselves easily to others through a mobile application that recognizes their hand signs.
Artificial Intelligence for Speech Recognition (RHIMRJ Journal)
Speech recognition software uses artificial intelligence techniques to transform spoken words into text. It has various applications, such as legal and medical transcription. Automatic speech recognition involves mapping acoustic speech signals to text. However, speech recognition also faces technical challenges, such as differentiating words in continuous speech and accounting for variations in accents and pronunciations. The document discusses the history and various applications of speech recognition technology.
This paper presents the development of a camera-based assistive text-reading framework to help blind persons read text labels and product packaging on hand-held objects in their daily lives. Recent developments in computer vision, digital cameras, and portable computers make it feasible to assist these individuals with camera-based products that combine computer-vision technology with existing commercial products such as optical character recognition (OCR) systems. To automatically extract the text regions from the object, the authors propose an artificial neural network algorithm that learns gradient features of stroke orientations and distributions of edge pixels in an AdaBoost model. Text characters in the localized text regions are binarized for processing, and the characters are recognized by off-the-shelf OCR. The binarized signals are then converted to an audible signal. The working principle is as follows: first, the image is captured and converted to binary signals; the image is then examined to determine whether text is present; if it is, the object of interest is detected, the respective text in the image is recognized, and it is converted to audible signals. Thus the recognized text codes are given as speech to the user.
LIP READING - AN EFFICIENT CROSS AUDIO-VIDEO RECOGNITION USING 3D CONVOLUTION... (IRJET Journal)
This document presents a lip reading system that uses 3D convolutional neural networks for audio-video speech recognition. The system recognizes phrases and sentences from silent video input with or without audio. It extracts lip movements from video frames using face tracking and processes audio files separately. 3D CNNs are used to extract spatial and temporal features from the audio and video streams. The system is evaluated on the MIRACLE-VC1 dataset and achieves accuracy comparable to other recent work on visual speech recognition. Future applications of this lip reading system include improved hearing aids, voice generation for those unable to speak, and biometric authentication.
This paper proposes a system to enable communication between hearing impaired individuals and others using sign language conversion. The system uses motion capture to detect hand gestures and convert them to text. Natural language processing is then used to match the text to predefined sign language datasets. For the matched dataset, a voiceover is provided. The system also allows converting voice to text and displaying corresponding sign language images. This two-way translation aims to serve as an interpreter between deaf and hearing users to facilitate basic communication.
INDIAN SIGN LANGUAGE TRANSLATION FOR HARD-OF-HEARING AND HARD-OF-SPEAKING COM... (IRJET Journal)
This document proposes a system to translate Indian sign language to text and speech in real time using machine learning. It aims to help communication between the hard-of-hearing and hard-of-speaking communities by eliminating the need for interpreters. The system would use a webcam or phone camera to capture sign language gestures, which are then preprocessed and fed into a convolutional neural network model to recognize the signs and output the translation as text. Previous studies that used other methods like CNNs, RNNs, and HMMs to translate sign language are summarized along with their accuracies, ranging from 91.5% to 99%. The proposed system architecture involves capturing gestures, preprocessing, detecting the sign, and displaying the sign-to-text translation.
Abstract
The technology of voice browsing is rapidly evolving these days, because the use of cell phones is increasing at a very high rate compared to connected PCs. Listening and speaking are the natural modes of communication and information gathering. As a result, we are now heading towards a more voice-based approach to browsing rather than operating in textual mode. The command input and the delivery of web contents are entirely in voice. A voice browser is a device that interprets voice input, interprets voice markup languages to generate voice output, and interprets a script which specifies exactly what to present verbally to the user as well as when to present each piece of information. Benefits: voice is a very natural user interface, which speeds up browsing.
IRJET - Sign Language Text to Speech Converter using Image Processing and...IRJET Journal
This document describes a sign language text-to-speech converter system using image processing and convolutional neural networks (CNNs). The system captures images of hand gestures using a camera, applies image processing techniques like thresholding and blurring, and then uses a CNN model trained on a dataset of gestures to recognize the gestures and convert them to text and speech. The system was able to accurately recognize gestures for letters and numbers with about 85% accuracy. Future work may involve expanding the dataset to include more signs and working towards word and sentence recognition.
IRJET- Review on Raspberry Pi based Assistive Communication System for Blind,...IRJET Journal
This document summarizes a research paper that aims to develop an assistive communication device for blind, deaf, and dumb people. The device uses image processing and optical character recognition to convert text images into voice for blind users. It interprets sign language movements with video and converts them to text for deaf users. The device also has GPS tracking and ultrasonic sensors to help users who are in trouble or detect obstacles. The researchers aim to integrate these features on a single low-cost device using a Raspberry Pi microcontroller to help disabled people communicate more easily.
This document summarizes a student project on OCR-based image text-to-speech conversion. It introduces the topic, describes using OCR to extract text from images which is then converted to speech. It includes an abstract, details of related works, and concludes that the proposed system allows visually impaired users to hear text extracted from images efficiently using Tesseract OCR and text-to-speech conversion.
DEVELOPMENT OF AN INTEGRATED TOOL THAT SUMMARRIZE AND PRODUCE THE SIGN LANGUA...IJITCA Journal
This paper presents an intelligent tool to summarize Arabic articles and translate the summary to Arabic
sign language based on avatar technology. This tool was designed for improving web accessibility and help
deaf people for acquiring more benefits. Of the most important problems in this task, is that the deaf people
were facing many difficulties in reading online articles because of their limited reading skills. To solve this
problem the proposed tool includes a summarizer that summaries the articles, so deaf people will read the
summary instead of reading the whole article .Also the tool use avatar technology to translate the summary
to sing language
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELgerogepatton
As digital technology becomes more deeply embedded in power systems, protecting the communication
networks of Smart Grids (SG) has emerged as a critical concern. Distributed Network Protocol 3 (DNP3)
represents a multi-tiered application layer protocol extensively utilized in Supervisory Control and Data
Acquisition (SCADA)-based smart grids to facilitate real-time data gathering and control functionalities.
Robust Intrusion Detection Systems (IDS) are necessary for early threat detection and mitigation because
of the interconnection of these networks, which makes them vulnerable to a variety of cyberattacks. To
solve this issue, this paper develops a hybrid Deep Learning (DL) model specifically designed for intrusion
detection in smart grids. The proposed approach is a combination of the Convolutional Neural Network
(CNN) and the Long-Short-Term Memory algorithms (LSTM). We employed a recent intrusion detection
dataset (DNP3), which focuses on unauthorized commands and Denial of Service (DoS) cyberattacks, to
train and test our model. The results of our experiments show that our CNN-LSTM method is much better
at finding smart grid intrusions than other deep learning algorithms used for classification. In addition,
our proposed approach improves accuracy, precision, recall, and F1 score, achieving a high detection
accuracy rate of 99.50%.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
HEAP SORT ILLUSTRATED WITH HEAPIFY, BUILD HEAP FOR DYNAMIC ARRAYS.
Heap sort is a comparison-based sorting technique based on Binary Heap data structure. It is similar to the selection sort where we first find the minimum element and place the minimum element at the beginning. Repeat the same process for the remaining elements.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Literature Review Basics and Understanding Reference Management.pptxDr Ramhari Poudyal
Three-day training on academic research focuses on analytical tools at United Technical College, supported by the University Grant Commission, Nepal. 24-26 May 2024
A review on techniques and modelling methodologies used for checking electrom...nooriasukmaningtyas
The proper function of the integrated circuit (IC) in an inhibiting electromagnetic environment has always been a serious concern throughout the decades of revolution in the world of electronics, from disjunct devices to today’s integrated circuit technology, where billions of transistors are combined on a single chip. The automotive industry and smart vehicles in particular, are confronting design issues such as being prone to electromagnetic interference (EMI). Electronic control devices calculate incorrect outputs because of EMI and sensors give misleading values which can prove fatal in case of automotives. In this paper, the authors have non exhaustively tried to review research work concerned with the investigation of EMI in ICs and prediction of this EMI using various modelling methodologies and measurement setups.
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsVictor Morales
K8sGPT is a tool that analyzes and diagnoses Kubernetes clusters. This presentation was used to share the requirements and dependencies to deploy K8sGPT in a local environment.
Journal of Science and Technology (JST)
Volume 2, Issue 4, April 2017, PP 33-41
www.jst.org.in ISSN: 2456-5660

Smart Multilingual Sign Boards

Pravin A. Dhulekar (Asst. Professor, Department of Electronics and Telecommunication Engineering, Sandip Institute of Technology and Research Center, Sandip Foundation, Nashik, India)
Niharika Prajapati, Tejal A. Tribhuvan, Karishma S. Godse (Students, Department of Electronics and Telecommunication Engineering, Sandip Institute of Technology and Research Center, Sandip Foundation, Nashik, India)
Abstract: This project covers the design and implementation of a smart hybrid system for street sign board recognition and text and speech conversion through character extraction and symbol matching. The default language used to pronounce signs on street boards is English. We propose a novel method to convert each identified character or symbol into multiple languages such as Hindi, Marathi, and Urdu. The project is useful to many groups: the visually impaired, tourists, the illiterate, and everyone who travels. The system pronounces the recognized sign in different languages and displays it on screen. The project takes a multidisciplinary approach, spanning computer vision, speech processing, and the Google Cloud Platform (GCP). Computer vision is used for character and symbol extraction from sign boards; speech processing for text-to-speech conversion; and GCP for translating the extracted text into multiple languages. Further programming handles real-time pronunciation and display of the desired output.
Keywords - Street Sign Board Recognition, Character Extraction, GCP, Symbol Matching, Computer Vision
I. INTRODUCTION
Communication is a vital tool in human life and an essential requirement for survival. Even before languages developed, communication existed in the form of signs and gestures. To overcome difficulties faced in communication, this paper implements a real-time system with feature/character/symbol extraction, text-to-speech conversion, and translation of the text into different languages. It starts by capturing an image, whether color, grayscale, or monochrome, via a video acquisition device; the image is stored in a standard file format (such as BMP, JPEG, or TIFF). The stored image is then read by the software and feature extraction is performed. After character recognition, the text is displayed and a speech signal is generated that can be heard through speakers. GCP is used as the real-time language translation platform. It converts the extracted characters into languages chosen according to the user, e.g. English, Hindi, Marathi, or Urdu; since we use the Google Cloud Platform, there is no limit on the languages that can be displayed. The complete system is very helpful for the visually impaired, the illiterate, tourists, and everyone who travels.
To make the cities of India smart cities, we must focus on their needs and the greatest opportunities to improve lives: increased road safety, accident avoidance, and safe driving.
II. LITERATURE REVIEW
A recent survey estimates that 285 million people are visually impaired worldwide: 39 million are blind and 246 million have low vision. Of every 100 people living with blindness, 82 are aged 50 or above, and 20% of all such impairments still cannot be prevented or cured. In today's scenario, the overall literacy rate for people aged 15 and above is 86.3%, which means 13.7% of people around the world are still illiterate. From the least literate countries, such as South Sudan and Afghanistan, to the most literate, such as Norway and Cuba, there is everywhere a need for local-language support so that illiterate or rural people can communicate.
Over the past years, tourism has grown rapidly and has been an underlying contributor to economic recovery. Around 1.1 billion people traveled abroad in 2014, and Europe is the most visited area in the world. In 2015, 244 million people, or 3.3% of the world's population, lived outside their country of origin. Immigration rates are predicted to keep increasing, and every time these people travel they face different languages in different places.
Hence, for all such people across the globe, a common solution is implemented: a speech generation system using language translation software that will be useful to everyone. [3] presented work on converting multilingual text into speech for the visually impaired; the basic working principle is μ-law companding, and the system is useful for blind and mute people. [4] presented a paper on natural prosody generation in text-to-voice conversion for English using phonetic integration; this system makes it complicated to record and accumulate words but uses less memory space. [5] worked on a system for converting text into speech; the paper covers recognizing text and converting it into a proper sound signal, with OCR techniques implemented to obtain the output. [6] worked on text-to-voice conversion for the Android environment; the basic objective was extending offline OCR technologies to the Android platform, with a proposed system that detects text at multiple resolutions and converts it into different languages using a language translator. [7] presented the basic idea of interfacing speech to a computer; automatic speech recognition (ASR) played the key role in bringing speech technology to people, and the system is further classified into feature extraction and feature recognition. The aim was to study and explore how a neural network can be implemented to recognize a particular spoken word as an alternative to traditional methods.
[8] presented automatic voice recognition by machine as an easy mode of communication between human and machine. Voice processing has been applied to text-to-speech processing, speech-to-text processing, telephonic call routing, and more. Various feature extraction techniques are discussed, such as relative spectral processing (RASTA) and linear predictive coding (LPC) analysis. [9] presented an innovative, real-time, low-cost technique that lets the user comfortably hear the contents of images or any kind of abstracted document instead of reading them. It combines optical character recognition (OCR) with a text-to-speech (TTS) synthesizer, helping visually impaired people interact easily with machines such as computers through a vocal interface.
[10] proposed a system designed mainly for blind users interested in education, who otherwise study with audio recordings provided by NGOs or with Braille books. The system lets them produce audio material of their own choice from any printed material. It is built around OCR, which performs operations such as thresholding, filtering, and segmentation.
Figure 1: Street Sign Boards
III. PROPOSED TECHNIQUE
To make a hybrid system, the methodology used here combines optical character recognition with language translation. Images are captured via a web camera, text is extracted from the image, and that text is converted into speech. The extracted text is then translated using Google translation software on the cloud. The whole system is thus helpful to everyone, from the visually impaired to tourists, the illiterate, and all people who travel.
IV. ALGORITHM
The text reading system has two main parts: image-to-text conversion and text-to-voice conversion. Both conversions are achieved through programming on the MATLAB platform.
Figure 2: Flowchart
A. Work Flow
1) Image Acquisition Device: A computer vision system processes images acquired from an electronic camera, much as the human vision system processes images derived from the eyes [8]. A camera or webcam is used as the acquisition device; it captures the symbol or sign, providing the sample image for further processing and the symbol for recognition. We also obtain video from this device, from which we can extract the required region of interest and process it further to get the final results.
a) Web Camera:
Figure 3: HP Laptop Web Camera
Technical specifications:
Screen size: 15.6 inches
Screen resolution: 1366 x 768
Frame rate: up to 30 frames per second
2) Optical Character Recognition: This block performs character extraction and symbol matching, as included in the algorithm and explained in detail below. Here all input images or videos captured by the camera or webcam are processed, and feature extraction is done by various techniques. We also perform template matching: if the character or symbol is matched, character/symbol-to-text conversion is performed [5].
Syntax:
txt = ocr(I)
txt = ocr(I, roi)
[ __ ] = ocr( __ ,Name,Value)
Example:
businessCard = imread('businessCard.png');
ocrResults = ocr(businessCard);
recognizedText = ocrResults.Text;
figure;
imshow(businessCard);
text(600, 150, recognizedText, 'BackgroundColor', [1 1 1]);
Steps:
Step 1: Image acquisition: Any type of image (monochrome, grayscale, or color) can be used as input from a video acquisition device such as a web camera, whose primary operation is to sense and capture.
Step 2: Preprocessing: Not mandatory; preprocessing is done when the input image is unclear and requires deblurring, denoising, resizing, or reshaping.
Step 3: Detect MSER regions: Maximally Stable Extremal Regions (MSER) is used as a method of blob detection in images. Mathematically,
let Q1, …, Qi−1, Qi, … be a sequence of nested extremal regions (Qi ⊂ Qi+1). Extremal region Qi* is maximally stable if and only if q(i) = |Qi+∆ \ Qi−∆| / |Qi| has a local minimum at i*, where |·| denotes cardinality and ∆ ∈ S is a parameter of the method.
The criterion checks for regions that remain stable over a range of thresholds: if region Qi+∆ is not significantly larger than region Qi−∆, region Qi is taken as a maximally stable region.
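The stability criterion can be sketched in a few lines: since the regions are nested (Qi−∆ ⊂ Qi+∆), the cardinality of the set difference reduces to a difference of areas. The following is an illustrative Python sketch with a toy list of region areas (the paper's implementation is in MATLAB; the function names here are our own).

```python
def stability(areas, delta):
    """q(i) = (|Q(i+delta)| - |Q(i-delta)|) / |Qi| for each valid index i."""
    return {i: (areas[i + delta] - areas[i - delta]) / areas[i]
            for i in range(delta, len(areas) - delta)}

def maximally_stable(areas, delta):
    """Indices where q has a local minimum, i.e. regions stable over thresholds."""
    q = stability(areas, delta)
    return [i for i in sorted(q)
            if all(q[i] <= q[j] for j in (i - 1, i + 1) if j in q)]

# The region whose area barely grows across thresholds (index 3) is stable.
areas = [10, 12, 20, 21, 22, 40, 80]
print(maximally_stable(areas, delta=1))  # -> [3]
```

With delta = 1, q(3) = (22 − 20) / 21 ≈ 0.095 is a local minimum, so index 3 is reported as maximally stable.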
3) Text-to-Speech Conversion: Basically called "speech synthesis". In this block, the output obtained from OCR, i.e. the matched image, sign, or symbol already converted to text by the symbol-to-text technique, is converted into speech using a text-to-voice converter. This speech is then processed further into various languages, helping the user get information about the area of interest.
4) Language Translation: For this we use the Google Cloud Platform, which lets us use its application program interface at a minimal price. We chose the Google translation API and connected it with MATLAB by creating our own console on GCP and generating an API key, which we use to send data to and get data from the cloud.
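As a sketch of how the extracted text could be handed to the Cloud Translation v2 REST endpoint with such an API key: the endpoint and parameter names below follow Google's v2 REST API, the key is a placeholder, and no request is actually sent here. The paper connects from MATLAB; Python is used purely for illustration.

```python
from urllib.parse import urlencode

# Cloud Translation v2 REST endpoint; authenticated with an API key.
ENDPOINT = "https://translation.googleapis.com/language/translate/v2"

def translate_url(text, target, api_key):
    """Build the GET request URL for translating one string."""
    params = {"q": text, "target": target, "format": "text", "key": api_key}
    return ENDPOINT + "?" + urlencode(params)

# "YOUR_API_KEY" is a placeholder for the key generated on the GCP console.
url = translate_url("No Parking", "hi", "YOUR_API_KEY")
```

Sending this request (e.g. with `urllib.request` or any HTTP client) returns a JSON body whose translated string can then be displayed and synthesized.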
5) Via Laptop: We implemented the complete system on a laptop. We use the laptop's built-in camera to capture images; after preprocessing the captured image, the required results are displayed on the screen and at the same time spelled out by the laptop's speaker.
6) Speakers and Display Device: The laptop speakers and screen deliver the voice in different languages such as Hindi, English, Marathi, and Urdu, giving the user the complete information in his own language. Pronouncing the different languages makes it easy for every user, including visually impaired people, to get the basic information, which is ultimately the goal of the system. The display shows the outputs in the different languages, which is the basic need of the system.
B. Optical Character Recognition
Figure 4: Algorithm of OCR
1) Input Image: Capturing the image, whether on the street while travelling, in a classroom while teaching, or while browsing for information, is done by a video acquisition device such as a web camera. The image is then read with the help of the 'imread' command, which reads an image from a graphics file.
2) Pre-Processing: Pre-processing is not a mandatory step; it is used only when the image is corrupted. It consists of a number of steps to make the raw data usable by the recognizer, mainly removal of noise, deblurring, resizing, and reshaping of the image. The MSER technique is also used for blob detection.
3) Feature Extraction: The objective of feature extraction is to capture the essential characteristics of the symbols. The most precise way of representing a character is by its raster image. Another way is to extract certain features that still characterize the symbols but leave out the unimportant attributes [5].
4) Template Matching/Symbol Matching: A simple form of template matching uses a convolution mask (also called a template) tailored to a feature of the search image that we want to detect. This technique can easily be performed on grey images or edge images. We match the extracted images against the collected data, i.e. the database, where the information is already stored through programming.
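The convolution-mask idea can be illustrated with a toy correlation search: slide the template over the image, score each position by the sum of elementwise products, and take the best-scoring position as the match. The paper works on real grey images in MATLAB; this Python sketch uses small integer grids as stand-ins.

```python
def match_template(image, template):
    """Return the (row, col) offset where the template correlates best."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = None, None
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            # Correlation score: sum of products over the overlapped window.
            score = sum(image[r + i][c + j] * template[i][j]
                        for i in range(th) for j in range(tw))
            if best_score is None or score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

image = [[0, 0, 0, 0],
         [0, 9, 9, 0],
         [0, 9, 9, 0],
         [0, 0, 0, 0]]
template = [[1, 1],
            [1, 1]]
print(match_template(image, template))  # the bright 2x2 block is at (1, 1)
```

Real systems normalize the score (e.g. normalized cross-correlation) so bright regions do not dominate, but the sliding-window search is the same.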
5) Character/Symbol Recognition: Character recognition deals with cropping each line and letter, performing correlation, and writing to a file. Before performing correlation, we load the stored model templates so that the letters can be matched against them. After character-by-character segmentation, each character image is stored in a structure. For pattern recognition, the SURF (Speeded Up Robust Features) technique is used to match the scene image with the box image, with a max ratio of 0.65 and a threshold of a minimum of 25 interest points.
6) Character/Symbol-to-Speech Conversion: First, the text, including symbols like numbers or abbreviations, is transformed into its equivalent written-out words. This process is often called text normalization, pre-processing, or tokenization [9].
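Text normalization of this kind can be sketched as a token-by-token lookup that expands digits and common abbreviations into words before synthesis. The lookup tables below are illustrative, not the paper's actual rules.

```python
# Illustrative expansion tables (not the paper's actual rule set).
DIGITS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
          "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}
ABBREV = {"St.": "Street", "Rd.": "Road", "No.": "Number"}

def normalize(text):
    """Expand numbers and abbreviations into written-out words."""
    words = []
    for token in text.split():
        if token in ABBREV:
            words.append(ABBREV[token])
        elif token.isdigit():
            # Speak numbers digit by digit, as on many sign boards.
            words.extend(DIGITS[d] for d in token)
        else:
            words.append(token)
    return " ".join(words)

print(normalize("Speed Limit 40"))  # -> "Speed Limit four zero"
```

The normalized string is what the text-to-voice converter actually pronounces.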
7) Display: The display shows the text: when an image is captured by the camera or any type of scanner, the system processes it, converts the symbols and characters into text, and displays the text on the screen.
C. Pattern Recognition (Algorithm)
The SURF feature extraction technique is used for pattern-based recognition. Template images are stored in advance in the database as the 'box image', and the image captured through the camera is the 'scene image'. Interest points are first obtained from the scene image and then matched against the box image with a minimum threshold of 25 points; the max ratio is set to 0.65 to get the most accurate results. The matched result is shown by the matching lines in the figure below, with the red and green points marking the interest points. The SURF algorithm is as follows:
1. Convert images to grayscale.
2. Extract SURF interest points.
3. Obtain features.
4. Match features.
5. Get the 50 strongest features.
6. Match the number of strongest features with each image.
7. Take the ratio of number of features matched to number of strongest features (threshold: 25 matched points).
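Under one reading of steps 5-7 (the original wording is ambiguous), a match is accepted when at least 25 of the 50 strongest features match, i.e. a ratio of at least 0.5. A minimal sketch of that decision rule, in Python for illustration:

```python
def accept_match(num_matched, num_strongest=50, min_points=25):
    """Decide whether a scene/box image pair counts as a recognized sign.

    num_matched: features of the scene image matched against the box image,
    taken from the num_strongest strongest SURF features. Accepts when the
    matched count reaches min_points (here 25, as in the paper).
    """
    ratio = num_matched / num_strongest
    return num_matched >= min_points, ratio

ok, ratio = accept_match(30)
print(ok, ratio)  # True 0.6
```

The 0.65 "max ratio" mentioned earlier is a separate parameter of the feature-matching step itself (it controls how ambiguous an individual feature match may be), not this acceptance threshold.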
Figure 5: Matching of Box & Scene image
V. MATHEMATICAL ANALYSIS
1. Standard Data Set
These are the standard database images, which show 100% accuracy; the following table gives each image's response time, which is very low.
Table 1: Standard Based
Sr. No. | Name on Sign Board    | Accuracy | Response Time
1       | Horns Prohibited      | 100%     | 0.217834 seconds
2       | No Entry              | 100%     | 0.220310 seconds
3       | No Parking            | 100%     | 0.198523 seconds
4       | Overtaking Prohibited | 100%     | 0.234994 seconds
2. Pattern-Based Data Set
Pattern-based recognition shows the following accuracy and response times, demonstrating good accuracy on the pattern- or symbol-based sign boards processed.
Table 2: Pattern Based
Sr. No. | Name on Sign Board   | Accuracy | Response Time
1       | Men At Work          | 99%      | 0.298642 seconds
2       | School Ahead Go Slow | 97%      | 0.383578 seconds
3       | No U Turn            | 96%      | 0.379566 seconds
4       | Turn Left            | 98%      | 0.286479 seconds
3. Real-Time Data Set
These are the results for real-time images captured by the user, which show 91% to 95% accuracy; response times are also listed.
Table 3: Real Time Based
Sr. No. | Name on Sign Board    | Accuracy | Response Time
1       | U Turn Prohibited     | 93%      | 0.400865 seconds
2       | Straight Prohibited   | 95%      | 0.389267 seconds
3       | Do Not Honk Your Horn | 91%      | 0.305811 seconds
4       | Overtaking Prohibited | 95%      | 0.477447 seconds
VI. RESULTS AND DISCUSSIONS
The algorithm was implemented and tested, generating the results shown below.
Figure 6 shows the GUI of the complete project run in MATLAB, which includes standard sign board recognition based on both character and pattern recognition; real-time sign board recognition is also performed.
Figure 6: Result Image I
Case 1(a): The result shows that every character in the image is extracted successfully, irrespective of font, using OCR; voice is also generated and the text displayed in three different languages.
Figure 7: Result Image II (Case 1(a))
Case 1(b): The result shows how the pattern of the captured image is matched with the template image; voice is also generated and the text displayed in three different languages.
Figure 8: Result Image III (Case 1(b))
Case 2: The result shows that every character in the image is extracted successfully in real time using OCR; voice is also generated in three different languages and displayed on the real-time platform.
Figure 9: Result Image IV (Case 2)
VII. CONCLUSION
This paper is an effort to implement an innovative, robust approach for character extraction and text-to-voice conversion of different images using optical character recognition and text-to-speech synthesis technology. The result is a user-friendly, cost-effective, reliable real-time system. Using this methodology, text can be read from a document, street sign board, web page, or e-book, and synthesized speech can be generated through any system's speakers, whether a computer's or a phone's. In the developed software, computer vision handles everything concerning the characters of each alphabet: extraction, matching, pronunciation, and usage in grammar and dictionary. Speech processing gives robust output for various images. Applications of the system include information browsing for people who cannot read or write. The approach can also be used in part: text conversion alone is possible, and text-to-speech conversion is equally easy. People with visual impairment, visual dyslexia, or complete blindness can use this approach to read documents and books and while travelling. People with vocal impairment or complete loss of speech can use it to turn typed words into vocalization. Tourists facing a language barrier can use it to understand different languages, and people travelling in cars and buses can save time with this feature. Experiments were performed to test the text and speech generation system, and good results were achieved. Symbol extraction works very effectively: using the SURF algorithm, we obtained the required results and can handle sign boards that are in symbol-only format. Across the three methods we obtained 100% accuracy for the standard database, 96% to 99% for the pattern-based set, and 91% to 95% for the real-time data set, all with low response times. The developed system can now deal with any RTO sign board, which is a basic need of society; in this way we also contribute to the concept of the SMART CITY through SMART TRANSPORTATION.
REFERENCES
[1] Preeti Kale, S. T. Gandhe, G. M. Phade, Pravin A. Dhulekar, "Enhancement of Old Images and Documents by Digital Image Processing Techniques", IEEE International Conference on Communication, Information & Computing Technology (ICCICT), Mumbai, India, 16-17 January 2015.
[2] Harshada H. Chitte, Pravin A. Dhulekar, "Human Action Recognition for Video Surveillance", International Conference on Engineering Confluence, Equinox, Mumbai, Oct 2014.
[3] Suraj Mallik, Rajesh Mehra, "Speech to Text Conversion for Visually Impaired Person Using μ-Law Companding", IOSR Journal of Electronics and Communication Engineering (IOSR-JECE), e-ISSN: 2278-2834, p-ISSN: 2278-8735, Volume 10, Issue 6, Ver. II, Nov-Dec 2015.
[4] S. D. Suryawanshi, R. R. Itkarkar, D. T. Mane, "High Quality Text to Speech Synthesizer Using Phonetic Integration", International Journal of Advanced Research in Electronics and Communication Engineering (IJARECE), Volume 3, Issue 2, February 2014.
[5] Pooja Chandran, Aravind S, Jisha Gopinath, Saranya S S, "Design and Implementation of Speech Generation System Using MATLAB", International Journal of Engineering and Innovative Technology (IJEIT), Volume 4, Issue 6, December 2014.
[6] Devika Sharma, Ranju Kanwar, "Text to Speech Conversion with Language Translator under Android Environment", International Journal of Emerging Research in Management & Technology, ISSN: 2278-9359, Volume 4, Issue 6.
[7] Siddhant C. Joshi, A. N. Cheeran, "MATLAB Based Back-Propagation Neural Network for Automatic Speech Recognition", IJAREEIE (An ISO 3297:2007 Certified Organization), Vol. 3, Issue 7, July 2014.
[8] Siddhant C. Joshi, A. N. Cheeran, "A Comparative Study of Feature Extraction Techniques for Speech Recognition System", IJIRSET, ISSN: 2319-8753, Vol. 3, Issue 12, December 2014.
[9] K. Kalaivani, R. Praveena, V. Anjalipriya, R. Srimeena, "Real Time Implementation of Image Recognition and Text to Speech Conversion", IJAERT, Volume 2, Issue 6, September 2014, ISSN: 2348-8190.
[10] Kalyani Mangale, Hemangi Mhaske, Priyanka Wankhade, Vivek Niwane, "Printed Text to Audio Converter Using OCR", International Journal of Computer Applications (0975-8887), NCETACt-2015.
[11] Pravin A. Dhulekar, Niharika Prajapati, Tejal A. Tribhuvan, Karishma S. Godse, "Automatic Voice Generation System After Street Board Identification for Visually Impaired", ICSPICC-2016, Bambori, Jalgaon, Maharashtra, 22-24 December 2016.
[12] Mark Nixon, Alberto Aguado, Feature Extraction and Image Processing, 2nd Edition.
[13] Sunil Jadhav, P. A. Dhulekar, G. M. Phade, "Hybrid Optical Character Recognition Method for Recognition of Text in Images", IEEE Fifth International Conference on Computing of Power, Energy & Communication (ICCPEIC-2016), 20-21 April 2016.