The main focus of this study is to help people with disabilities operate a computer by voice command. The system can operate the entire range of computer functions from the user's voice commands. It makes use of speech recognition technology, which allows the computer system to identify and recognize words spoken by a human through a microphone. The software recognizes spoken words and enables the user to interact with the computer: the user gives commands, and the computer responds by performing tasks, actions, or operations depending on the commands given. For example: opening or closing a file, automating YouTube, searching Google, making a note, or performing calculations with the calculator, all by voice command.
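A minimal sketch of the command-dispatch step such a system needs once a recognizer (for example, the `speech_recognition` package's Google engine) has already produced a text transcript. The command names and handlers below are illustrative, not taken from the study:

```python
# Hypothetical command dispatcher: maps a recognized transcript to an action.
# In a real system the transcript would come from a speech recognizer; here it
# is passed in directly so the routing logic can be shown on its own.

def open_file(name):
    return f"opening {name}"

def google_search(query):
    return f"searching google for {query}"

def make_note(text):
    return f"note saved: {text}"

# Each entry: a trigger prefix and the handler for the rest of the utterance.
COMMANDS = [
    ("open", open_file),
    ("search for", google_search),
    ("make a note", make_note),
]

def dispatch(transcript):
    """Route a lower-cased transcript to the first matching command."""
    text = transcript.lower().strip()
    for prefix, handler in COMMANDS:
        if text.startswith(prefix):
            argument = text[len(prefix):].strip()
            return handler(argument)
    return "command not recognized"
```

Prefix matching keeps the sketch simple; a fuller system would tolerate recognizer errors with fuzzy or keyword-based matching.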
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Online interviews are not new, but in the COVID-19 situation they seem to be the only option. However, assessing a candidate on a video call may not be very effective. An AI-based interview assessment system could prove useful: it would take speech as input and give a detailed analysis of that speech as output. While most current research focuses only on finding sentiment or personality from speech, our system aims to extract multiple kinds of information from the speech and provide a detailed analysis. The analysis would include a detailed report on the person's confidence level, emotional state, speed of speech, frequently repeated words, and the personality reflected by the speech. An interview panel consists of members focusing on different aspects of a candidate's answer: some focus on technical correctness, while others simply want to check communication skills. An AI system that reports on the soft-skills part would reduce the interviewer's workload, letting them focus entirely on the technical correctness of the answer. This could help save the time and resources organizations spend on the hiring process. The intention of creating this system is to assist the interview process and give an analysis report based on the speech input, rather than a verdict on the candidate's selection. Thus, the system could be used not only by interviewers but also by candidates: the detailed report could serve as good feedback for students preparing for interviews, helping them work on their weak points and perform better in further interviews.
Speech Recognized Automation System Using Speaker Identification through Wireless Communication (IOSR Journals)
This paper discusses the methodology for a project named "Speech Recognized Automation System using Speaker Identification through Wireless Communication". The project gives the design of an automation system using wireless communication and speaker recognition implemented in Matlab; the straightforward programming interface of Matlab makes it an ideal tool for the project's speech analysis. The automation system is useful for home appliances as well as in industry. The paper discusses the overall design of a wireless automation system that was built and implemented. The speech recognition centers on recognizing speech commands stored in a Matlab database and matching them against the speaker's incoming voice command. The Mel Frequency Cepstral Coefficient (MFCC) algorithm is used to extract features from the speech and recognize the speaker. The system uses low-power RF ZigBee transceiver modules for wireless communication, which are relatively cheap. It is intended to control lights, fans, and other electrical appliances in a home or office using speech commands such as "Light" or "Fan". Further, if security is not a major concern, the speech processor can control the appliances without speaker identification.
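The MFCC pipeline the abstract names can be sketched compactly in numpy: frame the signal, window it, take the power spectrum, apply a triangular mel filterbank, and decorrelate the log energies with a DCT. The frame sizes, filter count, and coefficient count below are illustrative defaults, not the paper's settings, and a production extractor would add pre-emphasis and liftering:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def dct2(x, n_coeffs):
    """Orthonormal DCT-II along the last axis, keeping n_coeffs coefficients."""
    N = x.shape[-1]
    n = np.arange(N)
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), 2 * n + 1) / (2 * N))
    scale = np.full(n_coeffs, np.sqrt(2.0 / N))
    scale[0] = np.sqrt(1.0 / N)
    return (x @ basis.T) * scale

def mfcc(signal, sr=8000, frame_len=200, hop=80, n_filters=12, n_coeffs=5):
    # 1. Frame the signal and apply a Hamming window to each frame.
    frames = np.array([signal[s:s + frame_len] * np.hamming(frame_len)
                       for s in range(0, len(signal) - frame_len + 1, hop)])
    # 2. Power spectrum of each frame.
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    n_bins = spec.shape[1]
    # 3. Triangular filters spaced evenly on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bin_idx = np.floor((n_bins - 1) * mel_to_hz(mel_points) / (sr / 2.0)).astype(int)
    fbank = np.zeros((n_filters, n_bins))
    for i in range(1, n_filters + 1):
        lo, ctr, hi = bin_idx[i - 1], bin_idx[i], bin_idx[i + 1]
        for k in range(lo, ctr):
            fbank[i - 1, k] = (k - lo) / max(ctr - lo, 1)
        for k in range(ctr, hi):
            fbank[i - 1, k] = (hi - k) / max(hi - ctr, 1)
    # 4. Log filterbank energies, then DCT to get cepstral coefficients.
    energies = np.log(spec @ fbank.T + 1e-10)
    return dct2(energies, n_coeffs)

t = np.linspace(0.0, 1.0, 8000, endpoint=False)
features = mfcc(np.sin(2 * np.pi * 440.0 * t))  # one second of a 440 Hz tone
```

Command matching then reduces to comparing these per-frame feature vectors against the stored templates, typically with dynamic time warping or a statistical model.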
Constructed Model for Micro-content Recognition in Lip Reading Based on Deep Learning (journal BEEI)
Communication between human beings takes several forms, of which one of the best known and most used is speech; because both visual and acoustic sensory perception are involved, speech is considered a multi-sensory process. Micro-contents are small pieces of information that can be used to boost the learning process. Deep learning is an approach that dives into deep layers of structure to learn fine-grained details. The convolutional neural network (CNN) is a deep learning technique that can be employed as a complementary model with micro-learning to handle micro-contents. In this paper, a model for a lip-reading system is proposed together with a video dataset. The model receives micro-contents (the English alphabet) in video as input and recognizes them; the CNN performs two tasks, feature extraction and recognition. The implementation results show efficient recognition accuracy on a varied video dataset containing lip readings of many persons ranging in age from 11 to 63 years, with the proposed model achieving a recognition rate of up to 98%.
Chapter 10: Universal Design, from Dix, Finlay, Abowd and Beale (2004), Human-Computer Interaction, third edition, Prentice Hall. ISBN 0-13-239864-8. http://www.hcibook.com/e3/
The Role of Speech Technology in Biometrics, Forensics and Man-Machine Interface (IJECE, IAES)
Optimism is growing day by day that in the near future our society will witness man-machine interfaces (MMI) driven by voice technology. Computer manufacturers are building voice recognition sub-systems into their new product lines. Although speech-technology-based MMI techniques have been used before, they require gathering and applying deep knowledge of spoken language and its performance during machine-based interaction. Biometric recognition refers to a system that can identify individuals based on their behavioral and biological characteristics. Following the success of fingerprints in forensic science and law enforcement, and with growing concerns relating to border control, banking access fraud, machine access control, and IT security, there has been great interest in using fingerprints and other biological traits for automatic recognition. It is not surprising that biometric systems are playing an important role in all areas of our society; applications include smartphone security, mobile payment, international border control, national citizen registers, and facility access. MMI through speech technology, which includes automated speech/speaker recognition and natural language processing, has a significant impact on all existing businesses based on personal computer applications. With the help of powerful and affordable microprocessors and artificial intelligence algorithms, humans can talk to machines to drive and control all computer-based applications. Today's applications show a small preview of a rich future for voice-based MMI, which may ultimately replace the keyboard and mouse with the microphone for easy access and make machines more intelligent.
USER AUTHENTICATION USING NATIVE LANGUAGE PASSWORDS (IJNSA Journal)
Information security is necessary for any organization. Intrusion prevention is the basic level of security, and it requires user authentication. A user can be authenticated to a machine by passwords. Traditional textual passwords are vulnerable to many attacks, and graphical passwords have been introduced as alternatives to overcome these problems. This paper introduces native language passwords for authentication. A native language character set consists of characters drawn with single or multiple strokes; the user selects one or more characters for the password, and the shape and strokes of the characters are used for authentication.
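The enrollment-and-verification flow behind such a scheme can be illustrated with a toy sketch. The character names, stroke counts, and salt below are entirely made up, and real stroke/shape matching would compare drawn trajectories rather than hash an exact sequence; this only shows the "store a digest, compare on login" pattern:

```python
import hashlib

# Hypothetical native-language character set: each character is described
# here only by a stroke count. These names and counts are illustrative.
STROKES = {"ka": 2, "kha": 3, "ga": 2, "gha": 4}

def enroll(characters):
    """Store only a salted hash of the ordered character/stroke sequence."""
    encoded = "|".join(f"{c}:{STROKES[c]}" for c in characters)
    return hashlib.sha256(("demo-salt" + encoded).encode()).hexdigest()

def authenticate(characters, stored_digest):
    """Re-derive the digest from the login attempt and compare."""
    return enroll(characters) == stored_digest

digest = enroll(["ka", "gha", "ga"])
```

Because the order is part of the encoding, swapping two characters produces a different digest and the attempt is rejected.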
We propose a model for carrying out deep learning based multimodal sentiment analysis. The MOUD dataset is taken for experimentation purposes. We developed two parallel text-based and audio-based models and further fused these heterogeneous feature maps, taken from intermediate layers, to complete the architecture. The performance measures (accuracy, precision, recall and F1-score) are observed to outperform the existing models.
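The fusion step described, joining heterogeneous intermediate feature maps so one head sees both modalities, can be sketched with plain numpy. The batch size, feature dimensions, and the linear head are illustrative stand-ins, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in intermediate feature maps from the two parallel branches;
# the dimensions here are illustrative, not taken from the paper.
text_features = rng.standard_normal((4, 128))   # batch of 4, 128-dim text branch
audio_features = rng.standard_normal((4, 64))   # batch of 4, 64-dim audio branch

# Fuse the heterogeneous feature maps by concatenation so a single
# classification head sees both modalities at once.
fused = np.concatenate([text_features, audio_features], axis=1)

# A small linear head with softmax over two sentiment classes.
weights = rng.standard_normal((fused.shape[1], 2)) * 0.01
logits = fused @ weights
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
```

In a trained network the concatenated vector would feed further dense layers; concatenation is the simplest fusion choice, and attention-based or gated fusion are common alternatives.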
Speaker Independent Visual Lip Activity Detection for Human-Computer Interaction (eSAT Journals)
Abstract: Recently there has been increased interest in using visual features to improve speech processing, and lip reading plays a vital role in visual speech processing. In this paper, a new approach for lip reading is presented. Visual speech recognition is applied in mobile phone applications and human-computer interaction, and also to recognize the spoken words of hearing-impaired persons. The visual speech video is given as input to a face detection module that detects the face region; the mouth region is then identified from the face region of interest (ROI), and the mouth images are passed to the feature extraction process. Features are extracted using every 10th coordinate, every 16th coordinate, a 16-point + Discrete Cosine Transform (DCT) method, and a lip DCT method. These features are then used as inputs to a Hidden Markov Model for recognizing the visual speech. Of the different feature extraction methods, the DCT method gives the best performance accuracy in the experiments. Ten participants uttered 35 different isolated words, and for each word 20 samples were collected for training and testing. Index Terms: Feature Extraction, HMM, Mouth ROI, DWT, Visual Speech Recognition
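The DCT feature extraction named above, keeping the low-frequency 2-D DCT coefficients of the mouth image as a compact descriptor, can be sketched in numpy. The ROI size and the number of retained coefficients are illustrative choices, not the paper's settings:

```python
import numpy as np

def dct2d(block):
    """Orthonormal 2-D DCT-II of a square image block (numpy only)."""
    N = block.shape[0]
    n = np.arange(N)
    basis = np.cos(np.pi * np.outer(n, 2 * n + 1) / (2 * N))
    scale = np.full(N, np.sqrt(2.0 / N))
    scale[0] = np.sqrt(1.0 / N)
    D = basis * scale[:, None]   # DCT matrix; applied to rows then columns
    return D @ block @ D.T

def dct_features(mouth_roi, k=4):
    """Keep the k x k block of low-frequency coefficients as the feature vector."""
    return dct2d(mouth_roi)[:k, :k].ravel()

# A flat (constant) 16x16 "mouth image" puts all its energy in the DC term.
flat = np.ones((16, 16))
feats = dct_features(flat)
```

Low-frequency coefficients capture the coarse mouth shape while discarding pixel noise, which is why DCT compaction works well as an input to the HMM stage.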
Forensic and Automatic Speaker Recognition System (IJECE, IAES)
Automatic Speaker Recognition (ASR) systems have emerged as an important means of confirming identity in many businesses, e-commerce applications, forensics, and law enforcement. Specialists trained in forensic recognition can perform this task far better by examining a set of acoustic, prosodic, and semantic attributes, a practice referred to as structured listening. Algorithm-based systems for recognizing forensic speakers have been developed by forensic scientists and linguists to reduce the probability of contextual bias when comparing an unknown audio sample against a reference model for a suspect. Many researchers continue to develop automatic algorithms in signal processing and machine learning so that improved performance can effectively establish a speaker's identity, with the automatic system performing on par with human listeners. In this paper, I examine the literature on the identification of speakers by machines and humans, emphasizing the key technical speaker-recognition approaches that have emerged for automatic technology in the last decade. I focus on many aspects of automatic speaker recognition (ASR) systems, including speaker-specific features, speaker models, standard assessment datasets, and performance metrics.
Eat it, Review it: A New Approach for Review Prediction (vivatechijri)
Deep learning has achieved significant improvement in various machine learning tasks. Nowadays, Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) have been growing in popularity for text sequence tasks such as word prediction. The ability to abstract information from images or text is being widely adopted by organizations around the world. A basic task in deep learning is classification, whether of images or text. Current techniques such as RNNs and CNNs have proven to open the door for data analysis, and emerging variants such as Region CNN and Recurrent CNN are under consideration for such analysis. The proposed system uses a Recurrent Neural Network for review prediction, with LSTM used alongside the RNN to predict long sentences. The system focuses on context-based review prediction and produces full-length sentences, helping users write proper reviews by understanding their context.
Software is generally a set of instructions that directs the computer.
Hardware refers to the physical instruments or devices.
Firmware is a type of software used to control hardware devices.
Assistive Examination System for Visually Impaired (Editor IJCATR)
This paper presents the design of a voice-enabled examination system that can be used by visually challenged students. The system uses Text-to-Speech (TTS) and Speech-to-Text (STT) technology. This web-based academic testing software provides an interface for blind students, enhancing their educational experience by giving them a tool for taking exams. The system will aid the differently-abled in appearing for online tests and enable them to come to par with other students. It can also be used by students with learning disabilities or by anyone who wishes to take an examination in a combined auditory and visual way.
Abstract
In this paper we propose a new product in which speech is used to interact with computers. Speech is one of the most powerful forms of human communication. The user can give various voice commands to the system, which the system recognizes and then executes tasks based on the input command. The system provides another form of input (apart from the mouse and keyboard) for daily users and will also be of great assistance to physically challenged users, who can perform by voice all the operations they would normally perform with a mouse and keyboard. The system, however, requires a little training from the user so that it can understand the user better; training is needed because every person has a different voice, and, for example, female and male voices differ substantially. More training results in faster and more accurate responses. Extensive experiments were conducted to check the accuracy and efficiency of the system, and the results show that our system is reliable and achieves better accuracy and efficiency than previous systems.
Keywords: Speech Technology, Voice Response System, Voice User Interface, Voice Recognition.
Control Mouse and Computer System Using Voice Commands (eSAT Journals)
This paper discusses a project named "Control Mouse and Computer System Using Voice Commands". It can be used to operate the entire range of computer functions from the user's voice commands. It makes use of speech recognition technology, which allows the computer system to identify and recognize words spoken by a human through a microphone. The software recognizes the user's spoken words and enables the user to interact with the computer: the user gives commands, and the computer responds by performing tasks, actions, or operations depending on the commands given. It can also take dictation, converting the user's spoken words into text for word-processing applications or e-mail. Examples include opening or closing a file, starting an application, and mouse clicks or movement.
Keywords: Speech Technology, Voice Response System, Voice Commands, Human-Computer Verbal Interaction
A chat application in Java based on socket programming and client-server technology. An accompanying SEPM presentation (PPT) provides guidance on the project life cycle, including the feasibility study and the phases of development that a piece of software passes through.
A Survey on Speech Recognition with Language Specification (ijtsrd)
As a cross-disciplinary field, speech recognition takes speech itself as the object of study. Speech recognition allows a machine to convert the speech signal into text or commands through a process of identification and understanding. It draws on physiology, psychology, linguistics, computer science, and signal processing, is even related to a person's body language, and its goal is natural language communication between man and machine. Speech recognition technology is gradually becoming the key technology of the IT man-machine interface. This paper describes the development of speech recognition technology and its basic principles and methods, reviews the classification of speech recognition systems, speech recognition approaches, and voice recognition technology, and analyzes the problems faced by speech recognition. Dr. Preeti Savant | Lakshmi Sandhya H, "A Survey on Speech Recognition with Language Specification", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6, Issue-3, April 2022. URL: https://www.ijtsrd.com/papers/ijtsrd49370.pdf Paper URL: https://www.ijtsrd.com/computer-science/speech-recognition/49370/a-survey-on-speech-recognition-with-language-specification/dr-preeti-savant
Advanced Computational Intelligence: An International Journal (ACII)aciijournal
The purpose of this research paper is to illustrate the implementation of a Voice Command System. This
system works on the primary input of a user’s voice. Using voice as an input, we were able to convert it to
text using a speech to text engine. The text hence produced was used for query processing and fetching
relevant information. When the information was fetched, it was then converted to speech using speech to
text conversion and the relevant output to the user was given. Additionally, some extra modules were also
implemented which worked on the concept of keyword matching. These included telling time, weather and
notification from social applications.
VOICE COMMAND SYSTEM USING RASPBERRY PIaciijournal
The purpose of this research paper is to illustrate the implementation of a Voice Command System. This
system works on the primary input of a user’s voice. Using voice as an input, we were able to convert it to
text using a speech to text engine. The text hence produced was used for query processing and fetching
relevant information. When the information was fetched, it was then converted to speech using speech to
text conversion and the relevant output to the user was given. Additionally, some extra modules were also
implemented which worked on the concept of keyword matching. These included telling time, weather and
notification from social applications.
Similar to Developing a hands-free interface to operate a Computer using voice command (20)
Harnessing the power of wind: a comprehensive analysis of wind energy potenti...Mohammad Liton Hossain
This study gives a thorough analysis on the wind energy potential in Dhaka, Bangladesh, utilizing data from NASA
Power’s remote sensing tools and weather data from the Bangladesh Meteorological Department (BMD). The wind
speed data collected over a 22 year period at an altitude of 10 m. The results indicate that 3.07 m per second (ms−1)
is Dhaka city’s typical wind speed, while the maximum wind speeds were recorded in June and July. A Weibull
distribution is used to observe the wind data, as well as to calculate the Weibull form parameter of 2.65 and the scale
parameter of 3.43 ms−1. Based on these parameters, the most probable wind speed along with the wind speed
carrying maximum energy were calculated 2.83 ms−1 and 4.28 ms−1, respectively. The highest density of energy
has been found in the month of July with a value of 52.11 W/m2. According to the study, the south is the most
prominent wind direction for Dhaka city. Moreover, the study analyzes the relations between energy density
and other variables, like wind speed, humidity, dry bulb temperature, etc. Positive correlations between energy
density, wind speed, and dry bulb temperature imply that the higher wind speeds and dry bulb temperatures result
in greater energies. The study’s conclusions offer intuitive information about Dhaka City’s potential for wind energy
and can support direct future efforts to pursue this green resource in alignment with the Sustainable Development
Goals (SDGs) of Bangladesh
The main focus of this study is to find appropriate and stable solutions for representing the statistical data into map with some special features. This research also includes the comparison between different solutions for specific features. In this research I have found three solutions using three different technologies namely Oracle MapViewer, QGIS and AnyMap which are different solutions with different specialties. Each solution has its own specialty so we can choose any solution for representing the statistical data into maps depending on our criteria’s.
Development of an Audio Transmission System Through an Indoor Visible Light ...Mohammad Liton Hossain
This study presents an approach to develop an indoor visible light communication system capable of transmitting audio signal over light beam within a short distance. Visible Light Communication (VLC) is a pretty new technology which used light sources to transmit data for communication. In any communication system, both analog and digital signal transmission are possible, though, due to having the capability of providing a faithful quality of signal regeneration after the transmission process, digital communication system is much more popular than the analog one. In the current project, digital communication process was adopted also. To convert the analog audio signal into the digital transmission signal and vice versa, Pulse Width Modulation (PWM) was used as the signal encoding strategy. As the light emitter, white Light Emitting Diodes (LEDs) were used and as photo sensor, a solar cell was used instead of a photodiode to obtain greater signal power and sensitivity. In the system, the carrier signal for transmission was chosen to have a frequency of 50 KHz. At the receiving end, a 4th order Butterworth lowpass filter having a cutoff frequency of 8 KHz was used to demodulate the audio signal. Using only 2 white LEDs, the indoor transmission range of this visible light communication system was found to be 5 meters while reproducing a satisfactory quality audio.
Development of a Low Power Indoor Transmission System with a Dedicated Androi...Mohammad Liton Hossain
This Study demonstrates the design, development and the implementation of a low power, portable Indoor Transmission (campus radio) system using Raspberry Pi which facilitates larger scale implementation at moderate cost. It can be locally or remotely controlled and configured for both education and research purposes. This concept may be extended to implement in large college campuses or in any university by some parameter modifications where the latest happenings in an institute can be informed to the students by tuning to the pre-assigned frequency of an FM receiver system. For smart handling, a dedicated Android app is also developed here.
DESIGN AND IMPLEMENTATION A BPSK MODEM AND BER MEASUREMENT IN AWGN CHANNELMohammad Liton Hossain
Modems, in the beginning, were used mainly to communicate between data terminals and a host computer. Later, the use of modems was extended to communicate between end computers. This required more speed and the data rates increased from 300 bps in early days to 28.8bps today. Today, transmission involves data compression techniques which increase the rates, error detection and error correction for more reliability.
This research includes the design, implementation and simulation of a transmitter, a receiver of a BPSK based system. We implement a BPSK modem and BER measurement with an AWGN channel. We detect and count Error by this modem. With increasing SNR we reduce BER by the BPSK Modem.
DEVELOPMENT OF AN ALPHABETIC CHARACTER RECOGNITION SYSTEM USING MATLAB FOR BA...Mohammad Liton Hossain
Character recognition technique, associates a symbolic identity with the image of the character, is an important area in pattern recognition and image processing. The principal idea here is to convert raw images (scanned from document, typed, pictured etcetera) into editable text like html, doc, txt or other formats. There is a very limited number of Bangla Character recognition system, if available they can’t recognize the whole alphabet set. Motivated by this, this paper demonstrates a MATLAB based Character Recognition system from printed Bangla writings. It can also compare the characters of one image file to another one. Processing steps here involved binarization, noise removal and segmentation in various levels, features extraction and recognition.
The main focus of this study is to find appropriate and stable solutions for representing the statistical data into map with some special features. This research also includes the comparison between different solutions for specific features. In this research I have found three solutions using three different technologies namely Oracle MapViewer, QGIS and AnyMap which are different solutions with different specialties. Each solution has its own specialty so we can choose any solution for representing the statistical data into maps depending on our criteria’s.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two type of water scarcity. One is physical. The other is economic water scarcity.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...Amil Baba Dawood bangali
Contact with Dawood Bhai Just call on +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours and with 101% guarantee and with astrology systematic. If you want to take any personal or professional advice then also you can call us on +92322-6382012 , ONLINE LOVE PROBLEM & Other all types of Daily Life Problem's.Then CALL or WHATSAPP us on +92322-6382012 and Get all these problems solutions here by Amil Baba DAWOOD BANGALI
#vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore#blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabainlahore #amilbabaingujranwalan #amilbabainislamabad
Developing a hands-free interface to operate a Computer using voice command
International Journal of Scientific and Research Publications, Volume 11, Issue 3, March 2021
ISSN 2250-3153
This publication is licensed under Creative Commons Attribution CC BY.
http://dx.doi.org/10.29322/IJSRP.11.03.2021.p11166 www.ijsrp.org
Developing a hands-free interface to operate a Computer
using voice command
Mohammad Liton Hossain*, Md. Noushad Rana**
* Department of ECE, Institute of Science and Technology
** Department of CSE, Institute of Science and Technology
DOI: 10.29322/IJSRP.11.03.2021.p11166
http://dx.doi.org/10.29322/IJSRP.11.03.2021.p11166
Abstract- The main focus of this study is to help persons with disabilities operate a computer by voice command. The system can be used to perform the full range of computer functions through the user's voice. It makes use of speech recognition technology, which allows the computer system to identify and recognize words spoken by a human through a microphone. The software recognizes spoken words and enables the user to interact with the computer: the user gives commands, and the computer responds by performing the corresponding tasks, actions or operations. Examples include opening or closing a file on the computer, YouTube automation by voice command, Google search by voice command, making a note by voice command, and performing calculations with the calculator by voice command.
Index Terms- Google search, Human-Computer Verbal
Interaction, Speech Recognition Technology, Speech
Technology, Voice Commands, VAD, Voice Response System.
I. INTRODUCTION
This study focused on controlling a computer by voice command. It is useful for persons with disabilities who cannot otherwise operate a computer; they can operate their computer by voice command.
They can browse any website by voice command. The system includes the Wikipedia library, so they can retrieve any information from Wikipedia. They can access files by voice command just as other users access files with a mouse.
Moreover, the system is designed so that persons with disabilities can carry out operations in a smooth and effective manner. The interface is kept as simple as possible to avoid errors while entering voice command data, and it gives an error message when it cannot catch a voice command. No formal knowledge is needed for users to operate the system, which makes it user friendly. As described above, this can lead to a secure and reliable system.
The system will do the following activities:
Greeting message
Get voice command
Voice recognition
Converting speech to text
Checking stored data
Reporting
Take action
II. LITERATURE REVIEW
Human-computer interfaces facilitate communication, assist in the exchange of information, process commands and controls, and perform several additional interactions. Spoken natural language is a more user-friendly means of interacting with a computer. From the human perspective, this kind of interaction is easier since it does not require humans to learn additional interaction methods; humans can rely on natural ways of communication instead.
Human-computer interaction varies from understanding simple
commands to extracting all the information in the speech signal
such as words, meanings and emotions of the user. To develop an
interface with natural language understanding ability, several
factors arise and must be taken into account such as dealing with
the ungrammatical nature of many spoken utterances, the
detection of problems in speech recognition, and the design of
intelligent clarification dialogues.
Speech recognition systems can be divided into a number of classes based on their ability to recognize different words. A few of these classes are described below:
Isolated Speech
Isolated-word recognition usually requires a pause between two utterances. This does not mean the system accepts only single words, but that it requires one utterance at a time.
Connected Speech
Connected words or connected speech is similar to
isolated speech, but allows separate utterances with
minimal pauses between them.
Continuous Speech
Continuous speech allows the user to speak almost
naturally, and is also called computer dictation.
A. How Speech Recognition Works
Early systems were limited to a single speaker and had vocabularies of only about a dozen words. Modern speech recognition systems have come a long way since those early counterparts: they can recognize speech from multiple speakers and have enormous vocabularies in numerous languages.
The first component of speech recognition is, of course, speech.
Speech must be converted from physical sound to an electrical
signal with a microphone, and then to digital data with an analog-
to-digital converter. Once digitized, several models can be used
to transcribe the audio to text.
Most modern speech recognition systems rely on what is known
as a Hidden Markov Model (HMM). This approach works on the
assumption that a speech signal, when viewed on a short enough
timescale (say, ten milliseconds), can be reasonably
approximated as a stationary process—that is, a process in which
statistical properties do not change over time.
In a typical HMM, the speech signal is divided into 10-
millisecond fragments. The power spectrum of each fragment,
which is essentially a plot of the signal’s power as a function of
frequency, is mapped to a vector of real numbers known
as spectral coefficients. The dimension of this vector is usually small, sometimes as low as 10, although more accurate systems may have dimension 32 or more. The final output of the HMM is a sequence of these vectors.
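As a rough sketch (not taken from the paper), the framing and power-spectrum step described above can be reproduced with NumPy. The sample rate and the test tone below are invented values for illustration:

```python
import numpy as np

def frame_power_spectra(signal, sample_rate, frame_ms=10):
    """Split `signal` into non-overlapping frames of `frame_ms` milliseconds
    and return the power spectrum (|FFT|^2) of each frame."""
    frame_len = int(sample_rate * frame_ms / 1000)          # samples per frame
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    # Real-input FFT per frame; squared magnitude gives power per frequency bin
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

# A 1-second 440 Hz tone sampled at 16 kHz yields 100 frames of 160 samples,
# each mapped to 81 frequency bins.
rate = 16000
t = np.arange(rate) / rate
spectra = frame_power_spectra(np.sin(2 * np.pi * 440 * t), rate)
```

In a real recognizer these raw spectra would be further reduced (e.g. to mel-frequency cepstral coefficients) before reaching the HMM.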
To decode the speech into text, groups of vectors are matched to
one or more phonemes—a fundamental unit of speech. This
calculation requires training, since the sound of a phoneme varies
from speaker to speaker, and even varies from one utterance to
another by the same speaker. A special algorithm is then applied
to determine the most likely word (or words) that produce the
given sequence of phonemes.
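The "special algorithm" referred to here is typically the Viterbi algorithm. A toy version over an invented two-phoneme model illustrates the idea; all probabilities below are made up for the example:

```python
import numpy as np

def viterbi(log_emit, log_trans, log_init):
    """Most likely hidden-state (phoneme) sequence given per-frame emission
    log-likelihoods, transition log-probabilities, and initial log-probabilities."""
    n_frames, n_states = log_emit.shape
    score = log_init + log_emit[0]                 # best log-score ending in each state
    back = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        cand = score[:, None] + log_trans          # cand[i, j]: score of moving i -> j
        back[t] = np.argmax(cand, axis=0)          # best predecessor for each state j
        score = cand[back[t], np.arange(n_states)] + log_emit[t]
    # Trace the best path backwards through the stored predecessors
    path = [int(np.argmax(score))]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Two phonemes, four frames: emissions strongly favour the sequence 0, 0, 1, 1.
log_emit = np.log(np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]))
log_trans = np.log(np.array([[0.7, 0.3], [0.3, 0.7]]))
log_init = np.log(np.array([0.5, 0.5]))
best = viterbi(log_emit, log_trans, log_init)      # -> [0, 0, 1, 1]
```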
One can imagine that this whole process may be computationally
expensive. In many modern speech recognition systems, neural
networks are used to simplify the speech signal using techniques
for feature transformation and dimensionality
reduction before HMM recognition. Voice activity detectors
(VADs) are also used to reduce an audio signal to only the
portions that are likely to contain speech. This prevents the
recognizer from wasting time analyzing unnecessary parts of the
signal.
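A minimal energy-based VAD can be sketched as follows. This is an illustrative simplification, not the detector used by production recognizers, and the threshold value is an assumption:

```python
import numpy as np

def energy_vad(signal, sample_rate, frame_ms=10, threshold=0.01):
    """Mark each `frame_ms` frame as likely-speech when its mean power
    exceeds `threshold`. Returns a boolean mask, one entry per frame."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.mean(frames ** 2, axis=1)      # mean power per frame
    return energy > threshold

# 0.5 s of silence followed by 0.5 s of a tone standing in for speech.
rate = 8000
silence = np.zeros(rate // 2)
tone = 0.5 * np.sin(2 * np.pi * 300 * np.arange(rate // 2) / rate)
mask = energy_vad(np.concatenate([silence, tone]), rate)
```

Only the frames flagged `True` would be passed on to the recognizer, which is what saves the analysis time described above.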
III. SYSTEM ANALYSIS AND FEASIBILITY STUDY
A. System Analysis
System analysis is the process of gathering and interpreting facts,
diagnosing problems and using the information to recommend
improvements on the system. System analysis is a problem-
solving activity that requires intensive communication between
the system users and system developers. System analysis or study
is an important phase of any system development process. The
system is viewed as a whole, the inputs are identified and the
system is subjected to close study to identify the problem areas.
The solutions are given as a proposal. The proposal is reviewed
on user request and suitable changes are made. This loop ends as
soon as the user is satisfied with the proposal.
B. EXISTING SYSTEM
The current voice command systems or assistants for computers are:
Costly
Less user-friendly
Not well suited for persons with disabilities
Limited to single-instruction modules
Complex in their voice commands
C. PROPOSED SYSTEM
The aim of this study is to develop a voice command system that helps to operate a computer. The system brings voice control of a computer to persons with disabilities and can be used for all types of computer operations by voice command.
The system eliminates the complexity of operating a computer
It is user-friendly
Its voice recognition technology is almost 90% accurate
It is easy to use
It is well suited for persons with disabilities
D. Feasibility study
The feasibility study of the system is a very important stage during system design. A feasibility study tests a system proposal according to its workability, its impact on the organization, its ability to meet user needs, and its effective use of resources. The feasibility study decides whether the system can be properly developed. The five types of feasibility listed below were analyzed in this study:
Technical Feasibility
Time Schedule feasibility
Operational feasibility
Implementation feasibility
Economic Feasibility
Considering all the feasibility steps this proposed system meets
all the feasibility requirements.
IV. PROJECT DESIGN
Software design is the stage in the software engineering process
at which an executable software system is developed. For some
simple systems, software design is software engineering, and all
other activities are merged with this process. However, for large
systems, software design is only one of a set of processes
(requirements engineering, verification and validation, etc.)
involved in software engineering. Software design is a creative
activity in which you identify software components and their
relationships, based on a customer's requirements. A design is in
the programmer's head or roughly sketched on a whiteboard or
sheets of paper. Design is about how to solve a problem, so there
is always a design process. In the design part, this study focused on designing the data model and software model of the system.
A. Use Case Diagram of proposed system:
Figure 1: Use case diagram of proposed system
B. Flowchart of proposed system:
Figure 2: Flowchart of proposed system
C. Data Flow Diagram (DFD):
The DFD (Data Flow Diagram), also known as a bubble chart, is a simple graphical notation that can be used to represent a system in terms of the input data to the system, the various processing carried out on that data, and the output data generated by the system.
0-Level DFD:
Figure 3: 0-Level DFD
1-Level DFD
Figure 4: 1-Level DFD
2-Level DFD:
Figure 5: 2-Level DFD
V. IMPLEMENTATION
Software engineering includes all of the activities involved in
software development from the initial requirements of the system
through to maintenance and management of the deployed system.
A critical stage of this process is, of course, system implementation, where this study produced an executable version of the software. Implementation may involve developing programs in high-level or low-level programming languages, or tailoring and adapting generic, off-the-shelf systems to meet the specific requirements of an organization.
A. Software installation
This project used the PyCharm IDE for Python programming.
Python installation
PyCharm installation
B. Dependencies installation
pyttsx3
SpeechRecognition
pyautogui
Selenium
wikipedia
keyboard
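Assuming the PyPI package names above, the dependencies can be installed with pip (a setup fragment; PyAudio is additionally required for SpeechRecognition's microphone input):

```shell
pip install pyttsx3 SpeechRecognition pyautogui selenium wikipedia keyboard
pip install PyAudio
```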
C. Code Implementation
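The paper does not reproduce its code, but a minimal sketch of the core loop using the listed libraries might look like the following. The command phrases and dispatch logic are illustrative assumptions, not the authors' exact implementation; imports for the audio stack are kept inside the functions so the keyword dispatch can be exercised on its own:

```python
def speak(text):
    """Read `text` aloud with the pyttsx3 text-to-speech engine."""
    import pyttsx3                      # audio stack imported lazily
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

def listen_once():
    """Capture one utterance from the microphone and return it as text."""
    import speech_recognition as sr     # audio stack imported lazily
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio)   # Google Web Speech API
    except sr.UnknownValueError:
        return ""                       # caller reports "could not catch that"

def handle_command(text):
    """Map recognized text to an action name (keyword phrases are invented)."""
    text = text.lower()
    if "google search" in text:
        return "search"
    if "open" in text:
        return "open"
    if "note" in text:
        return "note"
    return "unknown"
```

In use, the loop would be: `speak("greeting")`, then repeatedly `handle_command(listen_once())`, matching the activity list in Section I.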
VI. TESTING
Software testing is an activity to check whether the actual results match the expected results and to ensure that the software system is defect free. It involves executing a software or system component to evaluate one or more properties of interest. Software testing also helps to identify errors, gaps or missing requirements relative to the actual requirements. Testing can be done either manually or using automated tools.
A. Computer Folder Traversing
Input Voice Command: “Open My Computer”
Input Voice Command: “Go Down”
Figure 7: Folder Traversing (Go Down)
Input Voice Command: “Press Enter”
Figure 8: Folder Traversing (Press Enter)
B. Making a Note:
Input voice command: “make a note”
Output voice: “do you want to make a new note, or open an existing note?”
Input voice command: “make a new note”
Output voice: “give a name”
Input voice command: “personal info”
Input voice command: “what is your name”
Input voice command: “my name is noushad”
Figure 9: Making a Note
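A hypothetical sketch of how the "make a note" dialogue above could map onto file handling. The file naming and location are assumptions, not the authors' code:

```python
import os
import tempfile

def make_note(name, lines, folder):
    """Append each dictated line to a text file named after the note."""
    path = os.path.join(folder, name.replace(" ", "_") + ".txt")
    with open(path, "a", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")
    return path

# Re-create the dialogue from the test above: a note named "personal info"
# holding the two dictated lines.
folder = tempfile.mkdtemp()
path = make_note("personal info", ["what is your name", "my name is noushad"], folder)
```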
VII. FUTURE WORK
This project has enormous potential. Future work includes:
Integrating AI into the project
Making a user-friendly UI
VIII. CONCLUSION
The proposed system is capable of controlling a computer using voice commands. It uses Python libraries for text to speech, speech recognition, web automation, and mouse and keyboard control. The system can open the My Computer root folder, traverse any folder on the computer, open a browser and search in it, make a note as the user wishes, and open any exe file and close it again.
REFERENCES
[1] “Voice recognition, definition of”. WebFinance Inc. Archived from the original on 3 December 2011. Retrieved 21 February 2012.
[2] https://towardsdatascience.com/build-your-first-voice-assistant-85a5a49f6cc1
[3] https://realpython.com/python-speech-recognition/
[4] https://www.geeksforgeeks.org/speech-recognition-in-python-using-google-speech-api/
[5] https://www.python.org/doc/
AUTHORS
First Author – Mohammad Liton Hossain, Assistant Professor,
Department of ECE, Institute of Science and Technology (IST)
Email: litu702@gmail.com
Second Author – Md. Noushad Rana, Student, Institute of
Science and Technology.