It is easy for a human to recognize a familiar voice, but using a computer program to distinguish one voice from others is a herculean task. The difficulty lies in developing an algorithm that recognizes a human voice: it is impossible to say a word in exactly the same way on two different occasions, and computer analysis of speech yields different interpretations as the speed of delivery varies. This paper describes in detail the process behind the implementation of an effective voice recognition algorithm. The algorithm uses the discrete Fourier transform to compare the frequency spectra of two voice samples, because the spectrum remains largely unchanged when the speech varies slightly. Chebyshev's inequality is then used to determine whether the two voices came from the same person. The algorithm is implemented and tested in MATLAB.
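The abstract does not say how the spectra are compared or how Chebyshev's inequality enters the decision, so the following is only a minimal sketch of one plausible reading, written in Python rather than the MATLAB the authors used. The function names, the unit-energy normalisation, the Euclidean distance and the threshold factor `k` are all assumptions made for illustration. Chebyshev's inequality states that at most 1/k^2 of any distribution lies k or more standard deviations from its mean, so a distance far outside the range seen for the enrolled speaker is treated as a different voice.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive O(n^2) DFT; returns the magnitude spectrum scaled to unit energy."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    norm = math.sqrt(sum(m * m for m in mags)) or 1.0
    return [m / norm for m in mags]

def spectral_distance(a, b):
    """Euclidean distance between normalised spectra (assumed measure)."""
    sa, sb = magnitude_spectrum(a), magnitude_spectrum(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sa, sb)))

def same_speaker(distance, enrolled_distances, k=3.0):
    """Accept if the distance lies within k standard deviations of the mean
    of distances previously measured for the enrolled speaker (hypothetical
    decision rule motivated by Chebyshev's inequality)."""
    mu = sum(enrolled_distances) / len(enrolled_distances)
    var = sum((d - mu) ** 2 for d in enrolled_distances) / len(enrolled_distances)
    return abs(distance - mu) <= k * math.sqrt(var)
```

Because only magnitudes are compared, a change in loudness or phase leaves the distance near zero, which matches the abstract's point that the spectrum is stable under small variations in delivery.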
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Esophageal Speech Recognition using Artificial Neural Network (ANN) - Saibur Rahman
Esophageal Speech Recognition using Artificial Neural Network (ANN). Our presentation shows how to recognize normal speech and esophageal speech using an ANN. We compared our method with other methods and show that it performs better than they do.
Complete PowerPoint presentation on speech recognition technology.
Very helpful for final-year students preparing a seminar.
The presentation can be used for a final-year seminar.
Speech recognition is a very interesting seminar topic.
Utterance Based Speaker Identification Using ANN - IJCSEA Journal
In this paper we present the implementation of a speaker identification system using an artificial neural network with digital signal processing. The system is designed for text-dependent speaker identification of Bangla speech. Speakers' utterances of specific Bangla words are recorded using an audio wave recorder. Speech features are extracted using digital signal processing techniques. Speaker identification on the frequency-domain data is performed using the backpropagation algorithm. Hamming and Blackman-Harris windows are compared to find which gives better speaker identification performance. Endpoint detection of speech is developed in order to achieve high system accuracy.
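As a rough illustration of two of the signal processing steps named above, the sketch below computes the Hamming and 4-term Blackman-Harris windows from their standard formulas and performs a very simple energy-based endpoint detection. The endpoint detector, its frame length and its threshold are assumptions for illustration; the abstract does not describe the authors' detector.

```python
import math

def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def blackman_harris(n):
    """4-term Blackman-Harris window of length n (n >= 2)."""
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    out = []
    for i in range(n):
        phase = 2 * math.pi * i / (n - 1)
        out.append(a0 - a1 * math.cos(phase)
                   + a2 * math.cos(2 * phase)
                   - a3 * math.cos(3 * phase))
    return out

def find_endpoints(samples, frame_len=4, threshold=0.01):
    """Return (start, end) sample indices of the region whose frame energy
    exceeds the threshold, or None if no frame does (assumed rule)."""
    energies = [sum(x * x for x in samples[i:i + frame_len]) / frame_len
                for i in range(0, len(samples) - frame_len + 1, frame_len)]
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len
```

The Blackman-Harris window tapers almost to zero at its edges, which lowers spectral leakage at the cost of a wider main lobe than the Hamming window; that trade-off is presumably what the comparison in the paper investigates.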
The project was started with the sole aim of recognizing a person's voice by analyzing the speech signal. The simulation is done in MATLAB. The design is based on linear prediction filter coefficients (LPC) and principal component analysis (PCA, via princomp) for the speech signal analysis. Samples are collected by recording male and female speech with a microphone. When the program is run, the analysis part of our MATLAB code processes the recorded speech, and the design should be able to judge whether the recorded speech signal matches the desired output.
Deep Learning for Speech Recognition - Vikrant Singh Tomar - WithTheBest
Tomar discusses the components of speech recognition, the difference between deep learning for speech and images, system architecture, GMM-HMM based systems, deep neural networks in speech, tandem DNN, and hybrids. There's a lot of exciting stuff to talk about in deep learning communities.
Vikrant Singh Tomar, Founder, Fluent.ai
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of engineering science and technology, including new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. The articles published in the journal can be accessed online.
Speech recognition, also known as automatic speech recognition or computer speech recognition, means that the computer understands voice input and performs the required task.
This is a PPT on speech recognition systems, also called automated speech recognition systems. I hope it will be helpful for everyone searching for a presentation on this technology.
Algebraic Fault Attack on the SHA-256 Compression Function - IJORCS
The cryptographic hash function SHA-256 is one member of the SHA-2 hash family, which was proposed in 2000 and was standardized by NIST in 2002 as a successor of SHA-1. Although a differential fault attack on the SHA-1 compression function has been proposed, it seems hard to adapt directly to SHA-256. In this paper, an efficient algebraic fault attack on the SHA-256 compression function is proposed under the word-oriented random fault model. The attack uses an automatic tool, STP, which constructs binary expressions for the word-based operations in the SHA-256 compression function and then invokes a SAT solver to solve the equations. In simulation, the new attack needs about 65 fault injections to recover the chaining value and the input message block, taking about 200 seconds on average. Moreover, based on the attack on the SHA-256 compression function, an almost universal forgery attack on HMAC-SHA-256 is presented. Our algebraic fault analysis is generic and automatic, and can be applied to other ARX-based primitives.
Using Virtualization Technique to Increase Security and Reduce Energy Consump... - IJORCS
This paper presents an approach for creating a secure environment on an Internet-based virtual computing platform while reducing energy consumption in green cloud computing. The proposed approach constantly checks the accuracy of stored data by means of a central control service inside the network environment, and checks system security by isolating single virtual machines using a common virtual environment. The approach has been simulated on two types of Virtual Machine Manager (VMM), Quick EMUlator (Qemu) and HVM (Hardware Virtual Machine) Xen, and the simulation outputs in VMInsight show that when the service is used singly, its performance overhead increases. As a secure system, the proposed approach is able to recognize malicious behaviors and assure service security by means of operational integrity measurement. Moreover, system efficiency has been evaluated according to the energy consumption of five applications (Defragmentation, Compression, Linux Boot Decompression and Kernel Boot). The results indicate that, to secure a multi-tenant environment, managers and supervisors would otherwise have to install a separate security monitoring system for each virtual machine (VM), which creates a heavy management workload, whereas the proposed approach can serve all VMs with just one virtual machine acting as supervisor.
FPGA Implementation of FIR Filter using Various Algorithms: A Retrospective - IJORCS
This paper is a review of FPGA implementations of finite impulse response (FIR) filters with low cost and high performance. Its key contribution is a detailed analysis of hardware implementations of FIR filters using different algorithms, namely Distributed Arithmetic (DA), DA-Offset Binary Coding (DA-OBC), Common Sub-expression Elimination (CSE) and sum-of-power-of-two (SOPOT), which use fewer resources without affecting the performance of the original FIR filter.
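To make the idea behind Distributed Arithmetic concrete, here is a small software sketch, not an FPGA design and not taken from the reviewed papers. It assumes integer coefficients and unsigned input samples that fit in `bits` bits. DA replaces the multipliers of the direct form with a lookup table holding every possible sum of coefficients, addressed by one bit from each tap at a time, followed by shift-and-add.

```python
def fir_direct(coeffs, signal):
    """Direct form: y[n] = sum over k of h[k] * x[n - k]."""
    return [sum(h * signal[n - k] for k, h in enumerate(coeffs) if n - k >= 0)
            for n in range(len(signal))]

def build_da_lut(coeffs):
    """LUT entry at address a is the sum of coefficients whose bit is set in a."""
    return [sum(h for k, h in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << len(coeffs))]

def fir_distributed_arithmetic(coeffs, signal, bits=8):
    """Multiplier-free FIR for unsigned samples below 2**bits (assumed)."""
    lut = build_da_lut(coeffs)
    taps = [0] * len(coeffs)          # taps[k] holds x[n - k]
    out = []
    for x in signal:
        taps = [x] + taps[:-1]        # shift the new sample into the delay line
        acc = 0
        for b in range(bits):
            addr = 0
            for k, t in enumerate(taps):
                addr |= ((t >> b) & 1) << k   # bit b of every tap forms the address
            acc += lut[addr] << b             # weight the partial sum by 2**b
        out.append(acc)
    return out
```

The table grows as 2 to the power of the number of taps, which is why hardware designs usually split long filters into several smaller tables.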
A license plate recognition system is one of the core technologies in intelligent traffic control. In this paper, a new, tunable algorithm that can detect multiple license plates in high-resolution applications is proposed. The algorithm targets the new Iranian plate and the plates of some European countries, which are characterized by a blue area on the plate and by their geometric shape. The algorithm is fast because it avoids heavy pre-processing operations, such as image-enhancement filters, edge detection and noise removal, in the early stages. The method adapts to a model of the plate, namely its blue section, and can therefore detect the plates successfully when several are present in the image. We evaluated our method on two Persian single-vehicle license plate data sets and obtained correct recognition rates of 99.33% and 99% respectively. We further tested the algorithm on a Persian multiple-vehicle license plate data set and achieved a 98% accuracy rate. We also obtained approximately 99% accuracy in the character recognition stage.
Help the Genetic Algorithm to Minimize the Urban Traffic on Intersections - IJORCS
Controlling traffic lights at intersections is one of the main issues in optimizing traffic. Intersections are used to regulate the flow of vehicles and eliminate conflicting traffic flows. Modeling and simulation of traffic are widely used in industry; modeling and simulating an industrial system before building it allows it to be studied economically. The aim of this article is a smart way to control traffic. The first stage of the project collects statistical data (the cycle time of the lights at each intersection and the time vehicles wait at a red light), and the next stage finds optimal values from the collected data. The parameters are optimized by a genetic algorithm (GA). The GA begins with a coding step using binary variables (with the range obtained from the initial data set), starts from an initial population, produces new generations with the genetic operators of mutation and crossover, and finally selects the members with the best fitness values as the solution set. The optimal output has been modeled with Petri nets and implemented in the CPN Tools software. The results indicate a performance improvement in intersection traffic control systems. Evolutionary methods such as genetic algorithms can be applied to data collected at other intersections to reduce the waiting time behind red lights and to determine the appropriate cycle.
Welcoming the research scholars, scientists around the globe in the Open Access Dimension, IJORCS is now accepting manuscripts for its next issue (Volume 4, Issue 4). Authors are encouraged to contribute to the research community by submitting to IJORCS, articles that clarify new research results, projects, surveying works and industrial experiences that describe significant advances in field of computer science.
All paper submissions (http://www.ijorcs.org/submit-paper) are received and managed electronically by IJORCS Team. Detailed instructions about the submission procedure are available on IJORCS website (http://www.ijorcs.org/author-guidelines)
Industrial Energy Management and the Emerging ISO 50001 Standard - Schneider Electric
The energy dilemma poses a significant challenge, especially to U.S.-based industrial end users. How will rising energy prices and increased carbon emissions restrictions impact operational costs in the future? How can energy control and forecasting increase efficiency and build a competitive advantage? This presentation explores energy dashboards as a solution and explains the ISO 50001 standard and what it takes to comply.
Speech to text conversion for visually impaired person using µ law companding - iosrjce
This paper presents the overall design and implementation of a DSP-based speech recognition and text conversion system. Speech is usually the preferred mode of interaction for human beings, and this paper describes converting voice commands into text. We intended to perform the entire speech processing in real time, which involves simultaneously accepting input from the user and using software filters to analyse the data. The comparison is then made using correlation and µ-law companding techniques. Voice recognition is carried out in MATLAB. The voice command is speaker-independent and is stored in the database with the help of function keys. The real-time input speech is processed in the speech recognition system, where the required features of the spoken words are extracted, filtered and matched against the existing samples stored in the database. The required MATLAB processing is then carried out to convert the received data into text form.
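The two techniques named in the abstract can be sketched briefly. The µ-law compression and expansion formulas below are the standard continuous ones with µ = 255; how the authors combine companding with correlation is not stated, so the zero-lag normalised correlation and the `best_match` helper with its template dictionary are hypothetical, and the sketch is in Python rather than MATLAB.

```python
import math

MU = 255.0

def mu_law_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1]: sgn(x) * ln(1 + mu|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)

def normalised_correlation(a, b):
    """Zero-lag correlation of two equal-length sequences, in [-1, 1]."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0

def best_match(sample, templates):
    """Return the label of the stored template that correlates best with the
    companded input (hypothetical matching rule)."""
    companded = [mu_law_compress(x) for x in sample]
    return max(templates,
               key=lambda label: normalised_correlation(companded, templates[label]))
```

Companding boosts quiet parts of the signal relative to loud ones, so low-amplitude detail contributes more to the correlation score than it would on the raw samples.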
Simulation of speech recognition using correlation method on MATLAB software - VaishaliVaishali14
The presentation gives a brief account of the following topics:
INTRODUCTION
VOICE RECOGNITION
TYPES OF VOICE RECOGNITION SYSTEMS
CORRELATION
PROGRAM
PROGRAM EXPLANATION
OUTPUTS
INFERENCE
REFERENCE
Effect of Dynamic Time Warping on Alignment of Phrases and Phonemes - kevig
Speech synthesis and recognition are the basic techniques used for man-machine communication. This type
of communication is valuable when our hands and eyes are busy in some other task such as driving a
vehicle, performing surgery, or firing weapons at the enemy. Dynamic time warping (DTW) is mostly used
for aligning two given multidimensional sequences. It finds an optimal match between the given sequences.
The distance between the aligned sequences should be smaller than that between the unaligned
sequences. The improvement in the alignment may be estimated from the corresponding distances. This
technique has applications in speech recognition, speech synthesis, and speaker transformation. The
objective of this research is to investigate the amount of improvement in the alignment corresponding to the
sentence-based and phoneme-based manually aligned phrases. The speech signals, in the form of twenty-five
phrases, were recorded from each of six speakers (3 males and 3 females). The recorded material was
segmented manually and aligned at sentence and phoneme level. The aligned sentences of different speaker
pairs were analyzed using HNM and the HNM parameters were further aligned at frame level using DTW.
Mahalanobis distances were computed for each pair of sentences. The investigations have shown more than
20 % reduction in the average Mahalanobis distances.
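A minimal sketch of classic dynamic time warping on one-dimensional sequences follows. It uses the absolute difference as the local cost, which is an assumption made to keep the example short; the study itself aligns multidimensional HNM parameter frames and reports Mahalanobis distances.

```python
def dtw_distance(a, b):
    """Minimum accumulated cost of aligning sequence a with sequence b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b held
                                     cost[i][j - 1],      # b advances, a held
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def pointwise_distance(a, b):
    """Unaligned distance between equal-length sequences, for comparison."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

For a contour and a copy delayed by one sample, the warped distance is much smaller than the sample-by-sample distance, which is the effect the abstract describes for aligned versus unaligned sequences.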
Artificial Intelligence - An Introduction - acemindia
Artificial Intelligence is composed of two words, Artificial and Intelligence, where Artificial means "man-made" and Intelligence means "thinking power"; hence AI means "a man-made thinking power."
Artificial Intelligence exists when a machine can have human-like skills such as learning, reasoning, and solving problems.
An Introduction to Various Features of Speech Signal - Speech features - Sivaranjan Goswami
An overview of various temporal, spectral and cepstral features of the speech signal used in digital speech processing.
For more tutorials visit:
https://sites.google.com/site/enggprojectece
This paper reports on an audio-visual client recognition system, implemented in MATLAB, which identifies five clients and can be extended to identify as many clients as it is trained on. The visual recognition part was implemented first, using Principal Component Analysis, Linear Discriminant Analysis and a Nearest Neighbour Classifier. The audio recognition part was then implemented using Mel-Frequency Cepstrum Coefficients, Linear Discriminant Analysis and a Nearest Neighbour Classifier. The system was tested with images and sounds on which it had not been trained, to see whether it could detect an intruder, and it responded precisely to intruders.
Enhancement of DES Algorithm with Multi State Logic - IJORCS
The principal goal in designing any encryption algorithm must be security against unauthorized access or attacks. The Data Encryption Standard (DES) is a symmetric-key algorithm used to secure data. Enhanced DES algorithms work by increasing the key length, using a more complex S-box design, increasing the number of states in which the information is represented, or a combination of these. Increasing the key length increases the number of possible keys, which makes a brute-force attack harder for an intruder. A more complex S-box design gives a good avalanche effect. As the number of states in which the information is represented increases, it becomes harder for an intruder to recover the actual information. The proposed algorithm replaces the predefined XOR operation applied during the 16 rounds of the standard algorithm with a new operation called a “Hash function”, which depends on two keys. One key is used in the “F” function, and the other consists of a combination of 16 states (0,1,2…13,14,15) instead of the ordinary 2-state key (0, 1). This replacement adds a new level of protection strength and more robustness against breaking methods.
Hybrid Simulated Annealing and Nelder-Mead Algorithm for Solving Large-Scale ... - IJORCS
This paper presents a new algorithm for solving large-scale global optimization problems, based on a hybridization of simulated annealing and the Nelder-Mead algorithm. The new algorithm is called the simulated Nelder-Mead algorithm with random variables updating (SNMRVU). SNMRVU starts with a randomly generated initial solution, which is then divided into partitions. A neighborhood zone is generated, a random number of partitions is selected, and a variable-updating process is started in order to generate trial neighbor solutions. This process helps SNMRVU explore the region around the current iterate. The Nelder-Mead algorithm is used in the final stage to improve the best solution found so far and to accelerate convergence. The performance of SNMRVU is evaluated on 27 scalable benchmark functions and compared with four algorithms. The results show that SNMRVU is promising and produces high-quality solutions at low computational cost.
Welcoming the research scholars, scientists around the globe in the Open Access Dimension, IJORCS is now accepting manuscripts for its next issue (Volume 4, Issue 2). Authors are encouraged to contribute to the research community by submitting to IJORCS, articles that clarify new research results, projects, surveying works and industrial experiences that describe significant advances in field of computer science.
To view complete list of topics coverage of IJORCS, Aim & Scope, please visit, www.ijorcs.org/scope
Welcoming the research scholars, scientists around the globe in the Open Access Dimension, IJORCS is now accepting manuscripts for its next issue (Volume 4, Issue 1). Authors are encouraged to contribute to the research community by submitting to IJORCS, articles that clarify new research results, projects, surveying works and industrial experiences that describe significant advances in field of computer science.
Channel Aware Mac Protocol for Maximizing Throughput and FairnessIJORCS
The proper channel utilization and the queue length aware routing protocol is a challenging task in MANET. To overcome this drawback we are extending the previous work by improving the MAC protocol to maximize the Throughput and Fairness. In this work we are estimating the channel condition and Contention for a channel aware packet scheduling and the queue length is also calculated for the routing protocol which is aware of the queue length. The channel is scheduled based on the channel condition and the routing is carried out by considering the queue length. This queue length will provide a measurement of traffic load at the mobile node itself. Depending upon this load the node with the lesser load will be selected for the routing; this will effectively balance the load and improve the throughput of the ad hoc network.
A Review and Analysis on Mobile Application Development Processes using Agile...IJORCS
Over a last decade, mobile telecommunication industry has observed a rapid growth, proved to be highly competitive, uncertain and dynamic environment. Besides its advancement, it has also raised number of questions and gained concern both in industry and research. The development process of mobile application differs from traditional softwares as the users expect same features similar to their desktop computer applications with additional mobile specific functionalities. Advanced mobile applications require assimilation with existing enterprise computing systems such as databases, legacy applications and Web services. In addition, the lifecycle of a mobile application moves much faster than that of a traditional Web application and therefore the lifecycle management associated therein must be adjusted accordingly. The Security and application testing are more stimulating and interesting in mobile application than in Web applications since the technology in mobile devices progresses rapidly and developers must stay in touch with the latest developments, news and trends in their area of work. With the rising competence of software market, researchers are seeking more flexible methods that can adjust to dynamic situations where software system requirements are changing over time, producing valuable software in short duration and within low budget. The intrinsic uncertainty and complexity in any software project therefore requires an iterative developmental plan to cope with uncertainty and a large number of unknown variables. Agile Methodologies were thus introduced to meet the new requirements of the software development companies. The agile methodologies aim at facilitating software development processes where changes are acceptable at any stage and provide a structure for highly collaborative software development. 
Therefore, the present paper aims in reviewing and analysing different prevalent methodologies utilizing agile techniques that are currently in use for the development of mobile applications. This paper provides a detailed review and analysis on the use of agile methodologies in the proposed processes associated with mobile application skills and highlights its benefit and constraints. In addition, based on this analysis, future research needs are identified and discussed.
Congestion Prediction and Adaptive Rate Adjustment Technique for Wireless Sen...IJORCS
In general, nodes in Wireless Sensor Networks (WSNs) are equipped with limited battery and computation capabilities but the occurrence of congestion consumes more energy and computation power by retransmitting the data packets. Thus, congestion should be regulated to improve network performance. In this paper, we propose a congestion prediction and adaptive rate adjustment technique for Wireless Sensor Networks. This technique predicts congestion level using fuzzy logic system. Node degree, data arrival rate and queue length are taken as inputs to the fuzzy system and congestion level is obtained as an outcome. When the congestion level is amidst moderate and maximum ranges, adaptive rate adjustment technique is triggered. Our technique prevents congestion by controlling data sending rate and also avoids unsolicited packet losses. By simulation, we prove the proficiency our technique. It increases system throughput and network performance significantly.
A Study of Routing Techniques in Intermittently Connected MANETsIJORCS
A Mobile Ad hoc Network (MANET) is a self-configuring infrastructure less network of mobile devices connected by wireless. These are a kind of wireless Ad hoc Networks that usually has a routable networking environment on top of a Link Layer Ad hoc Network. The routing approach in MANET includes mainly three categories viz., Reactive Protocols, Proactive Protocols and Hybrid Protocols. These traditional routing schemes are not pertinent to the so called Intermittently Connected Mobile Ad hoc Network (ICMANET). ICMANET is a form of Delay Tolerant Network, where there never exists a complete end – to – end path between two nodes wishing to communicate. The intermittent connectivity araise when network is sparse or highly mobile. Routing in such a spasmodic environment is arduous. In this paper, we put forward the indication of prevailing routing approaches for ICMANET with their benefits and detriments
Improving the Efficiency of Spectral Subtraction Method by Combining it with ...IJORCS
In the field of speech signal processing, Spectral subtraction method (SSM) has been successfully implemented to suppress the noise that is added acoustically. SSM does reduce the noise at satisfactory level but musical noise is a major drawback of this method. To implement spectral subtraction method, transformation of speech signal from time domain to frequency domain is required. On the other hand, Wavelet transform displays another aspect of speech signal. In this paper we have applied a new approach in which SSM is cascaded with wavelet thresholding technique (WTT) for improving the quality of speech signal by removing the problem of musical noise to a great extent. Results of this proposed system have been simulated on MATLAB.
An Adaptive Load Sharing Algorithm for Heterogeneous Distributed SystemIJORCS
Due to the restriction of designing faster and faster computers, one has to find the ways to maximize the performance of the available hardware. A distributed system consists of several autonomous nodes, where some nodes are busy with processing, while some nodes are idle without any processing. To make better utilization of the hardware, the tasks or load of the overloaded node will be sent to the under loaded node that has less processing weight to minimize the response time of the tasks. Load balancing is a tool used effectively for balancing the load among the systems. Dynamic load balancing takes into account of the current system state for migration of the tasks from heavily loaded nodes to the lightly loaded nodes. In this paper, we devised an adaptive load-sharing algorithm to balance the load by taking into consideration of connectivity among the nodes, processing capacity of each node and link capacity.
The Design of Cognitive Social Simulation Framework using Statistical Methodo...IJORCS
Modeling the behavior of the cognitive architecture in the context of social simulation using statistical methodologies is currently a growing research area. Normally, a cognitive architecture for an intelligent agent involves artificial computational process which exemplifies theories of cognition in computer algorithms under the consideration of state space. More specifically, for such cognitive system with large state space the problem like large tables and data sparsity are faced. Hence in this paper, we have proposed a method using a value iterative approach based on Q-learning algorithm, with function approximation technique to handle the cognitive systems with large state space. From the experimental results in the application domain of academic science it has been verified that the proposed approach has better performance compared to its existing approaches.
An Enhanced Framework for Improving Spatio-Temporal Queries for Global Positi...IJORCS
To efficiently process continuous spatio-temporal queries, we need to efficiently and effectively handle large number of moving objects and continuous updates on these queries. In this paper, we propose a framework that employs a new indexing algorithm that is built on top of SQL Server 2008 and avoid the overhead related to R-Tree indexing. To answer range queries, we utilize dynamic materialized view concept to efficiently handle update queries. We propose an adaptive safe region to reduce communication costs between the client and the server and to minimize position update load. Caching of results was utilized to enhance the overall performance of the framework. To handle concurrent spatio-temporal queries, we utilize publish/subscribe paradigm to group similar queries and efficiently process these requests. Experiments show that the overall proposed framework performance was able to outperform R-Tree index and produce promising and satisfactory results.
A PSO-Based Subtractive Data Clustering AlgorithmIJORCS
There is a tremendous proliferation in the amount of information available on the largest shared information source, the World Wide Web. Fast and high-quality clustering algorithms play an important role in helping users to effectively navigate, summarize, and organize the information. Recent studies have shown that partitional clustering algorithms such as the k-means algorithm are the most popular algorithms for clustering large datasets. The major problem with partitional clustering algorithms is that they are sensitive to the selection of the initial partitions and are prone to premature converge to local optima. Subtractive clustering is a fast, one-pass algorithm for estimating the number of clusters and cluster centers for any given set of data. The cluster estimates can be used to initialize iterative optimization-based clustering methods and model identification methods. In this paper, we present a hybrid Particle Swarm Optimization, Subtractive + (PSO) clustering algorithm that performs fast clustering. For comparison purpose, we applied the Subtractive + (PSO) clustering algorithm, PSO, and the Subtractive clustering algorithms on three different datasets. The results illustrate that the Subtractive + (PSO) clustering algorithm can generate the most compact clustering results as compared to other algorithms.
Dynamic Map and Diffserv Based AR Selection for Handoff in HMIPv6 Networks IJORCS
In HMIPv6 Networks, most of the existing handoff decision mechanisms deal mainly with the selection of Mobility Anchor Point (MAP), ignoring the selection of access router (AR) under each MAP. In this paper, we propose a new mechanism called “Dynamic MAP and Diffserv based ARs selection for Handoff in HMIPv6 networks” and it deals with selecting the MAP as well as ARs. MAP will be selected dynamically by checking load, session mobility ratio (SMR), Binding update cost and Location Rate. After selecting the best MAP, the Diffserv approach is used to select the AR under the MAP, based on its resource availability. The AR is implemented at the edge router of Diffserv. DiffServ can be used to provide low-latency to critical network traffic such as voice or streaming media while providing simple best-effort service to non-critical services such as web traffic or file transfers. By using this mechanism, we can assure that better resource utilization and throughput can be attained during Handoff in HMIPv6 networks.
From Physical to Virtual Wireless Sensor Networks using Cloud Computing IJORCS
In the modern world, billions of physical sensors are used for various dedications: Environment Monitoring, Healthcare, Education, Defense, Manufacturing, Smart Home, Agriculture Precision and others. Nonetheless, they are frequently utilized by their own applications and thereby snubbing the significant possibilities of sharing the resources in order to ensure the availability and performance of physical sensors. This paper assumes that the immense power of the Cloud can only be fully exploited if it is impeccably integrated into our physical lives. The principal merit of this work is a novel architecture where users can share several types of physical sensors easily and consequently many new services can be provided via a virtualized structure that allows allocation of sensor resources to different users and applications under flexible usage scenarios within which users can easily collect, access, process, visualize, archive, share and search large amounts of sensor data from different applications. Moreover, an implementation has been achieved using Arduino-Atmega328 as hardware platform and Eucalyptus/Open Stack with Orchestra-Juju for Private Sensor Cloud. Then this private Cloud has been connected to some famous public clouds such as Amazon EC2, ThingSpeak, SensorCloud and Pachube. The testing was successful at 80%. The recommendation for future work would be to improve the effectiveness of virtual sensors by applying optimization techniques and other methods.
Prediction of Atmospheric Pressure at Ground Level using Artificial Neural Ne...IJORCS
Prediction of Atmospheric Pressure is one important and challenging task that needs lot of attention and study for analyzing atmospheric conditions. Advent of digital computers and development of data driven artificial intelligence approaches like Artificial Neural Networks (ANN) have helped in numerical prediction of pressure. However, very few works have been done till now in this area. The present study developed an ANN model based on the past observations of several meteorological parameters like temperature, humidity, air pressure and vapour pressure as an input for training the model. The novel architecture of the proposed model contains several multilayer perceptron network (MLP) to realize better performance. The model is enriched by analysis of alternative hybrid model of k-means clustering and MLP. The improvement of the performance in the prediction accuracy has been demonstrated by the automatic selection of the appropriate cluster.
Ant Colony with Colored Pheromones Routing for Multi Objectives Quality of Se...IJORCS
In this article, we present a new Ant-routing algorithm with colored pheromones and clustering techniques for satisfying users’ Quality of Service (QoS) requirements in Wireless Sensor Networks (WSNs). An important problem is to detect the best route from a source node to the destination node. Moreover, it is considered that the feature of non-uniformly distributed traffic load and possibility existing of the traffic requiring various performances; therefore, it is assumed the different class of traffic required for QoS of communication. In this paper, novel protocol, the suitability of using meta-heuristic an ant colony optimization based on energy saving and multi objectives, the demand of QoS routing protocol for WSN will be very adaptive ,resident power and mainly decrease end-to-end delay. These metrics are used by colored pheromones adapted to the traffic classes. Moreover, we reinforce the proposed method for scalability issue by clustering techniques. We use a proactive route discover algorithms in clusters and reactive discovery mechanism between different clusters. Compared to existing QoS routing protocols, the novel algorithm has been designed for various service categories such as real time (RT) and best effort (BE) traffic, resulted lower packet deadline miss ratio and higher energy efficiency and better QoS and longer lifetime.
Design a New Image Encryption using Fuzzy Integral Permutation with Coupled C...IJORCS
This article introduces a novel image encryption algorithm based on DNA addition combining and coupled two-dimensional piecewise nonlinear chaotic map. This algorithm consists of two parts. In the first part of the algorithm, a DNA sequence matrix is obtained by encoding each color component, and is divided into some equal blocks and then the generated sequence of Sugeno integral fuzzy and the DNA sequence addition operation is used to add these blocks. Next, the DNA sequence matrix from the previous step is decoded and the complement operation to the result of the added matrix is performed by using Sugeno fuzzy integral. In the second part of the algorithm, the three modified color components are encrypted in a coupling fashion in such a way to strengthen the cryptosystem security. It is observed that the histogram, the correlation and avalanche criterion, can satisfy security and performance requirements (Avalanche criterion > 0.49916283). The experimental results obtained for the CVG-UGR image databases reveal the fact that the proposed algorithm is suitable for practical use to protect the security of digital image information over the Internet.
Can “Feature” be used to Model the Changing Access Control Policies? IJORCS
Access control policies [ACPs] regulate the access to data and resources in information systems. These ACPs are framed from the functional requirements and the Organizational security & privacy policies. It was found to be beneficial, when the ACPs are included in the early phases of the software development leading to secure development of information systems. Many approaches are available for including the ACPs in requirements and design phase. They relied on UML artifacts, Aspects and also Feature for this purpose. But the earlier modeling approaches are limited in expressing the evolving ACPs due to organizational policy changes and business process modifications. In this paper, we analyze, whether “Feature”- defined as an increment in program functionality can be used as a modeling entity to represent the Evolving Access control requirements. We discuss the two prominent approaches that use Feature in modeling ACPs. Also we have a comparative analysis to find the suitability of Features in the context of changing ACPs. We conclude with our findings and provide directions for further research.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
Voice Recognition System using Template Matching
International Journal of Research in Computer Science
eISSN 2249-8265 Volume 3 Issue 5 (2013) pp. 13-17
www.ijorcs.org, A Unit of White Globe Publications
doi: 10.7815/ijorcs.35.2013.070
VOICE RECOGNITION SYSTEM USING TEMPLATE
MATCHING
Luqman Gbadamosi
Computer Science Department, Lagos State Polytechnic, Lagos, Nigeria
Email: luqmangbadamosi@yahoo.com
Abstract: It is easy for humans to recognize a familiar
voice, but using computer programs to distinguish one
voice from others is a herculean task. The difficulty
arises in developing an algorithm to recognize the
human voice: it is impossible to say a word exactly the
same way on two different occasions, and computer
analysis of human speech yields different
interpretations depending on the speed of delivery.
This paper describes in detail the process behind the
implementation of an effective voice recognition
algorithm. The algorithm uses the discrete Fourier
transform to compare the frequency spectra of two
voice samples, because the spectrum remains unchanged
when the speech is slightly varied. Chebyshev's
inequality is then used to determine whether the two
voices came from the same person. The algorithm is
implemented and tested in MATLAB.
Keywords: Chebyshev's inequality, discrete Fourier
transform, frequency spectra, voice recognition.
I. INTRODUCTION
Voice recognition, or voice authentication, is an
automated method of identifying the person who is
speaking from the characteristics of their voice.
Voice is one of many forms of biometrics used to
identify an individual and verify their identity.
Humans naturally recognize a familiar voice, but
getting a computer to do the same is a more difficult
task, because it is impossible to say a word exactly
the same way on two different occasions. Advances in
computing capability have led to more effective ways
of recognizing the human voice using feature
extraction. Voice recognition is one of the most
effective biometric techniques and could be used for
telephone banking and for forensic investigation by
law enforcement agencies. [9][10]
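As a small illustration of feature extraction, the sketch below computes the magnitude spectrum of a signal with a textbook discrete Fourier transform, the transform named in the abstract. It is a hedged example rather than the paper's MATLAB implementation: the function name `magnitude_spectrum`, the toy two-tone signal and the 7-sample shift are all assumptions made for illustration. It shows one narrow case of the stability the abstract describes: a circular time shift changes the waveform sample by sample but leaves the magnitude spectrum unchanged.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Return |X[k]| for k = 0..N-1 using the direct O(N^2) DFT definition."""
    n = len(samples)
    spectrum = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * cmath.pi * k * t / n)
        spectrum.append(abs(acc))
    return spectrum


# Toy "utterance" (made-up data): two sinusoids sampled at 64 points.
N = 64
signal = [
    math.sin(2 * math.pi * 5 * t / N) + 0.5 * math.sin(2 * math.pi * 12 * t / N)
    for t in range(N)
]

# The same signal started 7 samples later (a circular shift).
shifted = signal[-7:] + signal[:-7]

spec_a = magnitude_spectrum(signal)
spec_b = magnitude_spectrum(shifted)

# The waveforms differ noticeably, yet the magnitude spectra agree
# up to floating-point rounding.
max_waveform_diff = max(abs(a - b) for a, b in zip(signal, shifted))
max_spectrum_diff = max(abs(a - b) for a, b in zip(spec_a, spec_b))
```

Real speech varies in more ways than a pure shift (speed, loudness, pitch), so this invariance should be read as motivation for comparing spectra, not as a guarantee that two utterances of the same word produce identical spectra.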
A. What is Human Voice?
The voice consists of sound made by a human being
using the vocal folds for talking, singing, laughing,
crying, screaming, etc. The human voice is specifically
that part of human sound production in which the
vocal folds are the primary sound source. The
mechanism for generating the human voice can be
subdivided into three parts: the lungs, the vocal folds
within the larynx, and the articulators. [11]
Figure 1: The spectrogram of the human voice reveals its rich
harmonic content.
B. What is Voice Recognition?
Voice recognition (sometimes referred to as speaker
recognition) is the identification of the person who is
speaking by extracting features of their voice, so that
a questioned voice print can be compared against a known
voice print. In this technology, sounds, words or
phrases spoken by humans are converted into electrical
signals, and these signals are transformed into coding
patterns to which meaning has been assigned. There are
two major applications of voice recognition technologies
and methodologies. The first is voice verification or
authentication: the speaker claims a certain identity
and the voice is used to verify this claim. The second
is voice identification, which is the task of
determining an unknown speaker’s identity. Put another
way, voice verification is one-to-one matching, where
one speaker’s voice is matched to one template or voice
print, whereas voice identification is one-to-many
matching, where the speaker’s voice is compared against
many voice templates.
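The difference between one-to-one and one-to-many matching can be sketched in Python as follows. This is only an illustration: the toy feature vectors, the names `verify` and `identify`, and the threshold value are assumptions made for the example and are not taken from the paper's MATLAB implementation.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(sample, claimed_template, threshold):
    """One-to-one matching: accept the identity claim if the sample is close enough."""
    return distance(sample, claimed_template) <= threshold

def identify(sample, templates):
    """One-to-many matching: return the name of the closest enrolled template."""
    return min(templates, key=lambda name: distance(sample, templates[name]))

# Hypothetical enrolled voice prints (toy 3-dimensional feature vectors).
templates = {"alice": [0.9, 0.1, 0.4], "bob": [0.2, 0.8, 0.5]}
sample = [0.85, 0.15, 0.4]

print(verify(sample, templates["alice"], threshold=0.2))  # checks the claim "alice"
print(identify(sample, templates))                        # best match among all prints
```

Verification needs a threshold because it answers a yes/no question, while identification only ranks the enrolled templates against each other.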
A speaker recognition system has two phases:
enrollment and verification. During enrollment, the
speaker’s voice is recorded and, typically, a number of
features are extracted to form a voice print or
template. In the verification phase, a speech sample or
“utterance” is compared against a previously created
voice print. For identification systems, the utterance is
compared against multiple voice prints in order to
determine the best match, while verification systems
compare an utterance against a single voice print.
Voice recognition systems can also be categorized into
two types: text-dependent and text-independent. [9]
Text-Dependent: The text must be the same for
enrollment and verification. Shared-secret passwords,
PINs or knowledge-based information can be employed in
order to create a multi-factor authentication scenario.
Text-Independent: Text-independent systems are most
often used for speaker identification, as they require
very little cooperation from the speaker. In this case
the text used during enrollment is different from the
text used during verification. In fact, enrollment may
happen without the user’s knowledge, as is the case in
many forensic applications. [9]
C. Voice Recognition Techniques
The most common approaches to voice recognition
can be divided into two classes: template matching
and feature analysis.
Template Matching: Template matching is the
simplest technique and has the highest accuracy when
used properly, but it also suffers from the most
limitations. As with any approach to voice
recognition, the first step is for the user to speak a
word or phrase into a microphone. The electrical
signal from the microphone is digitized by an
analog-to-digital (A/D) converter and stored in
memory. To determine the "meaning" of this voice
input, the computer attempts to match the input with a
digitized voice sample, or template, that has a known
meaning. This technique is closely analogous to
traditional command input from a keyboard: the
program contains the input template and attempts to
match this template with the actual input using a
simple conditional statement. This type of system is
known as "speaker-dependent", and its recognition
accuracy can be about 98 percent.
Feature Analysis: A more general form of voice
recognition is available through feature analysis and
this technique usually leads to "speaker-independent"
voice recognition. Instead of trying to find an exact or
near-exact match between the actual voice input and a
previously stored voice template, this method first
processes the voice input using "Fourier transforms"
or "linear predictive coding (LPC)", then attempts to
find characteristic similarities between the expected
inputs and the actual digitized voice input. These
similarities will be present for a wide range of
speakers, and so the system need not be trained by
each new user. The types of speech differences that
the speaker-independent method can deal with, but
which pattern matching would fail to handle, include
accents and variations in speed of delivery, pitch,
volume, and inflection. Speaker-independent speech
recognition has proven to be very difficult, with some
of the greatest hurdles being the variety of accents and
inflections used by speakers of different nationalities.
Recognition accuracy for speaker-independent
systems is somewhat less than for speaker-dependent
systems, usually between 90 and 95 percent. [12]
This work implements the template matching technique.
This approach has been intensively studied and is also
the backbone of most voice recognition products on
the market.
II. IMPLEMENTATION
A. Design Description
The voice recognition system using the template
matching technique requires the user first to create a
template for comparison by recording 10 samples of the
speaker’s voice saying a phrase; this becomes the known
voice. Thereafter, the questioned speaker’s voice is
recorded and then analyzed using the discrete Fourier
transform.
Discrete Fourier Transform: Voice recognition in the
time domain would be extremely impractical because of
the difficulties explained above. Instead, an analysis
of the frequency spectrum of a voice, which remains
predominantly unchanged as speech is slightly varied,
turns out to be a more viable option. Converting all
the recordings into the frequency domain using the
discrete Fourier transform greatly simplifies the
process of comparing two recordings. [3][6]
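A minimal Python sketch of this step is given below. It evaluates the discrete Fourier transform directly from its definition and keeps only the magnitudes. The paper's implementation is in MATLAB, so the function name `dft_magnitudes`, the 64 Hz sampling rate and the 5 Hz test tone are assumptions chosen purely for illustration. The last lines show one reason the spectrum is convenient for comparison: a circular time shift of the signal leaves the magnitudes unchanged.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return |X[k]| for k = 0..N-1 using the direct DFT definition (O(N^2))."""
    n = len(samples)
    magnitudes = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        magnitudes.append(abs(acc))
    return magnitudes

# Toy signal: a 5 Hz sine sampled at 64 Hz for one second (assumed values).
rate = 64
signal = [math.sin(2 * math.pi * 5 * t / rate) for t in range(rate)]
spectrum = dft_magnitudes(signal)

# With a one-second window, bin k corresponds to k Hz.
peak_bin = max(range(rate // 2), key=lambda k: spectrum[k])
print(peak_bin)

# A circularly shifted copy of the signal has the same magnitude spectrum.
shifted = signal[3:] + signal[:3]
same = all(abs(a - b) < 1e-9 for a, b in zip(dft_magnitudes(shifted), spectrum))
print(same)
```

For real recordings a fast Fourier transform routine would normally replace the double loop; the direct form is used here only because it mirrors the definition.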
Finding the Norm: Due to the nature of human speech,
all the data pertaining to frequencies above 600 Hz can
safely be discarded. Therefore, once a recording is
converted into the frequency domain, it can simply be
regarded as a vector in 600-dimensional Euclidean
space. At this point, a comparison between two vectors
can easily be carried out by normalizing the vectors
(giving them length 1) and then computing the norm of
the difference between the two (the difference between
two vectors in $\mathbb{R}^{600}$ is computed by
subtracting component-wise). Unfortunately, exactly
which norm to use is not immediately clear. After
carefully comparing and contrasting the Taxicab,
Euclidean, and Maximum norms [13], it became clear that
the Euclidean norm most accurately measured the
closeness between different frequency spectra. Once the
norm function was chosen, all that remained was to
decide exactly how small the norm of the difference of
two vectors had to be in order to determine that both
recordings originated from the same person.
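The normalization and the three candidate norms can be sketched in Python as follows. The 3-dimensional vectors stand in for the 600-dimensional spectra, and all function names are hypothetical; this is an illustration of the definitions rather than the paper's code.

```python
import math

def normalize(v):
    """Scale a vector to Euclidean length 1 (a zero vector is returned unchanged)."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v] if length > 0 else list(v)

def difference(a, b):
    """Component-wise difference of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]

def taxicab_norm(v):
    """Sum of absolute components (L1 norm)."""
    return sum(abs(x) for x in v)

def euclidean_norm(v):
    """Square root of the sum of squared components (L2 norm)."""
    return math.sqrt(sum(x * x for x in v))

def maximum_norm(v):
    """Largest absolute component (L-infinity norm)."""
    return max(abs(x) for x in v)

# Toy spectra; real vectors would have one component per retained frequency.
a = normalize([3.0, 4.0, 0.0])
b = normalize([4.0, 3.0, 0.0])
d = difference(a, b)

print(taxicab_norm(d), euclidean_norm(d), maximum_norm(d))
```

For any vector the three values satisfy maximum norm ≤ Euclidean norm ≤ taxicab norm, so the choice of norm changes how strongly many small spectral differences count against one large difference.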
Chebyshev’s Inequality: Chebyshev’s inequality says
that at least $1 - 1/K^2$ of the data from a sample must
fall within K standard deviations of the mean, where K
is any real number greater than one. To illustrate the
inequality, consider a few values of K:
− For K = 2 we have $1 - 1/K^2 = 1 - 1/4 = 3/4 = 75\%$.
So Chebyshev’s inequality says that at least 75%
of the data values of any distribution must be
within two standard deviations of the mean.
− For K = 3 we have $1 - 1/K^2 = 1 - 1/9 = 8/9 \approx 89\%$.
So Chebyshev’s inequality says that at least 89%
of the data values of any distribution must be
within three standard deviations of the mean.
− For K = 4 we have $1 - 1/K^2 = 1 - 1/16 = 15/16 = 93.75\%$.
So Chebyshev’s inequality says that at least
93.75% of the data values of any distribution
must be within four standard deviations of the
mean. [13]
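The bound and an empirical check of it can be sketched in Python. The function names and the small skewed data set are assumptions made for this example; the check uses the population standard deviation, for which the inequality holds on any finite data set.

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1.0 - 1.0 / (k * k)

def fraction_within(data, k):
    """Observed fraction of values within k population standard deviations."""
    mean = statistics.mean(data)
    std = statistics.pstdev(data)
    inside = sum(1 for x in data if abs(x - mean) <= k * std)
    return inside / len(data)

# Hypothetical, deliberately skewed data set with one large outlier.
data = [1, 1, 1, 2, 2, 3, 5, 8, 13, 40]

for k in (2, 3, 4):
    print(k, chebyshev_lower_bound(k), fraction_within(data, k))
```

The observed fraction is never below the bound, but it is usually well above it; the inequality is a worst-case guarantee that assumes nothing about the shape of the distribution.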
Template Matching: The above analysis shows that, by
Chebyshev’s inequality, at least 3/4 of all
measurements from the same population fall within two
standard deviations of the mean. Hence, in response to
the problem posed at the end of the Finding the Norm
paragraph, the following solution can be formulated: by
requiring that the norm of the difference fall within
2 standard deviations of the normalized average
template voice, the algorithm is expected to recognize
the enrolled speaker’s voice correctly at least 75% of
the time.
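One possible reading of this decision rule is sketched below in Python. The paper does not spell out the exact computation, so several details are assumptions: the template is taken to be the component-wise average of the enrollment spectra, the statistics are computed over the enrollment-to-template distances, and a questioned sample is accepted when its distance is at most two standard deviations above the mean distance. The 4-dimensional vectors are toy stand-ins for the 600-dimensional spectra.

```python
import math
import statistics

def euclidean_distance(a, b):
    """Euclidean norm of the component-wise difference."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_template(enrollment_spectra):
    """Average the enrollment vectors component-wise to form the template."""
    n = len(enrollment_spectra)
    return [sum(column) / n for column in zip(*enrollment_spectra)]

def enrollment_stats(enrollment_spectra, template):
    """Mean and population standard deviation of enrollment-to-template distances."""
    distances = [euclidean_distance(s, template) for s in enrollment_spectra]
    return statistics.mean(distances), statistics.pstdev(distances)

def is_same_speaker(questioned, template, mean_d, std_d, k=2.0):
    """Accept when the distance is at most k standard deviations above the mean."""
    return euclidean_distance(questioned, template) <= mean_d + k * std_d

# Hypothetical enrollment spectra (the paper records 10 samples; 4 are used here).
enrollment = [
    [0.50, 0.50, 0.50, 0.50],
    [0.52, 0.48, 0.50, 0.50],
    [0.48, 0.52, 0.51, 0.49],
    [0.50, 0.50, 0.49, 0.51],
]
template = build_template(enrollment)
mean_d, std_d = enrollment_stats(enrollment, template)

print(is_same_speaker([0.51, 0.49, 0.50, 0.50], template, mean_d, std_d))
print(is_same_speaker([0.90, 0.10, 0.30, 0.20], template, mean_d, std_d))
```

A one-sided test is used in this sketch so that a sample lying unusually close to the template is not rejected; a two-sided test on the distance would be a different, equally plausible reading of the text.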
Figure 2: Detailed Design Description
III. RESULTS
With the voice recognition technique adopted, the
system is expected to recognize the enrolled speaker’s
voice 75% of the time.
Figure 3: Graph showing normalized frequency spectra of the
recorded questioned voice sample.
Figure 4: Graph showing normalized frequency spectra of the average
template voice sample.
A. Performance Evaluation Index
The well-accepted index for determining the
recognition rate of a voice recognition system is the
accuracy of endpoint detection algorithms using
zero-crossing rate (ZCR) and variable frame rate (VFR).
This technique uses a clean enrollment speech signal.
The signal is recorded for 2 seconds and the testing
speech is polluted by additive noise at different
decibel levels. The performance of the four endpoint
detection algorithms is plotted in the figures below.
Three varieties of additive noise (factory noise,
babble noise, and F-16 noise) were used for testing.
Tables 1-3 show the accuracy rates. The additive noise
was applied at SNR levels of 20 dB, 15 dB, 10 dB, 5 dB
and 0 dB. [15]
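Two ingredients of such a test, the zero-crossing rate of a frame and the scaling of a noise signal to a target signal-to-noise ratio, can be sketched in Python. This is a generic illustration and not the evaluation procedure of [15]: the function names, the synthetic sine "speech" and the alternating "noise" are all assumptions.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (needs >= 2 samples)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def power(x):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in x) / len(x)

def scale_noise_to_snr(signal, noise, snr_db):
    """Scale noise so that 10*log10(P_signal / P_noise) equals snr_db."""
    gain = math.sqrt(power(signal) / (power(noise) * 10 ** (snr_db / 10)))
    return [gain * v for v in noise]

# Synthetic stand-ins for a speech frame and an additive noise recording.
signal = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
noise = [0.3 if t % 2 == 0 else -0.3 for t in range(64)]

for snr_db in (20, 15, 10, 5, 0):
    scaled = scale_noise_to_snr(signal, noise, snr_db)
    noisy = [s + n for s, n in zip(signal, scaled)]
    print(snr_db, round(zero_crossing_rate(noisy), 3))
```

As the SNR falls, the noise contributes more sign changes, which is one reason endpoint detection based on zero crossings becomes less reliable at low SNR.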
Design steps (from Figure 2):
STEP 1: Voice Sample Recording
STEP 2: Voice Feature Extraction
STEP 3: Discrete Fourier Transform
STEP 4: Euclidean Norm
STEP 5: Template Matching
Figure 5: Factory Noise
Figure 6: Babble Noise
Figure 7: F-16 Noise
Table 1: Endpoint Detection Accuracy in % (Babble Noise)
        Clean   20dB   15dB   10dB   5dB    0dB
VFR     98.0    98.6   96.0   84.3   62.0   26.6

Table 2: Endpoint Detection Accuracy in % (Babble Noise)
        Clean   20dB   15dB   10dB   5dB    0dB
VFR     98.7    98.6   97.0   83.3   65.0   30.0

Table 3: Endpoint Detection Accuracy in % (F-16 Noise)
        Clean   20dB   15dB   10dB   5dB    0dB
VFR     98.7    98.6   97.0   82.0   69.6   14.0
The experimental results above were derived from
speech data collected from speakers using the different
voice recognition algorithms. Clean speech was
obtained when the effects of background noise and
channel distortion were minimized.
The experimental results, from a comparative
analysis of different voice recognition algorithms at
different noise levels, reveal that misclassification
can be caused by inaccurate endpoint detection rather
than by other possible errors. The accuracy of endpoint
detection is much higher for algorithms that
integrate both the time domain and the frequency domain.
This supports the view that a voice recognition system
using template matching remains the best algorithm for
recognizing an unknown voice.
IV. CONCLUSION
This research work is an effort to understand how
voice recognition is used as one of the best forms of
biometrics for recognizing a person’s identity. It
briefly describes all the stages, from voice recording,
voice feature extraction and the discrete Fourier
transform to template matching, which generates a good
percentage matching score. Various standard techniques
are used at the intermediate stages of the processing.
A low verification rate arises from the difficulty of
developing an algorithm to recognize the human voice,
since different data are obtained for voice samples
recorded on different occasions. New techniques and
highly effective algorithms that give better results
have been discovered.
Another major challenge is the inability of the
technique to recognize a word or phrase other than the
one stored in the database during enrollment. The
technique adopted only recognizes the human voice 70%
of the time. It is highly recommended that future
research focus on achieving a recognition rate of up to
95% and on recognizing different words and phrases.
V. REFERENCES
[1] Kinnunen, Tomi; Li, Haizhou. "An overview of
text-independent speaker recognition: From features to
supervectors". Speech Communication 52 (1): 12–40.
doi: 10.1016/j.specom.2009.08.009
[2] Homayoon Beigi, “Speaker Recognition”, Biometrics /
Book 1, Jucheng Yang (ed.), Intech Open Access
Publisher, 2011, pp. 3-28, ISBN 978-953-307-618-8.
doi: 10.1007/978-0-387-77592-0
[3] Duhamel, P. and M. Vetterli, "Fast Fourier
Transforms: A Tutorial Review and a State of the Art,"
Signal Processing, Vol. 19, April 1990, pp. 259-299.
doi: 10.1016/0165-1684(90)90158-U
[4] Oppenheim, A. V. and R. W. Schafer, “Discrete-Time
Signal Processing”, Prentice-Hall, 1989, p. 611.
[5] Oppenheim, A. V. and R. W. Schafer, Discrete-Time
Signal Processing, Prentice-Hall, 1989, p. 619.
[6] Rader, C. M., "Discrete Fourier Transforms when the
Number of Data Samples Is Prime," Proceedings of the
IEEE, Vol. 56, June 1968 (Current Version: June
2005), pp. 1107-1108. doi: 10.1109/PROC.1968.6477
[7] Oppenheim, A. V. and R.W. Schafer. Discrete-Time
Signal Processing, Englewood Cliffs, NJ: Prentice-
Hall, 1989, pp. 311-312.
[8] ITU-T Recommendation G.711, "Pulse Code
Modulation (PCM) of Voice Frequencies," General
Aspects of Digital Transmission Systems; Terminal
Equipments, International Telecommunication Union
(ITU), 1993.
[9] Beigi, Homayoon (2011). “Fundamentals of Speaker
Recognition”. [Online]. Available:
http://www.wikipedia.org/wiki/speaker_recognition
[10] Course project (Fall 2009), “Voice Recognition Using
MATLAB”, California State University Northridge.
[Online]. Available:
http://www.cnx.org/content/m33347/1.3/module_export?format=zip
[11] “Article on Human Voice” [Online]. Available:
http://www.wikipedia.org/wiki/Human_voice
[12] “Techniques of Voice Recognition System” [Online].
Available:
http://www.hitl.washington.edu/scllw/EVE/I.D.2.d.VoiceRecognition.htm
[13] “Probability Tutorials on Chebyshevs-Inequality”
[Online]. Available:
http://www.statistics.about.com/od/probHelpandTutorials/a/Chebyshevs-Inequality.htm
[14] Sangram Bana, Dr. Davinder Kaur, “Fingerprint
Recognition System using Image Segmentation”.
International Journal of Advanced Engineering
Sciences and technologies Vol No. 5, Issue No. 1, 012
– 023
[15] Kapil Sharma, H.P Sinha & R.K Aggarwal
“Comparative study of speech Recognition System
using various feature extraction techniques”.
International Journal of Information Technology and
Knowledge Management Vol 3, No2, pp. 695-698
How to cite
Luqman Gbadamosi, "Voice Recognition System using Template Matching". International Journal of Research in
Computer Science, 3 (5): pp. 13-17, September 2013. doi: 10.7815/ijorcs.35.2013.070