2. AGENDA
INTRODUCTION
VOICE RECOGNITION
TYPES OF VOICE RECOGNITION SYSTEMS
CORRELATION
PROGRAM
PROGRAM EXPLANATION
OUTPUTS
INFERENCE
REFERENCE
3. INTRODUCTION
Communication technology continues to evolve at a rapid
pace, and voice recognition is one part of that evolution.
It can safely be said that voice recognition is becoming a
feature of great importance in human life.
It is one of the basic components of security systems,
automated systems, voice assistant software, etc.
As technology advances, there is a need to create things
that do more with less. The present project fits this
description, performing voice recognition based on one
simple function, the correlation function.
4. VOICE RECOGNITION
Also referred to as speech recognition, voice recognition is a
computer software program or hardware device with the
ability to decode the human voice.
In a more technical way, voice or speech recognition can be
defined as an interdisciplinary subfield of computer science
and computational linguistics that develops methodologies
and technologies that enable the recognition and translation
of spoken language into text by computers.
It is also known as automatic speech recognition (ASR),
computer speech recognition or speech to text (STT). It
incorporates knowledge and research in the computer
science, linguistics and computer engineering fields.
5. TYPES OF VOICE RECOGNITION SYSTEMS
Speaker dependent system - The software must be trained
before it can be used, which requires the user to read a
series of words and phrases.
Speaker independent system - The software recognizes most
users' voices with no training.
Discrete speech recognition - The user must pause between
words so that the software can identify each word
separately.
Continuous speech recognition - The software can
understand speech at a normal speaking rate.
Natural language - The software can not only understand
the voice, but can also return answers to questions or other
queries that are asked.
6. CORRELATION
Correlation is a measure of similarity between two signals,
i.e., it indicates the degree to which a given signal
resembles another signal.
Correlation is a mathematical operation that is very similar to
convolution. Just as with convolution, correlation uses two
signals to produce a third signal. (Smith, 1998)
There are two types of correlation: autocorrelation and
cross-correlation.
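As a rough illustration of the two operations named above, the
following is a minimal Python sketch, not the MATLAB program
discussed in these slides. The function names cross_correlate
and autocorrelate are hypothetical, and the sketch assumes real
valued signals given as plain lists. Cross-correlation slides
one signal over the other and sums the products at each lag;
autocorrelation is the same operation applied to a signal and
itself.

```python
def cross_correlate(x, y):
    """Return the full cross-correlation of two real sequences.

    The result has len(x) + len(y) - 1 values, one per lag from
    -(len(y) - 1) up to len(x) - 1. The value for a given lag is
    the sum over n of x[n + lag] * y[n].
    """
    n_x, n_y = len(x), len(y)
    result = []
    for lag in range(-(n_y - 1), n_x):
        total = 0.0
        for n in range(n_y):
            i = n + lag
            if 0 <= i < n_x:  # skip terms that fall outside x
                total += x[i] * y[n]
        result.append(total)
    return result


def autocorrelate(x):
    """Correlate a signal with itself; the peak sits at zero lag."""
    return cross_correlate(x, x)


if __name__ == "__main__":
    print(autocorrelate([1, 2, 3]))
    print(cross_correlate([0, 0, 1, 2, 3], [1, 2, 3]))
```

The largest value of the result marks the lag at which the two
signals line up best, which is why the peak can serve as a
similarity score.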
11. PROGRAM EXPLANATION
To start running the code, the user has to enter the
command {speechrecognition('test.wav')} in the command
window.
Before pressing enter, the user should choose which one of
the test files to try (the audio files are named test, test2
and test3, with the first two granting access and the last
one denying access).
After choosing a file and pressing enter, the code will start
to run.
By using the command audioread, the code reads data from
the file named between the parentheses and returns the
sampled data. This gives the program the information it
needs to start an autocorrelation.
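The reading step can be sketched in Python as follows. This
is not the MATLAB code the slides describe: read_wav_mono and
signal_energy are hypothetical helpers, and the sketch assumes
a 16-bit PCM file, of which only the first channel is kept. The
autocorrelation of a signal peaks at zero lag, where it equals
the sum of the squared samples, so that peak can be computed
directly.

```python
import struct
import wave


def read_wav_mono(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate).

    Samples are floats scaled to the range [-1.0, 1.0); only the
    first channel is kept if the file has several.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # Little-endian signed 16-bit integers, channels interleaved.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    first_channel = values[::n_channels]
    return [v / 32768.0 for v in first_channel], sample_rate


def signal_energy(samples):
    """Autocorrelation at zero lag: the sum of squared samples."""
    return sum(s * s for s in samples)
```

A call such as read_wav_mono('test.wav') would then return the
samples and the sampling rate of the chosen test file.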
12. PROGRAM EXPLANATION
After doing so, the program will follow a similar process with
the 5 audio samples that are designed to have access. Instead
of an autocorrelation, it will perform a cross-correlation
between each of the 5 audio samples and the input signal.
Within this process, some other tasks are completed. The
program will extract the peak value of each of the
cross-correlations for future use and also plot the waveform
of each of the resulting correlations.
After that, a vector will be formed from the peak values of
the audio samples, and the maximum value among the peak
values will be extracted.
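The peak extraction described above might look like the
following Python sketch. It is an illustration under stated
assumptions rather than the program from the slides: the names
correlation_peak and peak_vector are hypothetical, the signals
are plain lists of numbers, and the plotting step is left out.

```python
def correlation_peak(x, y):
    """Largest cross-correlation value of x and y over all lags."""
    best = float("-inf")
    for lag in range(-(len(y) - 1), len(x)):
        total = 0.0
        for n, y_n in enumerate(y):
            i = n + lag
            if 0 <= i < len(x):  # skip terms that fall outside x
                total += x[i] * y_n
        best = max(best, total)
    return best


def peak_vector(input_signal, references):
    """One peak value per reference sample, in the same order."""
    return [correlation_peak(input_signal, ref) for ref in references]


if __name__ == "__main__":
    # Toy data standing in for the input and the stored samples.
    peaks = peak_vector([0, 0, 1, 2, 3], [[1, 2, 3], [3, 2, 1]])
    print(peaks, max(peaks))
```

The reference whose peak is largest is the one the input
resembles most under this measure.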
13. PROGRAM EXPLANATION
Following the above-mentioned process, the program will
enter its final stage, the comparison and decision stage.
The program will compare the maximum peak value from the
vector with the peak value of each of the
cross-correlations.
If there is a match with one of them, the program will grant
access and play a sound file announcing it, using the
command soundsc. If not, it will deny access and likewise
play a sound file announcing the denial.
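The decision stage can be sketched in Python as below. Since
the maximum of a vector always equals one of its own entries, a
denial needs some further criterion. This sketch assumes a
fixed threshold that the best peak must reach; the threshold,
its value and the name decide are assumptions made for
illustration and are not stated in these slides. Playing the
sound file is replaced by returning the decision.

```python
def decide(peaks, threshold):
    """Return (granted, index_of_best_reference).

    Access is granted only when the largest peak reaches the
    threshold (an assumed criterion). The index identifies the
    matching reference, or is None when access is denied.
    """
    best = max(peaks)
    if best < threshold:
        return False, None
    return True, peaks.index(best)


if __name__ == "__main__":
    print(decide([14.0, 12.0, 3.0], threshold=10.0))
    print(decide([1.0, 2.0], threshold=10.0))
```

In practice a suitable threshold would have to be found by
trying recordings that should and should not be accepted.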
19. INFERENCE
This assignment successfully introduced us to various
features, behaviors and characteristics of speech signals and
also dealt with the concept of cross-correlation.
In this assignment, an algorithm was created in MATLAB that
requires speech input signals in .wav format and compares
them with the test sound file using the correlation
technique.
Thus, the assignment concludes that, in order to remove the
limitation on audio formats, further study is required of the
various formats of speech signals that will be used for
communication with machines, which include the hardware
part and not the simulator.
20. REFERENCE
S. W. Smith (1998, March 1). The Scientist & Engineer's
Guide to Digital Signal Processing (1st edition), Chapter 7.
Retrieved November 8, 2020 from:
https://www.dspguide.com/ch7/3.htm
Voice Recognition. Retrieved November 8, 2020 from:
https://www.computerhope.com/jargon/v/voicreco.htm
Speech Recognition in MATLAB using correlation (2016,
February 11). Retrieved August 20, 2020 from:
https://www.youtube.com/watch?v=a4QHJmfp6q0&t=498s