In this presentation, Santosh Raj describes a novel speech recognition system that can be used to control home appliances. It could also be integrated with various IoT devices.
2. Speaker identification is the process of automatically
recognizing who is speaking based on unique characteristics
contained in the speech signal. This technique makes it
possible to use the speaker's voice and the spoken word to
verify their identity and control access.
4. 1. Easier way to control home
appliances (controlling appliances by voice
command instead of a remote or switches is
much more convenient).
2. The problem of forgotten passwords is
resolved by the aforementioned system (elderly
people often forget their passwords, so
this should be a useful tool for them).
5. 1950s and 1960s- Bell Laboratories
designed the Audrey system in 1952, which
recognized digits spoken by a single voice.
1970s- Carnegie Mellon’s “Harpy” speech
understanding system, which could understand
1011 words.
1980s- Speech recognition turned towards
statistical prediction (hidden Markov models).
2011- Apple introduced the Siri voice recognition
system, which remains a benchmark today.
6.
7. 1. The input signal must be noise free.
2. Only the voiced signal should be considered for
further processing.
3. The voiced signal is distinguished from the unvoiced
signal by an amplitude threshold.
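The amplitude-threshold step can be illustrated with a minimal sketch. The frame length, the threshold value, and the function name `voiced_frames` are assumptions made for illustration; the slides do not state the values used in the project.

```python
def voiced_frames(samples, frame_len=160, threshold=0.1):
    """Return (start, end) index pairs of frames whose peak amplitude
    reaches the threshold. frame_len and threshold are assumed values."""
    kept = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        peak = max(abs(s) for s in frame)
        if peak >= threshold:
            kept.append((start, start + frame_len))
    return kept


# Synthetic example: quiet frame, loud frame, quiet frame.
signal = [0.01] * 160 + [0.5, -0.4] * 80 + [0.02] * 160
segments = voiced_frames(signal)  # only the middle frame is kept
```

A real front end would probably combine this with short-time energy or zero-crossing rate, since a plain peak threshold is sensitive to clicks and background noise.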
8. Linear predictive coding (LPC)- A feature
extraction technique that represents a
speech signal as a low data
rate discrete signal.
• Voiced/unvoiced (1 byte)
• Pitch (6 bytes)
• Voiced signal amplitude (11 bytes)
• Unvoiced amplitude (5 bytes)
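The slide does not say how the LPC coefficients were computed. One standard approach, assumed here, is the autocorrelation method solved with the Levinson-Durbin recursion. The sketch below predicts each sample from the previous `order` samples; the function names, the order, and the test signal are illustrative only.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation of one frame for lags 0..max_lag."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Levinson-Durbin recursion (assumed method, not confirmed by the slides).
    Returns (a, err) where x[n] is predicted as sum(a[k] * x[n - 1 - k])
    and err is the residual prediction error energy."""
    r = autocorrelation(frame, order)
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0:  # silent or degenerate frame: stop early
            break
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Synthetic decaying signal where each sample is 0.9 times the previous one.
signal = [0.9 ** n for n in range(200)]
coeffs, residual = lpc(signal, order=2)
```

For this synthetic signal the first coefficient comes out close to 0.9 and the second close to zero, which matches how the signal was generated.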
9. GAUSSIAN MIXTURE MODELS (GMM)
• A Gaussian Mixture Model (GMM) is a parametric probability
density function represented as a weighted sum of Gaussian
component densities.
• GMMs are commonly used as a parametric model of the
probability distribution of continuous measurements or
features in a biometric system, such as vocal-tract related
spectral features in a speaker recognition system.
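Written out, the weighted sum described above takes the following standard form. The symbols are notation chosen here, not taken from the slides: M is the number of components, D the feature dimension, w_i the mixture weights, and mu_i and Sigma_i the mean vector and covariance matrix of component i.

```latex
\[
p(\mathbf{x} \mid \lambda) = \sum_{i=1}^{M} w_i \, g(\mathbf{x} \mid \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i),
\qquad \sum_{i=1}^{M} w_i = 1,
\]
\[
g(\mathbf{x} \mid \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i) =
\frac{1}{(2\pi)^{D/2} \, \lvert \boldsymbol{\Sigma}_i \rvert^{1/2}}
\exp\!\left( -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^{\top}
\boldsymbol{\Sigma}_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) \right).
\]
```

The full parameter set of one speaker model is then the collection of weights, means, and covariances for all M components.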
10. 1. Extract features from the unknown voice.
2. Calculate the GMM parameters.
3. Compare these with the existing model.
4. Identify the speaker based on the matching percentage.
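The comparison step can be sketched as follows. Note the swapped-in technique: instead of comparing fitted GMM parameters directly, as the steps above describe, this sketch uses the common alternative of scoring the unknown feature frames against each enrolled speaker's GMM and picking the highest average log-likelihood. It also assumes diagonal covariances and already-trained models. All names and parameter values are hypothetical.

```python
import math


def gmm_log_density(x, model):
    """log p(x) for a diagonal-covariance GMM given as
    (weights, means, variances). Diagonal covariance is an assumption."""
    weights, means, variances = model
    terms = []
    for w, mean, var in zip(weights, means, variances):
        log_g = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, mean, var)
        )
        terms.append(math.log(w) + log_g)
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def identify_speaker(frames, models):
    """Return (best_speaker, scores); each score is the average
    per-frame log-likelihood under that speaker's model."""
    scores = {
        name: sum(gmm_log_density(f, model) for f in frames) / len(frames)
        for name, model in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical enrolled models with made-up parameters
# (2-D features, 2 mixture components per speaker).
models = {
    "speaker_a": ([0.6, 0.4], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [0.5, 0.5]]),
    "speaker_b": ([0.5, 0.5], [[4.0, 4.0], [5.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
unknown = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.3]]  # made-up feature frames
best, scores = identify_speaker(unknown, models)
```

In practice the feature frames would come from the LPC stage and the models would be fitted to each speaker's training utterances, for example with the EM algorithm.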
11. We were successful in assembling the blocks of
our project.
A basic speech recognition system which
recognises a digit from 0 to 9 was thus
developed.
The system was trained in an environment with
minimum ambient noise.
The system gives the most accurate results
when implemented in the environment where it
was trained.
12. We took samples of 50 speakers, each
uttering the digits 0-9 20 times, from a
reliable database.
The unknown voice was tested successfully against
our existing model.
The identification rate is approx. 90%.
13. We intend to design a home automation
system based on commands
issued by the authorised speaker. This will
enable us to control home appliances more
conveniently.