Voice recognition system

Voice Recognition Using MATLAB
Presented by: Avienash Raibole, Paresh Meshram, Vinayak Kolpek

 INTRODUCTION
• The purpose of our project is to implement an efficient voice recognition algorithm using MATLAB.
• Voice recognition is the process of converting an acoustic signal, captured by a microphone or a telephone, into a set of words.
• The recognised words can be an end in themselves, as in applications such as command and control, data entry, and document preparation.
• They can also serve as the input to further linguistic processing in order to achieve speech understanding.

 What we can do with Voice Recognition
• Transcription
– dictation, information retrieval
• Command and control
– data entry, device control, navigation, call routing
• Information access
– airline schedules, stock quotes, directory
assistance
• Problem solving
– travel planning, logistics

 Speaker Recognition methods
Text Dependent:
The speaker's identity is determined from his or her speaking one or more specific phrases.
Text Independent:
Speaker models capture characteristics of a person's speech that show up irrespective of what is being said.

 BLOCK DIAGRAM
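Continuous speech -> Frame Blocking -> Windowing -> FFT -> Mel-frequency wrapping -> Cepstrum -> mel cepstrum
Intermediate outputs: frame (after Frame Blocking), spectrum (after FFT), mel spectrum (after Mel-frequency wrapping), mel cepstrum (after Cepstrum).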
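The first three blocks of the diagram (frame blocking, windowing, FFT) can be sketched as follows. This is a minimal illustration in plain Python rather than the project's MATLAB code. The frame length, hop size, sampling rate, Hamming window, and the synthetic test tone are all assumptions made for the example, and a naive DFT stands in for the FFT.

```python
import cmath
import math

def frame_blocking(signal, frame_len, hop):
    """Split a signal into overlapping frames of frame_len samples, hop samples apart."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def hamming(n):
    """Hamming window of length n (an assumed window choice)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def magnitude_spectrum(frame):
    """Magnitude of the DFT of one frame (naive O(n^2) DFT, bins 0..n/2)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

# Synthetic stand-in for recorded speech: a 1 kHz tone sampled at 8 kHz (assumed values).
fs = 8000
signal = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(256)]

frames = frame_blocking(signal, frame_len=64, hop=32)
window = hamming(64)
spectra = [magnitude_spectrum([s * w for s, w in zip(f, window)]) for f in frames]

# Locate the strongest frequency bin of the first frame.
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
peak_hz = peak_bin * fs / 64
```

Each entry of `spectra` is the magnitude spectrum of one windowed frame; the later stages (mel-frequency wrapping and cepstrum) would operate on these.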

 Feature Extraction
• Feature extraction reduces the voice signal to a small amount of data that can later be used to represent each speaker.
• A wide range of possibilities exists for parametrically representing the speech signal for the speaker recognition task, such as
a) Linear Predictive Coding (LPC),
b) Mel-Frequency Cepstral Coefficients (MFCC), and others.

 MFCC
• MFCC is based on the known variation of the human ear's critical bandwidths with frequency: filters are spaced linearly at low frequencies and logarithmically at high frequencies.
• To capture the phonetically important characteristics of speech, the signal is expressed on the mel frequency scale.
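The mel-scale spacing can be illustrated with a short sketch. The slides do not give a formula, so the conversion below uses one commonly quoted approximation, mel = 2595 log10(1 + f/700); the filter count and frequency range are also assumptions chosen for the example.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (one common approximation, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_centers(n_filters, f_low, f_high):
    """Centre frequencies (Hz) of n_filters filters equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

# 20 filters between 0 Hz and 4 kHz (assumed values).
centers = mel_filter_centers(n_filters=20, f_low=0.0, f_high=4000.0)

# Spacing between neighbouring centres, in Hz.
gaps = [b - a for a, b in zip(centers, centers[1:])]
```

Because the centres are equally spaced in mels, the values in `gaps` grow with frequency: the filters sit close together at low frequencies and spread out at high frequencies.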

 SIMPLE REPRESENTATION OF MFCC

 How does it work?
Record a voice -> Extract feature vectors
• Record voice command (Time domain).
• Transform into frequency domain using
Fourier Transform and get the magnitude
spectrum.
• Compare spectrum of voice commands.
Digitized Speech Signal (.wav file) -> Acoustic Preprocessing (DFT + MFCC) -> Speech Recognizer (Dynamic Time Warping)
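The Dynamic Time Warping recognizer named above could look roughly like the sketch below. It is a textbook DTW in plain Python, not the project's MATLAB code; the function names, the Euclidean frame distance, and the toy two-dimensional "feature vectors" are assumptions made for illustration (real input would be MFCC vectors).

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Cumulative cost of the best time alignment between two feature sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(features, templates):
    """Return the template word whose sequence is closest to features under DTW."""
    return min(templates, key=lambda word: dtw_distance(features, templates[word]))

# Toy stored templates, one feature sequence per command word (hypothetical data).
templates = {
    "on": [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]],
    "off": [[0.0, 1.0], [0.0, 2.0], [0.0, 3.0]],
}

# The same "on" pattern spoken more slowly: some frames are repeated.
spoken = [[1.0, 0.0], [1.0, 0.0], [2.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
best = recognize(spoken, templates)
```

DTW is useful here because two utterances of the same word rarely have the same duration; the warping lets repeated or skipped frames align without adding cost.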

 Applications
• Device control.
• Hands-free mobile phone use in a car.
• Single-purpose command and control systems.
• Voice verification.
• Many more.

 Advantages
• The model trains much faster than other methods.
• It can reduce large datasets to a smaller number of codebook vectors.
• It is easy to implement and more accurate.
• Speech is a very natural way to interact, and it is not necessary to sit at a keyboard or work with a remote control.
• Users need no training to operate the system.
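The reduction of many feature vectors to a few codebook vectors can be sketched as follows. The slides do not say which clustering algorithm was used, so this example assumes a plain k-means style update with a naive initialisation; the function names and the toy data are hypothetical.

```python
def nearest(vec, codebook):
    """Index of the codebook vector closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])))

def train_codebook(vectors, size, iterations=10):
    """Reduce vectors to `size` codebook vectors with k-means style updates."""
    # Naive initialisation (an assumption): start from the first `size` vectors.
    codebook = [list(v) for v in vectors[:size]]
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for v in vectors:
            clusters[nearest(v, codebook)].append(v)
        for k, members in enumerate(clusters):
            if members:  # leave an empty cluster's codeword unchanged
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

# Six toy feature vectors forming two obvious groups (hypothetical data).
vectors = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9], [0.1, 0.2], [4.9, 5.0]]
codebook = train_codebook(vectors, size=2)
```

Here six vectors are summarised by two codebook vectors, one near each group; a speaker model built this way stores only the codebook rather than every training vector.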

 Limitations
• The number of words our program could recognize was limited: the more words we tried adding, the less accurate it became.
• The voice recognition program only works for the person whose voice it was trained on.
• The program is less accurate in noisy environments.
• Voice recognition works best if the microphone is close to the user.

 Future Of Voice Recognition
• Better rejection of extraneous speech.
• Better recognition of embedded commands.
• Better efficiency on low cost processors.
• Standards for performance evaluation.
• Increased portability.
• Lower error rates.
• Improved overall robustness.

 Research Articles on Speech Recognition
• Koester, H.H. (2006). Factors that Influence the Performance of Experienced Speech Recognition Users. Assistive Technology, 18(1): 56-76.
• Koester, H.H. (2004). Usage, Performance, and Satisfaction Outcomes for Experienced Users of Speech Recognition. Journal of Rehabilitation Research and Development, 41(5): 739-754.
• Koester, H.H. (2003). Abandonment of Speech Recognition Systems by New Users. Proceedings of RESNA 2003 Annual Conference, Atlanta, GA. Arlington, VA: RESNA Press.
• Koester, H.H. (2002). User Performance with Speech Recognition Systems: A Literature Review. Assistive Technology, 13(2): 116-130.
