VOICE COMMAND DEVICE USING SPEECH RECOGNITION
A voice command program (VCP) is controlled by means of the human voice. By removing the need to press buttons, it lets users operate their system while their hands are full or while they are doing other tasks. It is also capable of responding to several commands at once and can understand around 50 different commands. VCPs can be found in computer operating systems, commercial software for computers, mobile phones, cars, call centers, and internet search engines such as Google.
Voice command devices are becoming more widely available, and innovative ways of using the human voice are constantly being developed.
Speech recognition systems can be categorized by several parameters:
1. SPEAKER INDEPENDENCE
Some SR systems are "speaker independent", while others use "training", in which an individual speaker reads sections of text into the SR system. These systems analyze the person's specific voice and use it to fine-tune recognition of that person's speech, resulting in more accurate transcription. Systems that do not use training are called "speaker independent" systems; systems that use training are called "speaker dependent" systems.
2. ISOLATED VS. CONTINUOUS SPEECH
An isolated-word (discrete) speech recognition system requires the speaker to pause briefly between words, whereas a continuous speech recognition system does not.
3. SPONTANEOUS SPEECH
Spontaneous speech contains disfluencies, pauses and restarts, and is much more difficult to recognise than speech read from a script.
There are various algorithms used for speech recognition:
1. HMM: A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. It is closely related to earlier work on the optimal nonlinear filtering problem (stochastic processes). Hidden Markov models are especially known for their application in temporal pattern recognition, such as speech, handwriting and gesture recognition, part-of-speech tagging, musical score following, partial discharges and bioinformatics.
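To make the HMM idea concrete, the sketch below decodes a toy model with the Viterbi algorithm. The states, observations and probabilities are invented for illustration only and are not taken from the project's acoustic models.

```python
# Viterbi decoding over a toy HMM (hypothetical states and probabilities).
states = ("S1", "S2")                    # hidden, phone-like states
observations = ("low", "high", "high")   # observed acoustic symbols

start_p = {"S1": 0.6, "S2": 0.4}
trans_p = {"S1": {"S1": 0.7, "S2": 0.3},
           "S2": {"S1": 0.4, "S2": 0.6}}
emit_p  = {"S1": {"low": 0.8, "high": 0.2},
           "S2": {"low": 0.3, "high": 0.7}}

def viterbi(obs):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # V[t][s] = (best probability of any path ending in state s at time t, that path)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            V[t][s] = max(
                (V[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]],
                 V[t - 1][prev][1] + [s])
                for prev in states
            )
    return max(V[-1].values())

print(viterbi(observations))   # approximately (0.0423, ['S1', 'S2', 'S2'])
```

In a real recognizer the hidden states correspond to phone models and the observations to acoustic feature vectors, but the dynamic-programming idea is the same.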
2. ARTIFICIAL NEURAL NETWORKS: The inspiration for neural networks came from the study of central nervous systems. In an artificial neural network, simple artificial nodes, called "neurons", "neurodes", "processing elements" or "units", are connected together to form a network that mimics a biological neural network. There is no single formal definition of an artificial neural network. Commonly, though, a class of statistical models is called "neural" if it consists of sets of adaptive weights, i.e. numerical parameters tuned by a learning algorithm, and is capable of approximating non-linear functions of its inputs. The adaptive weights are conceptually connection strengths between neurons, which are activated during training and prediction.
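As a rough illustration of "adaptive weights tuned by a learning algorithm", here is a tiny two-layer network trained by gradient descent on synthetic data. It is only a sketch of the principle, not the recognizer used in this project.

```python
import numpy as np

# Tiny feed-forward network on made-up data (a real recognizer would use
# acoustic feature frames, not random vectors).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                        # 200 "frames", 4 features each
y = (X[:, 0] + X[:, 1] > 0).astype(float)[:, None]   # synthetic binary label

W1 = rng.normal(scale=0.5, size=(4, 8))              # adaptive weights, layer 1
W2 = rng.normal(scale=0.5, size=(8, 1))              # adaptive weights, layer 2
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for epoch in range(500):
    h = sigmoid(X @ W1)              # hidden activations
    p = sigmoid(h @ W2)              # predicted probability
    grad_out = (p - y) / len(X)      # gradient of cross-entropy loss
    grad_h = (grad_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ grad_out)      # tune the weights...
    W1 -= lr * (X.T @ grad_h)        # ...this is the "learning algorithm"

print("training accuracy:", float(((p > 0.5) == y).mean()))
```

The weights W1 and W2 play the role of the connection strengths described above: they start random and are adjusted until the network approximates the desired input-output mapping.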
Project Perspective
The purpose of a voice command application is manifold:
• It increases productivity by facilitating multitasking.
• It can help people who have trouble using their hands.
• It can help people who have cognitive disabilities.
• It is often more convenient to speak than to type.
We have extended this application by enabling dynamic voice commands, and it can provide complete control of the system.
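As a rough sketch of what "dynamic voice commands" can mean in practice, the snippet below keeps a command table that maps recognized phrases to actions and can be extended at run time. The phrases and actions are placeholders, not the project's actual command set.

```python
import subprocess

# Hypothetical command table: recognized phrase -> action to perform.
commands = {
    "open notepad": lambda: subprocess.Popen(["notepad.exe"]),   # Windows example
    "say hello":    lambda: print("Hello!"),
}

def add_command(phrase, action):
    """Register a new voice command while the program is running."""
    commands[phrase.lower()] = action

def dispatch(recognized_text):
    """Look up the recognized phrase and run the matching action."""
    action = commands.get(recognized_text.lower().strip())
    if action is None:
        print(f"Unknown command: {recognized_text!r}")
    else:
        action()

add_command("lock screen", lambda: print("(would lock the workstation here)"))
dispatch("Open Notepad")
dispatch("lock screen")
```

Because the table is ordinary data, new commands can be registered without changing the recognition code itself.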
MICROSOFT SAPI
The Speech Application Programming Interface (SAPI) is an API developed by Microsoft to allow the use of speech recognition and speech synthesis within Windows applications. To date, a number of versions of the API have been released, shipping either as part of a Speech SDK or as part of the Windows OS itself. Applications that use SAPI include Microsoft Office, Microsoft Agent and Microsoft Speech Server.
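As a minimal sketch of driving SAPI from script code, the snippet below uses SAPI's COM automation interface for speech synthesis. It assumes Windows with the pywin32 package installed and is not the project's own code, which accessed SAPI from its Windows application.

```python
# Speech synthesis through SAPI's SpVoice COM object (Windows only).
import win32com.client

voice = win32com.client.Dispatch("SAPI.SpVoice")   # SAPI text-to-speech object
voice.Speak("Voice command device ready.")

# The recognition side of SAPI is exposed through further COM objects
# (recognizer and grammar interfaces) and event callbacks, omitted here.
```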
RESULTS
KEY ISSUES IN SPEECH RECOGNITION
1. Noise
Speech is uttered in an environment of sounds: a clock ticking, a computer humming, a radio playing somewhere down the corridor, another human speaker in the background, and so on. This is usually called noise, i.e. unwanted information in the speech signal. In ASR we have to identify these noises and filter them out of the speech signal. Another kind of noise is the echo effect, in which the speech signal is reflected off surrounding surfaces and reaches the microphone again after a delay.
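One deliberately simple way to suppress steady background noise, shown purely as an illustration, is an energy-based gate that zeroes frames whose energy stays near an estimated noise floor. Real ASR front ends use far more sophisticated techniques such as spectral subtraction or adaptive filtering; the signal and thresholds below are invented.

```python
import numpy as np

def energy_gate(signal, frame_len=256, noise_frames=10, factor=3.0):
    """Zero out frames whose energy is close to the estimated noise floor.

    Crude illustration: the first `noise_frames` frames are assumed to
    contain only background noise.
    """
    n = len(signal) // frame_len
    frames = signal[: n * frame_len].astype(float).reshape(n, frame_len)
    energy = (frames ** 2).mean(axis=1)
    noise_floor = energy[:noise_frames].mean()
    frames[energy < factor * noise_floor] = 0.0      # gate low-energy frames
    return frames.reshape(-1)

# Synthetic example: a quiet hum everywhere, a louder "speech" burst in the middle.
rng = np.random.default_rng(1)
sig = 0.01 * rng.normal(size=8000)
sig[3000:5000] += 0.5 * np.sin(np.linspace(0, 200 * np.pi, 2000))
cleaned = energy_gate(sig)
print("non-zero samples after gating:", int((cleaned != 0).sum()))
```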
2. Continuous speech
Speech has no natural pauses at word boundaries; pauses mainly appear at a syntactic level, such as after a phrase or a sentence. This introduces a difficult problem for speech recognition: how should we translate a waveform into a sequence of words? After a first stage of recognition into phones and phone categories, we have to group them into words. Even if we disregard word-boundary ambiguity, this grouping remains a hard search problem, as the toy example below illustrates.
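The toy example below shows why segmentation is ambiguous even with a perfect symbol string: given a small lexicon, more than one word sequence can explain the same unbroken input. The lexicon and input are invented for illustration.

```python
# Word-boundary ambiguity: one unbroken string, several valid segmentations.
LEXICON = {"a", "an", "i", "ice", "nice", "cream", "scream"}

def segmentations(s):
    """Return every way of splitting s into words from LEXICON."""
    if not s:
        return [[]]
    results = []
    for i in range(1, len(s) + 1):
        word = s[:i]
        if word in LEXICON:
            results += [[word] + rest for rest in segmentations(s[i:])]
    return results

print(segmentations("anicecream"))
# [['a', 'nice', 'cream'], ['an', 'ice', 'cream']]
```

A real recognizer resolves this kind of ambiguity with a language model that scores the competing word sequences.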
