3. Speech recognition (also known as automatic speech recognition or computer speech recognition) converts spoken words to text. The term "voice recognition" is sometimes used to refer to speech recognition where the recognition system is trained to a particular speaker.
10. Speech recognition techniques
Analysis techniques are similar for speech and speaker recognition. The following are common techniques in speech recognition (a short cepstral-analysis sketch follows this list):
- Model evaluation
- Text dependence
- Stochastic models
- Vector quantization
- Cepstral analysis (high recognition accuracy)
- Orthogonal LPC parameters
- Neural network approaches
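As a concrete illustration of one item above, here is a minimal sketch of cepstral analysis on a single windowed speech frame, using only numpy. The function name and toy signal are illustrative assumptions; practical recognizers typically compute mel-frequency cepstral coefficients (MFCCs), which add a mel filterbank step that this sketch omits.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum of one speech frame: IFFT of the log magnitude spectrum."""
    spectrum = np.fft.rfft(frame * np.hamming(len(frame)))  # windowed FFT
    log_magnitude = np.log(np.abs(spectrum) + 1e-10)        # log spectrum
    return np.fft.irfft(log_magnitude)                      # back to "quefrency" domain

# Toy frame: a 100 Hz "voiced" tone sampled at 8 kHz (illustrative values)
t = np.arange(256) / 8000.0
frame = np.sin(2 * np.pi * 100 * t)
cepstrum = real_cepstrum(frame)
print(cepstrum[:5])  # low-order coefficients describe the spectral envelope
```

The low-order cepstral coefficients capture the smooth spectral envelope (vocal tract shape), which is why they make robust features for recognition.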
11. Speech Recognition Architecture
The noisy channel model of individual words
The noisy channel model applied to an entire sentence
12. Speech Recognition Architecture
The goal of the probabilistic noisy channel architecture for speech recognition can be summarized as follows: What is the most likely sentence out of all sentences in the language L, given some acoustic input O?
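This question is conventionally formalized with Bayes' rule: the decoder searches for the sentence that maximizes the posterior probability of the word sequence W given the acoustic observations O. The posterior factors into an acoustic model P(O | W) and a language model P(W); P(O) is constant over candidate sentences and can be dropped:

```latex
\hat{W} = \arg\max_{W \in L} P(W \mid O)
        = \arg\max_{W \in L} \frac{P(O \mid W)\,P(W)}{P(O)}
        = \arg\max_{W \in L} P(O \mid W)\,P(W)
```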
14. Overview of HMMs
Markov chains are used to model pronunciation. The forward algorithm computes the likelihood of phone sequences. The real input is not symbolic: it consists of spectral features, and input symbols do not correspond one-to-one to machine states. Note: HMMs are used in speech recognition because a speech signal can be viewed as a piecewise stationary, or short-time stationary, signal. (A forward-algorithm sketch follows below.)
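Below is a minimal sketch of the forward algorithm for a discrete-emission HMM. As the slide notes, real recognizers replace the symbolic emission table with distributions over continuous spectral features (e.g., Gaussian mixtures over cepstral coefficients), so this is a simplification; the variable names and toy parameters are illustrative.

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward algorithm: likelihood of an observation sequence under an HMM.

    pi  : (N,)   initial state probabilities
    A   : (N, N) transition probabilities, A[i, j] = P(state j | state i)
    B   : (N, M) emission probabilities,  B[i, k] = P(symbol k | state i)
    obs : list   observation sequence as symbol indices
    """
    alpha = pi * B[:, obs[0]]               # initialize with the first observation
    for t in range(1, len(obs)):
        # alpha[j] = sum_i alpha_prev[i] * A[i, j], scaled by the emission prob
        alpha = (alpha @ A) * B[:, obs[t]]
    return alpha.sum()                      # total likelihood P(O | model)

# Toy 2-state HMM over a 2-symbol alphabet (values are illustrative)
pi = np.array([0.6, 0.4])
A  = np.array([[0.7, 0.3],
               [0.4, 0.6]])
B  = np.array([[0.9, 0.1],
               [0.2, 0.8]])
print(forward(pi, A, B, [0, 1, 0]))         # P(observing the sequence 0, 1, 0)
```

Summing over all state paths in this dynamic-programming fashion is what lets the recognizer score a phone sequence in time linear in the sequence length.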
15. Speech Recognition Requirements
To use speech recognition, you need the following:
- A high-quality close-talk (headset) microphone with gain-adjustment support; gain adjustment is a microphone feature that amplifies your input so it is made louder for use by the system. A universal serial bus (USB) microphone is recommended.
- A 400 megahertz (MHz) or faster computer
- 128 MB or more of memory
- Windows 2000 with Service Pack 3, or Windows XP or later
- Microsoft Internet Explorer 5.01 or later
17. Automatic Speech Recognition System for Home Appliances Control
Abstract - In the present work we study the performance of a speech recognizer for the Greek language in a smart-home environment. This recognizer operates in spoken-interaction scenarios where users can control various home appliances. In contrast to command-and-control systems, in our application the users speak spontaneously rather than using a standardized set of isolated commands. The operational performance was tested under various environmental conditions and for two different types of microphones.
19. Dialogue systems
Dialogue systems play a key role in any kind of conversational spoken language interface. Intelligent interfaces for home appliances provide the means for facilitating the operation of these devices within a dialogue system. Various systems for home appliance control have been reported in the literature, focusing on enhancing the performance of the speech recognition process.
21. Architecture Explanation
The audio signal from the user is captured and passed through a speech recognition module that produces a recognition hypothesis. This recognition hypothesis is then forwarded to a language understanding component that creates a corresponding semantic representation. This semantic input is then passed to the dialog manager, which, based on the current input and discourse context, produces the next system action (typically in the form of a semantic output). A language generation module then produces the corresponding surface (textual) form, which is subsequently passed to a speech synthesis module and rendered as audio back to the user.
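The following sketch mirrors the pipeline just described. Every function here (recognize, understand, decide, generate, synthesize) is a hypothetical stub standing in for a real module, not an actual library API; the point is the data flow from audio in to audio out.

```python
def recognize(audio):                 # speech recognition module (stub)
    return "turn on the lights"      # recognition hypothesis

def understand(hypothesis):          # language understanding (stub)
    return {"intent": "switch_on", "device": "lights"}

def decide(semantics, context):      # dialog manager: next system action (stub)
    return {"act": "confirm", "device": semantics["device"]}

def generate(action):                # language generation: surface form (stub)
    return f"Turning on the {action['device']}."

def synthesize(text):                # speech synthesis (stub)
    return f"<audio: {text}>"

def dialogue_turn(audio_in, context):
    """One full turn: audio in -> hypothesis -> semantics -> action -> audio out."""
    hypothesis = recognize(audio_in)
    semantics = understand(hypothesis)
    action = decide(semantics, context)
    context.append((semantics, action))   # update the discourse context
    return synthesize(generate(action))

print(dialogue_turn("<audio>", []))
```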
22. Function of dialog manager
The dialog manager therefore plays a key control role in any conversational spoken language interface: given the decoded semantic input corresponding to the current user utterance and the current discourse context, it determines the next system action. In essence, the dialog manager is responsible for planning and maintaining the coherence of the conversation over time.
23. Steps a dialogue manager performs
First, the dialog manager must maintain a history of the discourse and use it to interpret the perceived semantic inputs in the current context. Second, a representation - either explicit or implicit - of the system task is typically required. The current semantic input, together with the current dialog state and information about the task to be performed, is then used to determine the next system action. (A sketch of these steps follows below.)
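Here is a minimal sketch of these three steps for a home-appliance task. The class, slot names, and actions are illustrative assumptions (a simple slot-filling manager), not the design from the cited work: it keeps a discourse history, holds an explicit task representation, and picks the next system action from the current input plus state.

```python
class DialogManager:
    def __init__(self, required_slots):
        self.history = []                              # step 1: discourse history
        self.task = {s: None for s in required_slots}  # step 2: explicit task representation

    def next_action(self, semantic_input):
        self.history.append(semantic_input)
        # Interpret the input in context: fill any task slots it mentions
        for slot, value in semantic_input.items():
            if slot in self.task:
                self.task[slot] = value
        # Step 3: choose the next system action from input + state + task
        missing = [s for s, v in self.task.items() if v is None]
        if missing:
            return {"act": "request", "slot": missing[0]}
        return {"act": "execute", "command": dict(self.task)}

dm = DialogManager(["device", "operation"])
print(dm.next_action({"device": "tv"}))            # asks for the missing operation
print(dm.next_action({"operation": "switch_on"}))  # all slots filled: executes the command
```

Keeping the task representation explicit, as here, makes it easy to decide whether the system should ask a follow-up question or act, which is the core of maintaining a coherent dialog.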