Speech Recognition
Speech recognition (also known as automatic speech recognition or computer speech recognition) converts spoken words to text. The term "voice recognition" is sometimes used to refer to speech recognition where the recognition system is trained to a particular speaker.
Applications of Speech Recognition
Speech recognition applications include:
  • Voice dialing (e.g., "Call home")
  • Call routing (e.g., "I would like to make a collect call")
  • Simple data entry (e.g., entering a credit card number)
  • Preparation of structured documents (e.g., a radiology report)
  • Speech-to-text processing (e.g., word processors or emails)
  • Aircraft cockpits (usually termed Direct Voice Input)
  • Automatic translation
  • Automotive speech recognition (e.g., Ford Sync)
  • Telematics (e.g., vehicle navigation systems)
  • Court reporting (realtime voice writing)
  • Hands-free computing: voice command recognition computer user interface
  • Home automation
  • Interactive voice response
  • Mobile telephony, including mobile email
  • Multimodal interaction
  • Pronunciation evaluation in computer-aided language learning applications
  • Robotics
  • Video games, with Tom Clancy's EndWar and Lifeline as working examples
  • Transcription (digital speech-to-text)
  • Speech-to-text (transcription of speech into mobile text messages)
  • Air traffic control speech recognition
Speech Recognition Techniques
Analysis techniques are similar for speech and speaker recognition. The following are techniques in speech recognition (SR):
  • Model evaluation
  • Text dependence
  • Stochastic models
  • Vector quantization
  • Cepstral analysis (high recognition accuracy)
  • Orthogonal LPC parameters
  • Neural network approaches
Speech Recognition Architecture
  • The noisy channel model of individual words
  • The noisy channel model applied to an entire sentence
Speech Recognition Architecture
The goal of the probabilistic noisy channel architecture for speech recognition can be summarized as follows: What is the most likely sentence out of all sentences in the language L given some acoustic input O?
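In symbols, this goal is usually written as a maximization over candidate sentences, with Bayes' rule separating an acoustic model from a language model. The notation below (W for a candidate sentence, \hat{W} for the chosen one) is the common textbook formulation and is an assumption here, since the slides do not give the formula:

```latex
\hat{W} = \arg\max_{W \in L} P(W \mid O)
        = \arg\max_{W \in L} \frac{P(O \mid W)\, P(W)}{P(O)}
        = \arg\max_{W \in L} P(O \mid W)\, P(W)
```

The denominator P(O) can be dropped because it is the same for every candidate sentence. P(O | W) is the acoustic likelihood and P(W) is the prior probability of the sentence.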
Speech Recognition Architecture
A speech recognition system has three stages:
  • Signal processing or feature extraction stage: the waveform is sliced up into frames, and the frames are transformed into spectral features.
  • Subword or phone recognition stage: recognize individual speech sounds (phones).
  • Decoding stage: find the sequence of words that most probably generated the input.
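The feature extraction stage can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular recognizer: the function names, the 25 ms frame length, the 10 ms hop, the Hamming window, and the synthetic 1 kHz test tone are all assumptions chosen for the example, and a naive DFT stands in for the FFT a real system would use.

```python
import cmath
import math


def slice_into_frames(samples, frame_len, hop):
    """Split a waveform into overlapping frames of frame_len samples."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for a demo)."""
    n = len(frame)
    # A Hamming window reduces spectral leakage at the frame edges.
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


# Synthetic input (assumed values): 0.1 s of a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_LEN = 200   # 25 ms at 8 kHz
HOP = 80          # 10 ms at 8 kHz
signal = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(800)]

frames = slice_into_frames(signal, FRAME_LEN, HOP)
features = [magnitude_spectrum(f) for f in frames]

# Each bin covers SAMPLE_RATE / FRAME_LEN = 40 Hz.
peak_bin = max(range(len(features[0])), key=features[0].__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
print(f"{len(frames)} frames, strongest component near {peak_hz:.0f} Hz")
```

Each entry of `features` is one spectral vector per frame, which is the kind of input the phone recognition stage consumes.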
Overview of HMMs
  • Markov chains are used “to model pronunciation”.
  • Forward algorithm: computes the likelihood of phone sequences.
  • Real input is not symbolic: it consists of spectral features.
  • Input symbols do not correspond to machine states.
Note: HMMs are used in speech recognition because a speech signal can be viewed as a piecewise stationary signal or a short-time stationary signal.
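The forward algorithm mentioned above can be sketched as follows. This is a toy example under stated assumptions: the two states, the symbols "a" and "b", and all probabilities are hypothetical, and the observations are discrete symbols for simplicity, whereas real recognizers score continuous spectral feature vectors.

```python
def forward_likelihood(observations, states, start_p, trans_p, emit_p):
    """Return P(observations | HMM), summed over all hidden state paths."""
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: sum(alpha[prev] * trans_p[prev][s] for prev in states)
               * emit_p[s][obs]
            for s in states
        }
    return sum(alpha.values())


# Hypothetical two-state model; every number here is made up for illustration.
STATES = ("s1", "s2")
START_P = {"s1": 0.6, "s2": 0.4}
TRANS_P = {"s1": {"s1": 0.7, "s2": 0.3},
           "s2": {"s1": 0.4, "s2": 0.6}}
EMIT_P = {"s1": {"a": 0.5, "b": 0.5},
          "s2": {"a": 0.1, "b": 0.9}}

likelihood = forward_likelihood(["a", "b"], STATES, START_P, TRANS_P, EMIT_P)
print(f"P(a, b) = {likelihood:.4f}")
```

Because the hidden states are summed out at every step, the cost grows linearly with the length of the observation sequence instead of exponentially with the number of possible state paths.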
Speech Recognition Requirements
To use speech recognition, you need the following:
  • A high-quality close-talk (headset) microphone with gain adjustment support. Gain adjustment is a microphone feature that allows your input to be amplified so that it is made louder for use by the system. A universal serial bus (USB) microphone is recommended.
  • A 400 megahertz (MHz) or faster computer
  • 128 MB or more of memory
  • Windows 2000 with Service Pack 3, or Windows XP or later
  • Microsoft Internet Explorer 5.01 or later
Automatic Speech Recognition System for Home Appliances Control
Abstract - In the present work we study the performance of a speech recognizer for the Greek language in a smart-home environment. This recognizer operates in spoken interaction scenarios, where the users are able to control various home appliances. In contrast to command and control systems, in our application the users speak spontaneously, beyond the use of a standardized set of isolated commands. The operational performance was tested over various environmental conditions, for two different types of microphones.
Different Home Appliances Control Scenarios
Dialogue Systems
Dialogue systems play a key role in any kind of conversational spoken language interface. Intelligent interfaces of home appliances provide the means for facilitating the operation of these devices within a dialogue system. Various systems for home appliance control have been reported in the literature, focusing on enhancing the performance of the speech recognition process.
