Sphinx4 is a pure Java speech recognition library.
Sphinx4 is a set of classes that together form the
speech recognition engine; its grammar format, JSGF,
comes from the Java Speech API (JSAPI).
Four attributes must be set up for a speech
recognition job:
i. Acoustic model
ii. Dictionary
iii. Grammar/Language model
iv. Source of speech
The first three attributes are set up using a
Configuration object, which is then passed to a
recognizer.
An acoustic model represents the relationship
between an audio signal and the phonetic units of
the language.
A statistical language model is a probability
distribution over sequences of words.
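
To make "a probability distribution over sequences of words" concrete, here is a minimal sketch of a bigram model in plain Java. It is not Sphinx4 code: the class BigramModel, its methods, and the three training sentences are hypothetical, and it assumes unsmoothed maximum-likelihood counts with no end-of-sentence marker. Real language models are trained on far larger corpora and use smoothing.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy bigram model (hypothetical, not part of Sphinx4). */
final class BigramModel {
    private final Map<String, Integer> contextCounts = new HashMap<>();
    private final Map<String, Integer> bigramCounts = new HashMap<>();

    /** Counts every adjacent word pair, using <s> as the sentence-start marker. */
    void train(String sentence) {
        String previous = "<s>";
        for (String word : sentence.trim().toLowerCase().split("\\s+")) {
            contextCounts.merge(previous, 1, Integer::sum);
            bigramCounts.merge(previous + " " + word, 1, Integer::sum);
            previous = word;
        }
    }

    /** Maximum-likelihood estimate of P(word | previous); 0 for unseen pairs. */
    double conditional(String previous, String word) {
        int context = contextCounts.getOrDefault(previous, 0);
        if (context == 0) {
            return 0.0;
        }
        return bigramCounts.getOrDefault(previous + " " + word, 0) / (double) context;
    }

    /** Probability of a word sequence as the product of its bigram probabilities. */
    double probability(String sentence) {
        double p = 1.0;
        String previous = "<s>";
        for (String word : sentence.trim().toLowerCase().split("\\s+")) {
            p *= conditional(previous, word);
            previous = word;
        }
        return p;
    }

    public static void main(String[] args) {
        BigramModel model = new BigramModel();
        model.train("open the door");
        model.train("open the window");
        model.train("close the door");
        // Word orders seen in training score higher than unseen ones.
        System.out.println(model.probability("open the door"));
        System.out.println(model.probability("close the window"));
        System.out.println(model.probability("door the open"));
    }
}
```

With this training data, "open the door" scores 2/3 x 1 x 2/3 = 4/9, "close the window" scores 1/9, and the unseen order "door the open" scores 0. This ranking is what a recognizer uses to prefer likely word sequences.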
Grammars let you specify the possible inputs very
precisely.
Grammars can be written in the JSGF format and
usually have a file extension such as .gram or .jsgf.
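
As a rough illustration of how precisely a grammar pins down the input, the sketch below renders a one-rule JSGF grammar and lists every sentence it accepts. The class TinyJsgf, the grammar name "commands", and the rule name "command" are made up for this example, and the helper handles only a sequence of alternative groups, not the full JSGF syntax.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds a one-rule JSGF grammar and enumerates its sentences. */
final class TinyJsgf {

    /** Renders a public rule made of consecutive groups of alternatives. */
    static String toJsgf(String grammarName, String ruleName, List<List<String>> groups) {
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n\n");
        sb.append("grammar ").append(grammarName).append(";\n\n");
        sb.append("public <").append(ruleName).append("> =");
        for (List<String> group : groups) {
            sb.append(" (").append(String.join(" | ", group)).append(")");
        }
        return sb.append(";\n").toString();
    }

    /** Lists every sentence the rule accepts by choosing one alternative per group. */
    static List<String> sentences(List<List<String>> groups) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (List<String> group : groups) {
            List<String> next = new ArrayList<>();
            for (String prefix : result) {
                for (String alternative : group) {
                    next.add(prefix.isEmpty() ? alternative : prefix + " " + alternative);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> groups = List.of(
                List.of("open", "close"),
                List.of("the door", "the window"));
        // Text that could be saved as commands.gram
        System.out.print(toJsgf("commands", "command", groups));
        // The four sentences this grammar accepts, and nothing else
        sentences(groups).forEach(System.out::println);
    }
}
```

The generated rule reads `public <command> = (open | close) (the door | the window);`. It accepts exactly four sentences, whereas a statistical language model would assign some probability to many more word sequences.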
Important Sphinx packages used in a program:
edu.cmu.sphinx.frontend.util: provides classes
that are generally useful to the various frontend
classes.
edu.cmu.sphinx.recognizer: provides a set of
high-level classes and interfaces that are used to
perform speech recognition with the Sphinx-4 speech
recognition system.
edu.cmu.sphinx.result: provides a set of
classes that represent the result of a recognition.
edu.cmu.sphinx.util.props: provides a
mechanism for managing persistent configuration
data.
Architecture of Sphinx4

Sphinx4
