Hidden Markov Model Toolkit (HTK) www.redicals.com

HTK is the “Hidden Markov Model Toolkit” developed by the Cambridge University Engineering Department. This toolkit aims at building and manipulating Hidden Markov Models (HMMs).
HTK is primarily used for speech recognition research, although it has been used for numerous other applications including speech synthesis, character recognition and DNA sequencing. HTK consists of a set of library modules and tools available in C source form. It is available as a free download, along with complete documentation.


  1. Hidden Markov Model Toolkit (HTK) Tutorial
     http://www.redicals.com

     What is HTK?

     HTK is the “Hidden Markov Model Toolkit” developed by the Cambridge University Engineering Department. This toolkit aims at building and manipulating Hidden Markov Models (HMMs). HTK is primarily used for speech recognition research, although it has been used for numerous other applications including speech synthesis, character recognition and DNA sequencing. HTK consists of a set of library modules and tools available in C source form. It is available as a free download, along with complete documentation.

     1.1 HTK Construction steps

     The main construction steps are the following:

     - Creation of a training database: Each element of the vocabulary is recorded several times, and labelled with the corresponding word.
     - Acoustical analysis: The training waveforms are converted into a series of coefficient vectors.
     - Definition of the models: A prototype Hidden Markov Model (HMM) is defined for each element of the task vocabulary.
     - Training of the models: Each HMM is initialised and trained with the training data.
     - Definition of the task: The grammar of the recogniser (what can be recognised) is defined.
  2. - Evaluation: The performance of the recogniser can be evaluated on a corpus of test data.

     How to Install HTK on Windows

     STEP 1: Register with HTK and download the HTK Toolkit.
     STEP 2: Extract the Windows HTK binaries. Right-click the htk-3.x-windows-binary.zip file, select 'Extract All' from the right-click menu, and follow the steps in the extraction wizard to extract the zip file to your HTK directory.
     STEP 3: After extraction, you will get two directories, "bin.win32" and "bin", which contain HTK commands and other data-processing commands, respectively.
     STEP 4: Add these two directories to the system path.
     STEP 5: Go to Start, type "cmd", and in the command prompt append both directories to the path, for example:
             set path=%path%;c:\htk\bin.win32;c:\htk\bin
  3. NOTE: I created an HTK folder on the C drive and stored both directories in it, i.e. c:\htk\bin and c:\htk\bin.win32.

     STEP 6: To test the HTK toolkit, call any of its tools, such as HVite or HCopy, from the command prompt.
  4. How to Install HTK on Linux

     STEP 1: Download the HTK Toolkit and extract it.
     STEP 2: cd htk
     STEP 3: ./configure --prefix=/tmp --without-x --disable-hslab
     STEP 4: make all
     STEP 5: sudo make install
     STEP 6: To test the HTK toolkit, call any of its tools, such as HVite or HCopy, from the terminal.

     How to Install HTK on Mac OS X

     STEP 1: Download the HTK Toolkit.
     STEP 2: Extract it: tar zxf HTK-3.4.1.tar.gz
     STEP 3: Open Terminal and type: cd htk
     STEP 4: export CPPFLAGS=-UPHNALG
     STEP 5: chmod +x configure
     STEP 6: ./configure --without-x --disable-hslab
     STEP 7: make all
     STEP 8: sudo make install
     STEP 9: To test the HTK toolkit, call any of its tools, such as HVite or HCopy, from the terminal.
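The final "test" step on each platform amounts to checking that the HTK executables can be found on the system path. A minimal Python sketch of that check follows; it is not part of HTK. The tool list is taken from the names used in this tutorial, and the helper names `find_htk_tools` and `missing_tools` are hypothetical, introduced only for illustration.

```python
import shutil

# Tool names mentioned in this tutorial; adjust the tuple to your installation.
HTK_TOOLS = ("HCopy", "HList", "HLEd", "HInit", "HRest", "HERest",
             "HBuild", "HParse", "HDMan", "HVite", "HResults")


def find_htk_tools(tools=HTK_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    # shutil.which honours PATHEXT on Windows, so "HCopy" also matches HCopy.exe.
    return {name: shutil.which(name) for name in tools}


def missing_tools(tools=HTK_TOOLS):
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in find_htk_tools(tools).items() if path is None]


if __name__ == "__main__":
    for name, path in find_htk_tools().items():
        print(f"{name:10s} {path or 'NOT FOUND'}")
```

If a tool is reported as not found, the usual cause is that the install directory (for example the `bin.win32` folder on Windows) has not been added to the path in the current shell session.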
  5. 1. Data creation
        HCopy: feature extraction
        HList: file information
        HLEd: label creation (Master Label File; outputs the MLF)
     2. Learning
        MakeProtoHMMSet: determines the topology of the initial model
        HInit: initial learning of the HMM and cut-out of the corresponding phoneme segments (Segmental K-Means)
        HRest: learning of the HMM following HInit (Baum-Welch, increase the number of mixtures)
     3. Recognition
        HVite: recognition by the Viterbi algorithm
        HBuild: generation of the word network (sub-networks can also be generated)
        HParse: conversion of grammar notation (EBNF, extended Backus-Naur Form) to a word network
        HDMan: dictionary management tool
     4. Analysis
        HResults: calculation of the recognition rate

     HTK Tools

     HCopy

     This program will copy one or more data files to a designated output file, optionally converting the data into a parameterised form. While the source files can be in any supported format, the output format is always HTK. Hence, this program is used to convert data files in other formats to the HTK format. When labels are copied as well, each source data file must have an associated label file, and a target label file is created. HCopy can also be used to convert the parameter kind of a file, for example from WAVEFORM to MFCC, depending on the configuration options. Conversions must be specified via a configuration file.

     USE: HCopy -C mfcc13.cfg ..waveFiles1001-10a.wav outputfeature001-10a.fea

     HInit

     HInit is used to provide initial estimates for the parameters of a single HMM using a set of observation sequences. It works by repeatedly using Viterbi alignment to segment the training observations and then recomputing the parameters by pooling
  6. the vectors in each segment. For mixture Gaussians, each vector in each segment is aligned with the component with the highest likelihood. HInit can be used to provide initial estimates of whole word models, in which case the observation sequences are realisations of the corresponding vocabulary word. Alternatively, HInit can be used to generate initial estimates of HMMs for phoneme-based speech recognition.

     HLEd

     This program is a simple editor for manipulating label files. Typical examples of its use might be to merge a sequence of labels into a single composite label or to expand a set of labels into a context-sensitive set. HLEd works by reading in a list of editing commands from an edit script file and then making an edited copy of one or more label files.

     HSLab

     HSLab is an interactive label editor for manipulating speech label files. An example of using HSLab would be to load a sampled waveform file, determine the boundaries of the speech units of interest and assign labels to them. Alternatively, an existing label file can be loaded and edited by changing current label boundaries, deleting labels and creating new ones.

     HParse

     The HParse program generates word-level lattice files (for use with e.g. HVite) from a text file syntax description containing a set of rewrite rules based on extended Backus-Naur Form (EBNF). The EBNF rules are used to generate an internal representation of the corresponding finite-state network, where HParse network nodes represent the words in the network and are connected via sets of links. This HParse network is then converted to an HTK word-level lattice.

     HERest

     This program is used to perform a single re-estimation of the parameters of a set of HMMs, or linear transforms, using an embedded training version of the Baum-Welch algorithm. Training data consists of one or more utterances, each of which has a transcription in the form of a standard label file (segment boundaries are ignored).
     For each training utterance, a composite model is effectively synthesised by concatenating the phoneme models given by the transcription. Each phone model has the same set of accumulators allocated to it as are used in HRest, but in HERest they are updated simultaneously by performing a standard Baum-Welch pass over each training utterance using the composite model. HERest is intended to operate on HMMs with initial parameter values estimated by HInit/HRest. HERest supports multiple mixture Gaussians, discrete and tied-mixture HMMs, multiple data streams,
  7. parameter tying within and between models, and full or diagonal covariance matrices.

     HResults

     HResults is the HTK performance analysis tool. It reads in a set of label files (typically output from a recognition tool such as HVite) and compares them with the corresponding reference transcription files. For the analysis of speech recognition output, it prints a summary in the following form:

     ------------------------ Overall Results --------------------------
     SENT: %Correct=86.67 [H=52, S=8, N=60]
     WORD: %Corr=86.67, Acc=86.67 [H=52, D=0, S=8, I=0, N=60]

     H is the number of correct labels, D is the number of deletions, S is the number of substitutions, I is the number of insertions and N is the total number of labels in the defining transcription files. The percentage number of labels correctly recognised is given by

     \[ \%\text{Correct} = \frac{H}{N} \times 100\% \]

     and the accuracy is computed by

     \[ \text{Accuracy} = \frac{H - I}{N} \times 100\% \]
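HResults reports %Correct as H/N and accuracy as (H - I)/N, each multiplied by 100, where H = N - D - S. As a quick way to sanity-check a results line by hand, these figures can be recomputed from the bracketed counts. The following is a small Python sketch under that definition; the function names are hypothetical and are not part of HTK.

```python
def hits(n, d, s):
    """Number of correct labels: H = N - D - S."""
    return n - d - s


def percent_correct(h, n):
    """Percentage of labels correctly recognised: H / N * 100."""
    if n <= 0:
        raise ValueError("N must be positive")
    return 100.0 * h / n


def percent_accuracy(h, i, n):
    """Accuracy, which also penalises insertions: (H - I) / N * 100."""
    if n <= 0:
        raise ValueError("N must be positive")
    return 100.0 * (h - i) / n


if __name__ == "__main__":
    # Counts from the WORD line of the sample output: H=52, D=0, S=8, I=0, N=60.
    h = hits(n=60, d=0, s=8)
    print(f"H={h}")
    print(f"%Corr={percent_correct(h, 60):.2f}")
    print(f"Acc={percent_accuracy(h, 0, 60):.2f}")
```

With the sample counts, both values come out to 86.67, matching the WORD line. The two figures differ only when insertions are present, which is why accuracy can fall below %Correct and can even go negative.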
