HMM Parameters Estimation Using Hybrid Baum­                                             Welch Genetic Algorithm          ...
IV.    HMM TRAINING USING HYBRID GA-BW   •      B ={bij ( k )} is the observation probability matrix,                     ...
selection and crossover functions have been used                          III    the       1)     Apply B-W algorithm to g...
REFERENCES                                       [6]    K. Sun, J. Yu, "Video Affective Content Recognition Based on Genet...
Upcoming SlideShare
Loading in …5



Published on

Published in: Technology, Education
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide


  1. 1. HMM Parameters Estimation Using Hybrid Baum­ Welch Genetic Algorithm Mourad Oudelha(l) , Raja N. Ainon(2) Software Engineering Department Faculty of Computer Science and Information Technology, University of Malaya Kuala Lumpur, Malaysia (l); (2) Automatic speech recognition (ASR) has been for a nature and applies crossover and mutation operations to thelong time an active research area with a large variety of encoded solutions to simulate biological genetic operations.applications in human-computer intelligent interaction. The most Crossover creates new generations by exchanging parentsused technique in this field is the Hidden Markov Models genes portions, in order to add variations to the solution.(HMMs) which is a statistical and extremely powerful method. Mutation changes only some points in the encoded solution andThe HMM model parameters are crucial information in HMM generates new traits in the offspring, which provide GA theASR process, and directly influence the recognition precision capacity to escape from a local optimum of the solution to asince they can make an excellent description of the speech signal. global one [8].Therefore optimizing HMM parameters is still an important andchallenging work in automatic speech recognition research area. Within the last decade much approaches have beenUsually the Baum-Welch (B-W) Algorithm is used to calculate proposed to apply GA for HMM training in speech recognitionthe HMM model parameters. However, the B-W algorithm uses [1], [2] and [8] and in other research areas like web informationan initial random guess of the parameters, therefore after extraction [7], molecular biology [5], biological sequenceconvergence the output tends to be close to this initial value of the analysis [3], [4], and video content analysis [6]. Thealgorithm, which is not necessarily the global optimum of the experimental results of HMM parameters estimation suggestmodel parameters. In this paper, a Genetic Algorithm (GA) that using GA or hybrid GA-BW achieve better results than B­combined with Baum-Welch (GA-BW) is proposed; the idea is to W or other heuristic algorithms.use GA exploration ability to obtain the optimal parameterswithin the solution space. In this paper two HMM training methods have been analyzed, the first is a MatLab toolbox based on B-W Keywords- Automatic speech recognition; Hidden Markov algorithm; the other is our implementation of GA usingModel; Genetic Algorithm; HMM training; Baum-Welch MatLab GA toolbox. In terms of global searching GA is highlyalgorithm. powerful method however it is very slow to converge. To overcome this limitation a hybrid GA-BW has been used. This 1. INTRODUCTION paper is organized as follows. In section (II) a description of The HMM has been used extensively in speech recognition HMM model parameters is made, followed by the B-W[2] to build Automatic Speech Recognitions (ASRs) systems. algorithm implementation in section (III), then the hybrid GA­The most important step in HMM based ASRs is HMM BW have been described in Section (IV), after thatparameters optimization, since getting optimal values of the experimental results related to both previous algorithms aremodel parameters will automatically increase the ASRs presented and discussed in Section (V), and finally therecognition accuracy. This procedure is referred to as HMM conclusion in Section (VI).training [8]; the speech occurrences are used to estimate themodel parameters. Usually, the well known Baum-Welch II. HMM TRANING SYSTEMSalgorithm is used to perform HMM training, an initial guess of The model parameters of an HMM can be expressed as a set ofthe parameters is made randomly, then more accurate three elements: )...= {A, B, n } [11]. Where:parameters are computed in each iteration until the algorithm • A= { aij } is the state transition probability matrix,converge. Using B-W algorithm, it is possible to get better each element aij represent the probability that theresults after every iteration [12], however this procedure can be HMM model will transit from state i to state j.biased toward the local maxima which is close to the initial Elements of matrix A must satisfY the next twovalues of the algorithm and fail to achieve global optima, conditions:which is necessary to produce the best recognition rate [2], [10]and [11]. Thus there is need for an algorithm that can escapefrom the local optimum and then probe the entire solution aij :::: 0 where 1::: i, j::: 3 (1)space to reach the global optimum of the model parameters. Unlike the B-W algorithm, GA can attain global optima by ,,,3 ai L j =o j = 1 where 1::: i::: 3 (2)searching in multiple points within the solution spacesimultaneously. GA imitates the phenomena of evolution in9 7 8-1-4244-6716- 7 /10/$26.00 ©2010 IEEE 542
  2. 2. IV. HMM TRAINING USING HYBRID GA-BW • B ={bij ( k )} is the observation probability matrix, ALGORITHM such that bij is the probability that the observation Ok In MixtGaussian toolbox implementation a 3 hidden state has been generated by state i. Elements of matrix B continuous density mixture Gaussian HMM with 105 must satisfy the next two conditions: observation symbols have been used, this configuration can well describe the speech utterance. Thus, the HMM model bij:::: 0 where 1:S i,j :s 3 (3) parameters is Ie = {A, B, n }, where A is a 3 by 3 transition probability matrix, B a 3 by 105 matrix, n is the initial 3 probability of states vector of size 3. "" bi L j =o j = 1 where 1:S i :s 3 (4) A. Encoding Method • n = {ni} is the initial state distribution vector, and every ni express the probability that the HMM chain It is vital to find a genetic representation of the model will start at state i. Elements of vector n must satisfy parameters before applying GA to solve the optimization the next two conditions: problem. In genetics, chromosomes are comprised by a set of basic elements called genes, in our case the elementary information is the elements of every probability matrixes A, B where 1:S I :s 3 (5) and n. We choose to concatenate the rows of each matrix in the model parameters Ie, thus the chromosomes will be represented (6) as an array of real numbers as shown in Fig. 2. HMM Training is the process of HMM parametercalculation. This is shown in figure 1, the training tools use thespeech data and their transcriptions to estimate the HMMparameters then the recognizer will use these HMMs to classifythe unknown speech utterances. Figure 2. Chromosome encoding B. Hybrid GA-BW We present a hybrid GA-BW algorithm due to slow convergence and high computational power needed by classical Figure 1. HMM training process [12] GA, especially when the generated chromosomes cannot satisfy the conditions (1), (2), ... ,(6). For this reason, not III. HMM TRAINING USING B-W ALGORITHM included in the offspring and they are replaced by new The B-W algorithm provided by MixtGaussian toolbox [9], chromosomes. To overcome the slow convergence problem,has been used to train the HMM. After an initial guess of the the initial generation P(O) is not generated randomly as inHMM parameters is made, the B-W algorithm is run for 20 normal GA, but by applying an iteration of B-W algorithm asiterations to get more accurate parameters. As result we get a shown in Figure 3. For every ten generations, three iterations ofcontinuous density mixture Gaussian HMM. Finally B-W algorithm are applied to all chromosomes to allow thetranscriptions of unknown speech utterances will be made by algorithm to converge faster.the recognizer module to determine how accurate are theHMMs parameters. 543
  3. 3. selection and crossover functions have been used III the 1) Apply B-W algorithm to generate the initial implementation. population P(O), where P(O) ={C], C2, ..., CN }, and Cj is one V. EXPERIMENTAL STUDY chromosome The experimental study consist of using the HMM model parameter resulting from both B-W and hybrid GA-BW 2) Calculate the fitness function F(Cj) of every training systems, to classify a set of unknown speech chromosome Cj within the actual population utterances. The speech database used in the study comes with MixtGaussian toolbox download file, the training data contains P(t). digits from 1 to 5, each digit is repeated by 15 speakers, and the testing data contains the same digit repeated by 10 speakers. 3) Select a few chromosomes for the The recognition software is implemented in the MixtGaussian intermediate population P(t). toolbox to. This test is repeated ten times as it is shown in table I. 4) Apply crossover to some chromosomes III P(t). TABLE I. RECOGNITION RATE FOR BOTH B- WAND GA-BW WITHIN 10 EXPERIENCES 5) Apply mutation to few chromosomes III Training Recognition rate (%) for experience number P(t). technique 1 2 3 4 5 6 7 8 9 10 B-W 68 76 76 72 68 76 68 84 72 64 6) Apply three iterations of B-W algorithm to GA-BW 76 64 88 72 84 76 72 80 76 72 the population P(t) for each ten generations. 7) From the results of table I it is difficult to statute about the t t+l; = quality of B-W and GA-BW algorithms in term of recognition if not convergence, then go to: 2). rate. However it is clear from results from table II that GA-BW algorithm performs better than the classical B-W algorithm. Figure 3. Hybrid GA-BW algorithm TABLE II. MINIMUM, MAXIMUM AND AVERAGE OF RECOGNITION RATE FOR BOTH B-W AND GA-BW ALGORITHMS Training Recognition rate (%)C. Fitness Function technique Minimum Maximum Average The fitness function is an evaluation mechanism of the B-W 64% 84% 72.67%chromosome; a higher fitness value reflects the chances of thechromosome to be chosen in the next generation. The log GA-BW 64 % 88% 76%likelihood [12] has been used, and it represents the probabilitythat the training observation utterances have been generated bythe current model parameters and it is a function of thefollowing form [2]: VI. CONCLUSION In this work two HMM training systems have been (7) analyzed. The first is based on the classical Baum-Welch iterative procedure; the second is based on genetic algorithm. The resulted HMM model parameters of each system has been used to classify the same unknown speech utterances with theD. Mutation same recognizer. Under the same conditions the hybrid GA­ Mutation selects randomly few chromosomes and alters BW algorithm performs better then Baum-Welch method. Wesome genes to produce new chromosomes. The believe that this improvement is due to global searching ability"mutationadaptfeasible" function of GA toolbox has been used of satisfy the conditions (1), (2) ... (6).E. Selection and Crossover Selection mimics the survival of fittest mechanism seen inNature. Chromosomes with higher fitness values have a greaterprobability to survive in the succeeding generations. Thensome elements from the population pool will be selected toapply crossover. Portions of genes will be exchanged betweeneach couple of chromosomes. The default GA Toolbox 544
  4. 4. REFERENCES [6] K. Sun, J. Yu, "Video Affective Content Recognition Based on Genetic Algorithm Combined HMM," Lecture Notes in Computer Science,[I] M. Shamsul Huda, J. Yearwood, G. Ranadhir, "A Hybrid Algorithm for Springer 200 7. Estimation of the Parameters of Hidden Markov Model based Acoustic [ 7] J. Xiao, L. Zou, C. Li, " Optimization of Hidden Markov Model by a Modeling of Speech Signals using Constraint-Based Genetic Algorithm Genetic Algorithm for Web Information Extraction," http://atlantis­ and Expectation Maximization," 6th IEEEIACIS International access date 22112/2009. Conference on Computer and Information Science (ICJS 2007), pp. 4 3 8- 4 4 3,2007. [ 8] J. Xiao, L. Zou, C. Li, "Optimization of Hidden Markov Model by a Genetic Algorithm for Web Information Extraction," http://atlantis­[2] S. Kwong, C.W. Chau, K.F. Man, K. S. Tang, "Optimization of HMM access date 22112/2009. topology and its model parameters by genetic algorithms," Pattern Recognition Society,V. 34, pp.S09-S22,2001. [9 ] R. Barbara, "A Tutorial for the Course Computational Intelligence," . http://www.igi.tugraz.atllehre/CI. 3I-01-201O.[ 3] K. Won, A. Prugel-Bennett, A. Krogh, "Training hmm structure with genetic algorithm for biological sequence analysis,", Bioinformatics [10] S. Kwong, C.W. Chau, "Analysis of Parallel Genetic Algorithms on 20(1 8),3613-3619,2004. HMM based speech recognition system," IEEE transactions on Consumer Electronics,Vol. 43, No. 4, 199 7.[4] K. Won, A. Prugel-Bennett, A. Krogh, "Evolving the Structure of Hidden Markov Models," IEEE TRANSACTIONS ON [II] Chau, C. W., S. Kwong, C. K. Diu, W. R. Fahinet, "Optimization of EVOLUTIONARY COMPUTATION, VOL. 10, NO. I, FEBRUARY HMM by a genetic algorithm," Proceedings ICASSP, pp. 1 727-17 30, 2006. 199 7.[S] L. Liu, H. Huo, B. Wang, "A Novel Optimization of Profile HMM by a [12] S. Young, D. Ollason, V. Valtchev, P. Woodland, The HTK Book (for Hybrid Genetic Algorithm," Journal of Information & Computational HTK Version 34.1), Cambridge University Engineering Department, Science,V.3S12 p. 734-7 41,Springer 200S. 2009. 545