ma52006id386

A Note on Spelling Correction Methods based upon Statistical Decision Theory
Yasunari MAEDA (Kitami Institute of Technology)
Hideki YOSHIDA (Kitami Institute of Technology)
Yoshitaka FUJIWARA (Kitami Institute of Technology)
Toshiyasu MATSUSHIMA (Waseda University)

Topics
1. Introduction
2. Definitions and Previous Research
3. Spelling Correction Methods based upon Statistical Decision Theory
3.1. Evaluating an Error Rate per Sentence
3.2. Evaluating an Error Rate per Word
4. Conclusion

1. Introduction
Decoding problem in coding theory: a string of codes -> discrete memoryless channel -> a string of codes (codes + noises)
Spelling correction problem in natural language processing: a string of words (a string of alphabets) -> discrete memoryless channel -> a string of alphabets (words + spelling errors)

2. Definitions and Previous Research
an alphabet
a set of alphabets
a word (a string of alphabets)
a set of words
the probability that a word occurs
the probability that a word occurs next to another word
parameters
true parameters (unknown)
the probability that one alphabet is received as another alphabet through the DMC (discrete memoryless channel)

2. Definitions and Previous Research
data for learning: the transmitted sentences and the strings received for them (both are known)
the number of data
each data item in the data for learning: a sentence, and the received string of alphabets when that sentence is transmitted
the number of words in each sentence
each word in a sentence, and the received string of alphabets when that word is transmitted
each alphabet in a word
the length (the number of alphabets) of each word

2. Definitions and Previous Research
Equation (1): the probability that a sentence occurs.
Equation (2): the probability that the sentence is received as a given string.
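The two probabilities on this slide can be sketched in code. The sketch below is an illustration under assumptions, not a transcription of equations (1) and (2): it assumes that a sentence probability factorizes into a first-word probability and word-to-next-word probabilities, and that the DMC substitutes alphabets one at a time without changing word length. All names (`sentence_prob`, `received_prob`, `p_first`, `p_next`, `p_chan`) and all numbers are hypothetical.

```python
from math import prod

def sentence_prob(words, p_first, p_next):
    """Probability of a word string under an assumed bigram word model."""
    prob = p_first.get(words[0], 0.0)
    for prev, cur in zip(words, words[1:]):
        prob *= p_next.get(prev, {}).get(cur, 0.0)
    return prob

def word_channel_prob(word, recv, p_chan):
    """Probability that `word` is received as `recv` through a memoryless channel."""
    if len(word) != len(recv):
        return 0.0  # assumption: the channel only substitutes alphabets
    return prod(p_chan[a][b] for a, b in zip(word, recv))

def received_prob(words, received, p_chan):
    """Probability that a sentence is received as the given list of strings."""
    if len(words) != len(received):
        return 0.0
    return prod(word_channel_prob(w, r, p_chan) for w, r in zip(words, received))

# Toy parameters over the two-alphabet set {a, b} (made-up values).
p_chan = {"a": {"a": 0.9, "b": 0.1}, "b": {"a": 0.2, "b": 0.8}}
p_first = {"ab": 0.6, "ba": 0.4}
p_next = {"ab": {"ab": 0.3, "ba": 0.7}, "ba": {"ab": 0.5, "ba": 0.5}}

sentence = ["ab", "ba"]
received = ["ab", "aa"]  # second word arrives with one wrong alphabet
print(sentence_prob(sentence, p_first, p_next))
print(received_prob(sentence, received, p_chan))
```

Because the channel is memoryless, the received-string probability is a plain product over alphabets, which is what makes the later trellis computations possible.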

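2. Definitions and Previous Research
a new sentence (unknown)
a received string when the new sentence is transmitted (known)
each word in the new sentence
each alphabet in a word of the new sentence
the received alphabet when that alphabet is transmitted
Equation (3): the probability that the new sentence occurs.
Equation (4): the probability that the new sentence is received as a given string.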

2. Definitions and Previous Research
Spelling correction problem: estimating the new sentence under the condition that the learning data and the new received string are given.

2. Definitions and Previous Research
Previous research
The spelling correction problem is divided into two problems:
1) estimating the unknown parameters
2) estimating the new sentence
The Maximum Likelihood Estimate (MLE) is used (equation (5)).
There is no theoretical guarantee when the number of data for learning is finite.

3. Spelling Correction Methods based upon Statistical Decision Theory
We treat the spelling correction problem as a single problem based upon statistical decision theory.
1) Bayes optimal method: minimizes an error rate with reference to the Bayes criterion when the number of data for learning is finite
2) approximate method: reduces the computational complexity
Two types of error rates are evaluated:
3.1. Evaluating an error rate per sentence
3.2. Evaluating an error rate per word

3.1. Evaluating an Error Rate per Sentence
Loss function: equation (6), where the decision function returns an estimate of the new sentence.
Risk function: equation (7).

3.1. Evaluating an Error Rate per Sentence
Bayes risk: equation (8), defined with prior density functions for the unknown parameters.
Bayes optimal decision: equation (9).
The error rate per sentence is minimized with reference to the Bayes criterion.

3.1. Evaluating an Error Rate per Sentence
A Dirichlet distribution is used as the prior density.
Equation (10) is written in terms of the numbers of times that one alphabet is received as another in the data for learning, and the corresponding parameter of the Dirichlet distribution.

3.1. Evaluating an Error Rate per Sentence
The Bayes optimal solution can be calculated using a DP (Dynamic Programming) method.
[Figure: DP-tree, drawn by depth. Each node represents a string of words.]
The Bayes optimal solution can be calculated by continuing the calculation at each node, depth by depth.

3.1. Evaluating an Error Rate per Sentence
The Bayes optimal solution can be calculated by continuing the calculation at each node of the DP-tree, depth by depth.
The value calculated at each node is an expected probability (equation (11)).
The computational complexity of the Bayes optimal solution is proportional to the number of nodes in the DP-tree, and it is of exponential order.

3.1. Evaluating an Error Rate per Sentence
Approximate method
Predictive distributions calculated by using the posterior density are used as estimates of the parameters.
e.g. equation (12), written in terms of the number of times that one word occurs next to another in the data for learning, and the corresponding parameter of the Dirichlet distribution.
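As a hedged illustration of a predictive distribution under a Dirichlet prior: for count data, the standard posterior predictive probability of an outcome is (count + Dirichlet parameter) divided by (total count + sum of Dirichlet parameters). The sketch below applies that formula to next-word counts and compares it with the MLE. The function names, the toy counts, and the choice of all Dirichlet parameters equal to 1 are assumptions, not values from the slides.

```python
def mle(counts, outcomes):
    """Relative frequencies; unseen outcomes get probability zero."""
    total = sum(counts.get(o, 0) for o in outcomes)
    return {o: counts.get(o, 0) / total for o in outcomes}

def dirichlet_predictive(counts, alpha):
    """Posterior predictive under a Dirichlet prior with parameters `alpha`."""
    total = sum(counts.get(o, 0) + a for o, a in alpha.items())
    return {o: (counts.get(o, 0) + a) / total for o, a in alpha.items()}

# Hypothetical counts of words seen right after some fixed word.
next_word_counts = {"cat": 3, "dog": 1}
vocabulary = ["cat", "dog", "cow"]
alpha = {w: 1.0 for w in vocabulary}  # assumed prior parameters

print(mle(next_word_counts, vocabulary))
print(dirichlet_predictive(next_word_counts, alpha))
```

With these counts the MLE assigns probability zero to the unseen word "cow", while the predictive distribution keeps it possible, which matters when the number of data for learning is small.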

3.1. Evaluating an Error Rate per Sentence
The approximate method is equivalent to the Viterbi algorithm.
[Figure: trellis diagram with one column of nodes per time step; each node carries the metric of its time step.]
The approximate solution can be calculated by continuing the calculation at each node of the trellis, time step by time step (equations (13) and (14)).
The computational complexity is proportional to .
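A minimal sketch of a Viterbi computation over a word trellis, under assumptions that the slide does not state: a bigram word model, a channel that substitutes alphabets without changing word length, known word boundaries, and hand-written plug-in parameter values standing in for the predictive distributions. All names (`viterbi`, `emit_log`, the toy vocabulary) are hypothetical.

```python
import math

def safe_log(p):
    return math.log(p) if p > 0 else -math.inf

def emit_log(word, recv, p_chan):
    """Log-probability that `word` is received as `recv` (substitutions only)."""
    if len(word) != len(recv):
        return -math.inf
    return sum(safe_log(p_chan[a][b]) for a, b in zip(word, recv))

def viterbi(received, vocab, p_first, p_next, p_chan):
    """Most probable word string given the received strings."""
    # metric[w]: best log-probability of any word string ending in w so far
    metric = {w: safe_log(p_first[w]) + emit_log(w, received[0], p_chan)
              for w in vocab}
    back = []  # back[t][w]: best predecessor of w at time step t + 1
    for recv in received[1:]:
        new_metric, pointers = {}, {}
        for w in vocab:
            best = max(vocab, key=lambda v: metric[v] + safe_log(p_next[v][w]))
            pointers[w] = best
            new_metric[w] = (metric[best] + safe_log(p_next[best][w])
                             + emit_log(w, recv, p_chan))
        metric = new_metric
        back.append(pointers)
    # Trace back from the best final node.
    path = [max(vocab, key=lambda w: metric[w])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy parameters (made-up values).
alphabet = "acdgot"
p_chan = {a: {b: (0.9 if a == b else 0.02) for b in alphabet} for a in alphabet}
vocab = ["cat", "cot", "dog"]
p_first = {"cat": 0.4, "cot": 0.2, "dog": 0.4}
p_next = {
    "cat": {"cat": 0.1, "cot": 0.1, "dog": 0.8},
    "cot": {"cat": 0.3, "cot": 0.3, "dog": 0.4},
    "dog": {"cat": 0.5, "cot": 0.3, "dog": 0.2},
}

print(viterbi(["cat", "dag"], vocab, p_first, p_next, p_chan))  # ['cat', 'dog']
```

This sketch evaluates `len(vocab) ** 2` transitions per time step, so its running time grows linearly with the number of words in the received string rather than with the number of candidate word strings.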

3.2. Evaluating an Error Rate per Word
Loss function: equation (15), where the decision function returns an estimate of the new sentence.
Bayes optimal decision: equation (16).
The computational complexity is of exponential order.
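The difference between the two error rates can be seen on a toy example. The sketch below assumes a posterior distribution over candidate sentences is already available (the numbers are made up, and the integration over unknown parameters is left out). Choosing the single most probable sentence targets the per-sentence error rate; choosing the most probable word at each position targets the per-word error rate. All function names are hypothetical.

```python
from collections import defaultdict

def per_sentence_decision(posterior):
    """Pick the most probable whole sentence."""
    return max(posterior, key=posterior.get)

def per_word_decision(posterior):
    """Pick, at each position, the word with the largest marginal probability."""
    length = len(next(iter(posterior)))
    decision = []
    for j in range(length):
        marginal = defaultdict(float)
        for sentence, p in posterior.items():
            marginal[sentence[j]] += p
        decision.append(max(marginal, key=marginal.get))
    return tuple(decision)

def expected_word_errors(decision, posterior):
    """Expected number of wrong words in `decision` under the posterior."""
    return sum(p * sum(d != w for d, w in zip(decision, sentence))
               for sentence, p in posterior.items())

# Made-up posterior over three candidate two-word sentences.
posterior = {
    ("form", "here"): 0.4,
    ("from", "there"): 0.3,
    ("from", "here"): 0.3,
}

by_sentence = per_sentence_decision(posterior)
by_word = per_word_decision(posterior)
print(by_sentence, expected_word_errors(by_sentence, posterior))
print(by_word, expected_word_errors(by_word, posterior))
```

Here the most probable sentence is ("form", "here"), but the per-word choice is ("from", "here"), which has fewer expected word errors even though it is not the most probable sentence.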

3.2. Evaluating an Error Rate per Word
Approximate method
Predictive distributions calculated by using the posterior density are used as estimates of the parameters.
The approximate method is equivalent to the BCJR algorithm (equations (17) and (18)).

3.2. Evaluating an Error Rate per Word
[Figure: diagram laid out over time steps.]
Approximate solution: equations (19), (20) and (21).
The computational complexity is proportional to .
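A minimal sketch of a BCJR-style (forward-backward) computation of per-word posterior probabilities, followed by a per-word choice. It is an illustration under assumptions that the slides do not state: a bigram word model, a substitution-only channel, known word boundaries, and hand-written plug-in parameter values. The names (`forward_backward`, `per_word_estimate`, the toy vocabulary) are hypothetical, and the sketch assumes the received string has nonzero probability under the model.

```python
from math import prod

def emit(word, recv, p_chan):
    """Probability that `word` is received as `recv` (substitutions only)."""
    if len(word) != len(recv):
        return 0.0
    return prod(p_chan[a][b] for a, b in zip(word, recv))

def forward_backward(received, vocab, p_first, p_next, p_chan):
    """Posterior probability of each word at each position."""
    n = len(received)
    e = [{w: emit(w, r, p_chan) for w in vocab} for r in received]
    # Forward pass: alpha[t][w] = P(received[0..t], word at t is w)
    alpha = [{w: p_first[w] * e[0][w] for w in vocab}]
    for t in range(1, n):
        alpha.append({w: e[t][w] * sum(alpha[t - 1][v] * p_next[v][w]
                                       for v in vocab) for w in vocab})
    # Backward pass: beta[t][v] = P(received[t+1..] | word at t is v)
    beta = [dict.fromkeys(vocab, 1.0) for _ in range(n)]
    for t in range(n - 2, -1, -1):
        beta[t] = {v: sum(p_next[v][w] * e[t + 1][w] * beta[t + 1][w]
                          for w in vocab) for v in vocab}
    posteriors = []
    for t in range(n):
        joint = {w: alpha[t][w] * beta[t][w] for w in vocab}
        total = sum(joint.values())  # assumed nonzero
        posteriors.append({w: joint[w] / total for w in vocab})
    return posteriors

def per_word_estimate(received, vocab, p_first, p_next, p_chan):
    """Choose the most probable word independently at each position."""
    posts = forward_backward(received, vocab, p_first, p_next, p_chan)
    return [max(p, key=p.get) for p in posts]

# Toy parameters (made-up values).
alphabet = "acdgot"
p_chan = {a: {b: (0.9 if a == b else 0.02) for b in alphabet} for a in alphabet}
vocab = ["cat", "cot", "dog"]
p_first = {"cat": 0.4, "cot": 0.2, "dog": 0.4}
p_next = {
    "cat": {"cat": 0.1, "cot": 0.1, "dog": 0.8},
    "cot": {"cat": 0.3, "cot": 0.3, "dog": 0.4},
    "dog": {"cat": 0.5, "cot": 0.3, "dog": 0.2},
}

print(per_word_estimate(["cat", "dag"], vocab, p_first, p_next, p_chan))  # ['cat', 'dog']
```

Like the Viterbi sketch, the forward and backward passes each visit `len(vocab) ** 2` transitions per time step. For long inputs the products would need rescaling or log-domain arithmetic to avoid underflow, which this sketch omits.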

4. Conclusion
We studied the spelling correction problem based upon statistical decision theory.
We studied two types of error rates:
1) the error rate per sentence
2) the error rate per word
We proposed Bayes optimal methods, which minimize an error rate with reference to the Bayes criterion. We also proposed approximate methods.
As future work, we want to study the properties of the proposed approximate methods. We also want to apply statistical decision theory to other tasks in natural language processing.
