Some Statistical-Estimation Methods for Stochastic Finite-State Transducers

Authors: David Picó, Francisco Casacuberta


Formal translations constitute a suitable framework for dealing with many problems in pattern recognition and computational linguistics. Applying formal transducers to these areas requires a stochastic extension that can handle noisy, distorted patterns with high variability. In this paper, three criteria are proposed and developed for the parameter estimation of regular syntax-directed translation schemata: maximum likelihood estimation, minimum conditional entropy estimation, and conditional maximum likelihood estimation. The last two criteria were proposed to deal with situations in which training data is sparse. All three criteria take into account the possibility of ambiguity in the translations, i.e., that a single input string can have several different output strings. In this case, the final goal of the stochastic framework is to find the highest-probability translation of a given input string. The criteria were tested on a translation task with a high degree of ambiguity.

Keywords: stochastic finite-state transducers, probabilistic estimation

