Links between probabilistic automata and hidden Markov models: probability distributions, learning models and induction algorithms

Abstract

This article presents an overview of Probabilistic Automata (PA) and discrete Hidden Markov Models (HMMs), and aims at clarifying the links between them. The first part of this work concentrates on probability distributions generated by these models. Necessary and sufficient conditions for an automaton to define a probabilistic language are detailed. It is proved that probabilistic deterministic finite automata (PDFA) form a proper subclass of probabilistic non-deterministic finite automata (PNFA). Two families of equivalent models are described next. On the one hand, HMMs and PNFA with no final probabilities generate distributions over complete finite prefix-free sets. On the other hand, HMMs with final probabilities and probabilistic automata generate distributions over strings of finite length. The second part of this article presents several learning models, which formalize the problem of PA induction or, equivalently, the problem of HMM topology induction and parameter estimation. These learning models include the PAC and identification with probability 1 frameworks. Links with Bayesian learning are also discussed. The last part of this article presents an overview of induction algorithms for PA or HMMs using state merging, state splitting, parameter pruning and error-correcting techniques.
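For concreteness, the sketch below shows how a probabilistic automaton with final probabilities, of the kind the abstract discusses, assigns a probability to a finite string via a forward (matrix-product) computation. The two-state automaton, the alphabet {a, b}, and all numerical values are invented for illustration and are not taken from the paper.

```python
import numpy as np

# Toy probabilistic automaton over {a, b} with two states (illustrative values only).
# T[x][i, j]: probability of reading symbol x while moving from state i to state j.
# init[i]: initial probability of state i; final[i]: its final (halting) probability.
init = np.array([1.0, 0.0])
final = np.array([0.2, 0.5])
T = {
    "a": np.array([[0.3, 0.2],
                   [0.0, 0.1]]),
    "b": np.array([[0.1, 0.2],
                   [0.2, 0.2]]),
}

# Per-state normalization: final probability plus all outgoing transition mass
# equals 1 in every state (one ingredient of the consistency conditions for
# defining a probability distribution over strings of finite length).
for i in range(len(init)):
    assert abs(final[i] + sum(T[x][i].sum() for x in T) - 1.0) < 1e-12

def string_probability(s: str) -> float:
    """Forward computation: P(s) = init @ T[s_1] @ ... @ T[s_n] @ final."""
    v = init
    for x in s:
        v = v @ T[x]
    return float(v @ final)

print(string_probability("ab"))  # probability mass assigned to the string "ab": 0.064
```

Setting all final probabilities to zero and normalizing transitions accordingly yields the other family mentioned in the abstract, where the distribution is taken over a complete finite prefix-free set rather than over all strings of finite length.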

Keywords: Probabilistic automata, Hidden Markov models, Grammar induction, PAC learning, Bayesian learning, Induction algorithms, HMM topology learning

Article history: Received 17 March 2004, Accepted 17 March 2004, Available online 13 March 2005.

DOI: https://doi.org/10.1016/j.patcog.2004.03.020