Using interesting sequences to interactively build Hidden Markov Models
Author: Szymon Jaroszewicz
Abstract
The paper presents a method for the interactive construction of global Hidden Markov Models (HMMs) based on local sequence patterns discovered in data. The method is based on finding interesting sequences, that is, sequences whose frequency in the database differs from that predicted by the model. The patterns are then presented to the user, who updates the model using their intelligence and their understanding of the modelled domain. It is demonstrated that such an approach leads to more understandable models than automated approaches. Two variants of the problem are considered: mining patterns occurring only at the beginning of sequences and mining patterns occurring at any position; both are practically meaningful. For each variant, algorithms have been developed that allow efficient discovery of all sequences with a given minimum interestingness. Applications to modelling webpage visitors' behavior and to modelling protein secondary structure are presented, validating the proposed approach.
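As a rough illustration of the first variant (patterns at the beginning of sequences), the sketch below scores a prefix by comparing its frequency in a database with the probability an HMM assigns to it. It is not the paper's algorithm: the absolute-difference score, the function names, and the list/dict layout of the model parameters (`start`, `trans`, `emit`) are all assumptions made for this example, and the model is assumed to have no explicit end state.

```python
from typing import Dict, List, Sequence


def prefix_probability(
    prefix: Sequence[str],
    start: List[float],
    trans: List[List[float]],
    emit: List[Dict[str, float]],
) -> float:
    """Forward algorithm: probability that a sequence generated by the
    HMM begins with `prefix` (assumed model layout, no end state)."""
    if len(prefix) == 0:
        return 1.0  # every sequence starts with the empty prefix
    n = len(start)
    # alpha[i] = P(symbols seen so far, current hidden state is i)
    alpha = [start[i] * emit[i].get(prefix[0], 0.0) for i in range(n)]
    for sym in prefix[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j].get(sym, 0.0)
            for j in range(n)
        ]
    return sum(alpha)


def prefix_frequency(prefix: Sequence[str], database: List[Sequence[str]]) -> float:
    """Fraction of database sequences that begin with `prefix`."""
    target = list(prefix)
    hits = sum(1 for seq in database if list(seq[: len(target)]) == target)
    return hits / len(database)


def interestingness(
    prefix: Sequence[str],
    database: List[Sequence[str]],
    start: List[float],
    trans: List[List[float]],
    emit: List[Dict[str, float]],
) -> float:
    """Assumed score: absolute gap between observed and predicted frequency."""
    observed = prefix_frequency(prefix, database)
    predicted = prefix_probability(prefix, start, trans, emit)
    return abs(observed - predicted)


# Toy data: a one-state model that emits "a" and "b" with equal probability.
toy_db = ["aa", "aa", "ab", "ba"]
toy_start = [1.0]
toy_trans = [[1.0]]
toy_emit = [{"a": 0.5, "b": 0.5}]
score = interestingness("aa", toy_db, toy_start, toy_trans, toy_emit)
```

In this toy setting the prefix "aa" opens half of the database sequences while the model predicts a quarter, so a user shown this pattern might decide to add a state that favours repeated "a" symbols. Enumerating all prefixes above a threshold efficiently, as the abstract describes, would need pruning that this sketch does not attempt.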
Keywords: Interesting pattern, Frequent sequence mining, Hidden Markov Model
Paper URL: https://doi.org/10.1007/s10618-010-0171-0