A Dynamic Probabilistic Model to Visualise Topic Evolution in Text Streams

Authors: Ata Kabán, Mark A. Girolami

Abstract

We propose a novel probabilistic method, based on latent variable models, for unsupervised topographic visualisation of dynamically evolving, coherent textual information. It can be seen as a complementary tool for topic detection and tracking applications. The method exploits the a priori domain knowledge that the data stream contains relatively homogeneous temporal segments. Unlike topographic techniques previously used for static text collections, in the proposed model the topography arises from the temporal coherence of the data stream. Simulation results on toy data and an actual application to the analysis of Internet chat line discussions are presented by way of demonstration.

Keywords: topographic mapping, latent trait, hidden Markov model, topic segmentation, topic detection and tracking

Paper URL: https://doi.org/10.1023/A:1013673310093