Computationally Efficient Approximation of a Probabilistic Model for Document Representation in the WEBSOM Full-Text Analysis Method

作者:S. Kaski

摘要

WEBSOM is a recently developed neural method for exploring full-text document collections, for information retrieval, and for information filtering. In WEBSOM the full-text documents are encoded as vectors in a document space somewhat like in earlier information retrieval methods, but in WEBSOM the document space is formed in an unsupervised manner using the Self-Organizing Map algorithm. In this article the document representations the WEBSOM creates are shown to be computationally efficient approximations of the results of a certain probabilistic model. The probabilistic model incorporates information about the similarity of use of different words to take into account their semantic relations.

论文关键词:data mining, feature extraction, information retrieval, Self-Organizing Map (SOM), text analysis

论文评审过程:

论文官网地址:https://doi.org/10.1023/A:1009618125967