Text Retrieval Using Self-Organized Document Maps

作者:Krista Lagus

摘要

A map of text documents arranged using the Self-Organizing Map (SOM) algorithm (1) is organized in a meaningful manner so that items with similar content appear at nearby locations of the 2-dimensional map display, and (2) clusters the data, resulting in an approximate model of the data distribution in the high-dimensional document space. This article describes how a document map that is automatically organized for browsing and visualization can be successfully utilized also in speeding up document retrieval. Furthermore, experiments on the well-known CISI collection [3] show significantly improved performance compared to Salton's vector space model, measured by average precision (AP) when retrieving a small, fixed number of best documents. Regarding comparison with Latent Semantic Indexing the results are inconclusive.

论文关键词:document maps, information retrieval, LSI, SOM, text mining

论文评审过程:

论文官网地址:https://doi.org/10.1023/A:1013853012954