Exploration of textual document archives using a fuzzy hierarchical clustering algorithm in the GAMBAL system

Authors:

Abstract

The Internet, together with the large amount of textual information available in document archives, has increased the relevance of information retrieval tools. In this work we present an extension of the Gambal system for clustering and visualizing documents based on fuzzy clustering techniques. The tool allows the set of documents to be structured hierarchically (using a fuzzy hierarchical structure) and represents this structure in a graphical interface (a 3D sphere) that the user can navigate. Gambal supports the analysis of documents and the computation of their similarity not only on the basis of the syntactic similarity between words, but also using a dictionary (WordNet 1.7) and latent semantic analysis.
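As an illustration of what fuzzy clustering means for documents, the following minimal sketch computes the degree to which one document belongs to each of several clusters. It is not the Gambal implementation: the abstract does not specify the algorithm's details, so this uses the standard fuzzy c-means membership formula with a cosine-based distance as an assumption. The function names, the toy term vectors, and the fuzzifier value `m=2.0` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuzzy_memberships(doc, centroids, m=2.0, eps=1e-12):
    """Membership of `doc` in each cluster (fuzzy c-means formula).

    Assumption: distance is 1 - cosine similarity; `m` > 1 is the
    fuzzifier (larger values give softer memberships).
    """
    dists = [max(1.0 - cosine_similarity(doc, c), 0.0) for c in centroids]

    # A document that coincides with one or more centroids belongs
    # fully (and equally) to those clusters.
    zeros = [i for i, d in enumerate(dists) if d <= eps]
    if zeros:
        share = 1.0 / len(zeros)
        return [share if i in zeros else 0.0 for i in range(len(centroids))]

    # u_i = d_i^(-2/(m-1)) / sum_j d_j^(-2/(m-1))
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** (-exponent) for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


# Toy example: three clusters, each centred on a single term.
centroids = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
u = fuzzy_memberships([1.0, 1.0, 0.0], centroids)
print(u)
```

Unlike a hard clustering, the document here receives a nonzero membership in every cluster, with equal and larger weights for the two clusters whose terms it contains. The memberships always sum to 1.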

Keywords: Information retrieval, Hierarchical clustering, Fuzzy clustering

Review history: Received 25 August 2003, Accepted 23 January 2004, Available online 5 March 2004.

Paper URL: https://doi.org/10.1016/j.ipm.2004.01.001