Information clustering based on fuzzy multisets

Authors:

Highlights:

Abstract

A fuzzy multiset model for information clustering is proposed, with application to information retrieval on the World Wide Web (WWW). Because a search engine retrieves multiple occurrences of the same subject, possibly with different degrees of relevance, fuzzy multisets provide an appropriate model of information retrieval on the WWW. Information clustering, which here means both term clustering and document clustering, is considered. Three methods are proposed: hard c-means, fuzzy c-means, and an agglomerative method using cluster centers. Two distances between fuzzy multisets are defined, together with algorithms for calculating cluster centers. Theoretical properties of the clustering algorithms are studied, and illustrative examples are given to show how the algorithms work.
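
To make the idea of a fuzzy multiset concrete, the following is a minimal Python sketch, not the paper's own formulation. It assumes a common convention in which each item carries a list of membership degrees sorted in decreasing order, and it uses an L1-style distance over zero-padded degree sequences as one plausible choice; the abstract does not say which two distances the paper defines. The names `fuzzy_multiset` and `l1_distance`, and the sample terms and degrees, are hypothetical.

```python
from collections import defaultdict


def fuzzy_multiset(hits):
    """Build a fuzzy multiset from (item, membership degree) pairs.

    Assumed convention: each item maps to the list of its membership
    degrees, sorted in decreasing order.
    """
    grouped = defaultdict(list)
    for item, degree in hits:
        if not 0.0 <= degree <= 1.0:
            raise ValueError("membership degree must lie in [0, 1]")
        grouped[item].append(degree)
    return {item: sorted(degrees, reverse=True) for item, degrees in grouped.items()}


def l1_distance(a, b):
    """Illustrative L1-style distance (an assumption, not taken from the paper).

    For every item, the shorter degree sequence is padded with zeros and the
    absolute differences are summed position by position.
    """
    total = 0.0
    for item in set(a) | set(b):
        da, db = a.get(item, []), b.get(item, [])
        n = max(len(da), len(db))
        da = da + [0.0] * (n - len(da))
        db = db + [0.0] * (n - len(db))
        total += sum(abs(x - y) for x, y in zip(da, db))
    return total


# Hypothetical search results: the same term may occur several times,
# each occurrence with its own degree of relevance.
doc1 = fuzzy_multiset([("fuzzy", 0.9), ("fuzzy", 0.4), ("cluster", 0.7)])
doc2 = fuzzy_multiset([("fuzzy", 0.8), ("web", 0.5)])
distance = l1_distance(doc1, doc2)
```

A distance of this kind is what a c-means style procedure would need in order to assign each document or term to its nearest cluster center; how the paper actually computes the centers is not stated in the abstract.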

Keywords: Information retrieval, Data clustering, Fuzzy multiset, Cluster center, Algorithm

Publication history: Available online 10 December 2002.

Official paper URL: https://doi.org/10.1016/S0306-4573(02)00047-X