Vocabulary mining for information retrieval: rough sets and fuzzy sets

Authors:

Highlights:

Abstract

Vocabulary mining in information retrieval refers to the use of domain vocabulary to improve the user’s query. Queries posed to information retrieval systems are often not optimal for retrieval purposes. Vocabulary mining allows one to generalize, specialize, or perform other kinds of vocabulary-based transformations on the query in order to improve retrieval performance. This paper investigates a new framework for vocabulary mining that derives from the combination of rough sets and fuzzy sets. The framework allows one to use rough set-based approximations even when the documents and queries are described using weighted, i.e., fuzzy, representations. The paper also explores the application of generalized rough sets and variable precision models. The problem of coordination between multiple vocabulary views is also examined. Finally, the paper presents a preliminary analysis of issues that arise when applying the proposed vocabulary mining framework to the Unified Medical Language System (UMLS), a state-of-the-art vocabulary system. The proposed framework supports the systematic study and application of different vocabulary views in information retrieval.
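
As a rough illustration of how rough set approximations can be applied to a weighted (fuzzy) query, the sketch below uses one standard formulation of rough fuzzy sets: within each equivalence class of terms, the lower approximation takes the minimum membership and the upper approximation takes the maximum. This is an assumption made for illustration and not necessarily the exact formulation used in the paper. The function name `rough_fuzzy_approximations`, the toy synonym classes, and the query weights are all hypothetical.

```python
from typing import Dict, List, Tuple


def rough_fuzzy_approximations(
    weights: Dict[str, float],
    classes: List[List[str]],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Approximate a fuzzy set of terms using a partition of the vocabulary.

    weights: fuzzy membership of each term in the query (missing terms count as 0.0).
    classes: equivalence classes of terms, e.g. synonym groups (assumed disjoint).
    Returns (lower, upper), two fuzzy sets over all terms in the partition.
    """
    lower: Dict[str, float] = {}
    upper: Dict[str, float] = {}
    for cls in classes:
        memberships = [weights.get(term, 0.0) for term in cls]
        lo, hi = min(memberships), max(memberships)
        for term in cls:
            lower[term] = lo  # every term in the class must support the query
            upper[term] = hi  # at least one term in the class supports the query
    return lower, upper


# Hypothetical toy vocabulary view: each inner list is one synonym class.
classes = [["tumor", "neoplasm"], ["heart", "cardiac"], ["aspirin"]]
# Hypothetical weighted query.
query = {"tumor": 0.8, "neoplasm": 0.3, "heart": 0.6}

lower, upper = rough_fuzzy_approximations(query, classes)
```

Under these assumptions, the upper approximation generalizes the query (the term "cardiac" receives weight through its class-mate "heart"), while the lower approximation specializes it by keeping only the weight that every term in a class shares.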

Keywords: Vocabulary mining, Generalized rough sets, Fuzzy sets, Multiple vocabulary views, UMLS

Review history: Received 20 April 1999, Accepted 7 February 2000, Available online 6 December 2000.

Official paper URL: https://doi.org/10.1016/S0306-4573(00)00014-5