Probabilistic methods for ranking output documents in conventional Boolean retrieval systems

Authors:

Abstract

Current operational information retrieval systems based on Boolean searching could be radically improved through the incorporation of a weighting mechanism for ranking output documents. However, a number of previous attempts to refine conventional Boolean retrieval systems along these lines have not been fully successful because of their inherent inconsistencies and ambiguities. A detailed account of the research aimed at obtaining a more rigorous methodology is given in this article.

The presented approach has been developed by extending well-known probabilistic output ranking methods that are applicable in retrieval systems in which document representations as well as search request formulations are simply sets of index terms. A series of experiments, carried out in recent years to verify some of these methods, have particularly demonstrated the value of a systematic statistical use of relevance feedback information. It is therefore expected that the application of the extended probabilistic document ranking methodology in conventional Boolean systems will also prove to be useful, and that considerable improvements in retrieval performance of these systems will be obtained.

A theoretical framework used to derive the proposed output ranking scheme is described in detail. A simple illustrative example is included, followed by a thorough discussion of the suggested approach.
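The abstract does not spell out the ranking scheme itself, so the following is only a rough sketch of the general idea it builds on: ordering the output of a Boolean search by a probabilistic term weight estimated from relevance feedback. It assumes the classical Robertson/Sparck Jones relevance weight with 0.5 smoothing, which is one of the well-known set-of-index-terms methods and not necessarily the extended scheme derived in the article. The function names, the toy collection, and the choice of query are all hypothetical.

```python
import math

def rsj_weight(r, n, R, N):
    """Robertson/Sparck Jones relevance weight with 0.5 smoothing.

    r: judged-relevant documents containing the term
    n: documents in the collection containing the term
    R: judged-relevant documents in total
    N: documents in the collection
    """
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))

def rank_boolean_output(docs, retrieved_ids, query_terms, relevant_ids):
    """Order the documents returned by a Boolean search by summed term weights.

    docs maps a document id to its set of index terms. This is an
    illustrative stand-in, not the scheme proposed in the article.
    """
    N, R = len(docs), len(relevant_ids)
    weights = {}
    for term in query_terms:
        n = sum(1 for terms in docs.values() if term in terms)
        r = sum(1 for d in relevant_ids if term in docs[d])
        weights[term] = rsj_weight(r, n, R, N)
    scores = {d: sum(w for t, w in weights.items() if t in docs[d])
              for d in retrieved_ids}
    # sorted() is stable, so tied documents keep their retrieval order.
    return sorted(retrieved_ids, key=lambda d: scores[d], reverse=True)

# Hypothetical toy collection: each document is a set of index terms.
docs = {
    "d1": {"boolean", "retrieval", "ranking"},
    "d2": {"boolean", "retrieval"},
    "d3": {"ranking", "probabilistic"},
    "d4": {"retrieval"},
    "d5": {"database"},
    "d6": {"boolean"},
}

# Boolean query "retrieval OR ranking": an unordered set of matches.
retrieved = [d for d, terms in docs.items()
             if "retrieval" in terms or "ranking" in terms]

# Relevance feedback: suppose the user has judged d1 relevant.
ranked = rank_boolean_output(docs, retrieved, ["retrieval", "ranking"], {"d1"})
print(ranked)
```

In this sketch the Boolean expression only decides which documents are retrieved, while the feedback-based weights decide their order. How the structure of the Boolean expression itself should enter the ranking is the question the article addresses, and it is not modelled here.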

Keywords:

Review history: Available online 13 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(88)90095-7