Models for retrieval with probabilistic indexing

Authors:

Highlights:

Abstract

In this article, three retrieval models for probabilistic indexing are described, along with evaluation results for each. The first is the binary independence indexing (BII) model, a generalized version of the Maron and Kuhns indexing model. In this model, the indexing weight of a descriptor in a document is an estimate of the probability that the document is relevant to queries using this descriptor. The second is the retrieval-with-probabilistic-indexing (RPI) model, which is suited to different kinds of probabilistic indexing. For this model, we assume that each indexing scheme has its own concept of “correctness” to which the probabilities relate. In addition to the probabilistic indexing weights, the RPI model allows relevance weighting of search terms. A third, similar model was proposed by Croft some years ago as an extension of the binary independence retrieval model, but it can be shown that this model is not based on the probabilistic ranking principle. The probabilistic indexing weights required by any of these models can be provided by applying the Darmstadt indexing approach (DIA) for indexing with descriptors from a controlled vocabulary. The experimental results show significant improvements over retrieval with binary indexing. Finally, suggestions are made regarding how the DIA can be applied to probabilistic indexing with free text terms.

Keywords:

Review history: Available online 19 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(89)90091-5