An information measure of retrieval performance

Authors:

Abstract

The classical measures of information retrieval quality, namely precision and recall, each have disadvantages depending on the circumstances. In judging the output of an automatic retrieval procedure it would be useful to have a single number which is practical to compute and summarizes the quality of the output. Such a measure could facilitate the study of retrieval methodology and clarify efforts to optimize retrieval. We show how such a measure may be defined for retrieval methods with ranked output. It is based on the change in the Shannon entropy of the distribution of relevant documents in the database that is produced by the act of retrieval. Properties of this new measure are derived and reveal that it conforms to intuitive expectations.
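To make the idea concrete, here is a minimal sketch that contrasts precision and recall with an entropy-based score. It is not the measure defined in the paper, whose exact formula the abstract does not give. It assumes a hypothetical model in which a randomly chosen relevant document is equally likely to sit anywhere in the database before retrieval, and in which retrieval only reveals how the relevant documents split between the top `cutoff` ranks and the rest. The function names and the cutoff-based model are illustrative assumptions.

```python
import math


def precision_recall(ranked_relevance, cutoff):
    """Precision and recall at a rank cutoff.

    ranked_relevance is a list of 0/1 flags in rank order (1 = relevant).
    """
    hits = sum(ranked_relevance[:cutoff])
    total_relevant = sum(ranked_relevance)
    precision = hits / cutoff if cutoff else 0.0
    recall = hits / total_relevant if total_relevant else 0.0
    return precision, recall


def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def entropy_reduction(ranked_relevance, cutoff):
    """Illustrative score (an assumption, not the paper's definition).

    Prior: a relevant document is uniformly located among all n documents.
    Posterior: it lies in the top `cutoff` ranks with probability equal to
    recall (uniform within that block), otherwise uniformly in the remainder.
    Returns prior entropy minus posterior entropy, in bits.
    """
    n = len(ranked_relevance)
    if n == 0 or sum(ranked_relevance) == 0 or not 0 < cutoff < n:
        # No split of the database, so the posterior equals the prior.
        return 0.0
    _, recall = precision_recall(ranked_relevance, cutoff)
    probs = [recall / cutoff] * cutoff
    probs += [(1 - recall) / (n - cutoff)] * (n - cutoff)
    return math.log2(n) - shannon_entropy(probs)


good_ranking = [1, 1, 0, 1, 0, 0, 0, 0]    # all relevant documents near the top
chance_ranking = [1, 0, 0, 0, 1, 0, 0, 0]  # relevant documents spread evenly

print(precision_recall(good_ranking, 4))
print(entropy_reduction(good_ranking, 4))
print(entropy_reduction(chance_ranking, 4))
```

Under this toy model the score is a single number that is zero when the ranking does no better than chance and grows as relevant documents concentrate in the retrieved block, which is the kind of behavior the abstract describes as conforming to intuitive expectations.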

Keywords: Information retrieval, information theory, performance measure, recall, precision, relevance

Review history: Received 8 May 1991, Available online 17 June 2003.

Official paper URL: https://doi.org/10.1016/0306-4379(92)90019-J