A universal method of information retrieval evaluation: the “missing” link M and the universal IR surface

Authors:

Highlights:

Abstract

The paper shows that the evaluation measures currently used in information retrieval (chiefly recall R and precision P, and in some cases fallout F) lack universal comparability, in the sense that their values depend on the generality of the IR problem. A solution is obtained by using all “parts” of the database, including the non-relevant documents and the not-retrieved documents. The solution consists of introducing the measure M, the fraction of the not-retrieved documents that are relevant (hence the “miss” measure). We prove that, independently of the IR problem or of the IR action, the quadruple (P,R,F,M) belongs to a universal IR surface, which is the same for all IR activities. This universality is then exploited to define a new evaluation measure for IR that allows unbiased comparisons of all IR results. We also show that using only one, two or even three measures from the set {P,R,F,M} necessarily leads to evaluation measures that are non-universal and hence not capable of comparing different IR situations.
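
The four measures can be illustrated with a minimal Python sketch. The abstract defines M but does not state the equation of the surface, so the relation checked below is an assumption: it is an identity that follows algebraically from the standard definitions of P, R and F together with the abstract's definition of M, and it may not be written in the same form in the paper. The function names and the example counts are hypothetical.

```python
from fractions import Fraction


def ir_measures(a, b, c, d):
    """Return (P, R, F, M) from the four cells of a retrieval result.

    a: retrieved and relevant          b: retrieved and non-relevant
    c: not retrieved and relevant      d: not retrieved and non-relevant
    All four counts are assumed to be positive integers.
    """
    P = Fraction(a, a + b)  # precision: relevant share of the retrieved set
    R = Fraction(a, a + c)  # recall: retrieved share of the relevant set
    F = Fraction(b, b + d)  # fallout: retrieved share of the non-relevant set
    M = Fraction(c, c + d)  # miss: relevant share of the not-retrieved set
    return P, R, F, M


def surface_product(P, R, F, M):
    """Product of odds ratios; equals 1 for any (P, R, F, M) built from counts.

    Assumed form, derived from the definitions above:
    P/(1-P) = a/b, (1-R)/R = c/a, F/(1-F) = b/d, (1-M)/M = d/c.
    """
    return (P / (1 - P)) * ((1 - R) / R) * (F / (1 - F)) * ((1 - M) / M)


if __name__ == "__main__":
    # Hypothetical database of 1000 documents, 40 of them relevant.
    P, R, F, M = ir_measures(a=30, b=20, c=10, d=940)
    print(P, R, F, M)
    print(surface_product(P, R, F, M))
```

Exact fractions are used instead of floats so that the identity can be checked without rounding error. Changing the counts changes each of the four values, but the product stays the same, which is the sense in which a relation among all four measures does not depend on the particular IR problem.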

Keywords: Universal IR surface, Miss measure, Precision, Recall, Fallout, Silence, Evaluation

Article history: Received 20 May 2002, Accepted 11 October 2002, Available online 20 January 2003.

Official paper URL: https://doi.org/10.1016/S0306-4573(02)00094-8