Ordered similarity measures taking into account the rank of documents

Abstract

Similarity indices are used to quantify how much two sets of documents differ. They are usually based on the number of elements the sets have in common; that is, they are computed from the intersections or unions of the compared sets. However, many studies show that the order in which documents are presented is an important factor, particularly in system evaluation, and the usual measures do not take it into account. In this article, we propose a general method for constructing similarity measures that take the presentation rank of each document into account. We call them Ordered Similarity measures, or OS measures. We then present an evaluation experiment designed to quantify the filtering impact of a system. The protocol is based on large-scale querying of the system and on a comparison of the resulting answer sets. We report side by side the comparison results obtained with a classical measure and with an OS measure. Finally, we show how to construct OS measures derived from recall and precision.
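To make the contrast between a set-based index and a rank-aware one concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not give the article's OS construction, so the `rank_weighted_jaccard` function, its name, and the 1/rank weighting are assumptions chosen for this example (a weighted Jaccard index with rank-derived weights), not the authors' formula. Both functions assume each answer list contains no duplicate documents.

```python
def jaccard(a, b):
    """Classical set-based index |A ∩ B| / |A ∪ B|; it ignores ordering."""
    sa, sb = set(a), set(b)
    union = sa | sb
    # Two empty answer sets are treated as identical.
    return len(sa & sb) / len(union) if union else 1.0


def rank_weighted_jaccard(a, b):
    """Hypothetical rank-aware variant (an assumption, not the paper's method).

    Each document is weighted 1/rank in each list (0 if absent). The score is
    the sum of per-document minimum weights over the sum of maximum weights.
    """
    wa = {doc: 1.0 / (i + 1) for i, doc in enumerate(a)}
    wb = {doc: 1.0 / (i + 1) for i, doc in enumerate(b)}
    docs = set(wa) | set(wb)
    if not docs:
        return 1.0
    num = sum(min(wa.get(d, 0.0), wb.get(d, 0.0)) for d in docs)
    den = sum(max(wa.get(d, 0.0), wb.get(d, 0.0)) for d in docs)
    return num / den


# Same three documents, presented in opposite orders.
forward = ["d1", "d2", "d3"]
backward = ["d3", "d2", "d1"]
print(jaccard(forward, backward))
print(rank_weighted_jaccard(forward, backward))
```

With these two lists the classical index reports a perfect match because the sets are equal, while the rank-weighted variant scores below 1 because the documents appear at different ranks.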

Keywords: Metrics, Similarity measure, Rank, Evaluation, Information retrieval

Review history: Received 21 March 2000, Accepted 26 June 2000, Available online 19 March 2001.

Official article URL: https://doi.org/10.1016/S0306-4573(00)00040-6