On test collections for adaptive information retrieval

Abstract:

Traditional Cranfield test collections represent an abstraction of a retrieval task that Sparck Jones calls the “core competency” of retrieval: a task that is necessary, but not sufficient, for user retrieval tasks. The abstraction facilitates research by controlling for (some) sources of variability, thus increasing the power of experiments that compare system effectiveness while reducing their cost. However, even within the highly abstracted case of the Cranfield paradigm, meta-analysis demonstrates that the user/topic effect is greater than the system effect, so experiments must include a relatively large number of topics to distinguish systems’ effectiveness. The evidence further suggests that changing the abstraction even slightly, to include just a bit more characterization of the user, will result in a dramatic loss of power or increase in cost of retrieval experiments. Defining a new, feasible abstraction for supporting adaptive IR research will therefore require winnowing the list of all possible factors that can affect retrieval behavior down to a minimum number of essential factors.

Keywords: Cranfield, Retrieval evaluation, Test collections

Article history: Received 28 September 2007, Accepted 20 December 2007, Available online 17 March 2008.

DOI: https://doi.org/10.1016/j.ipm.2007.12.011