Using crowdsourcing for TREC relevance assessment

Authors:

Highlights:

Abstract

Crowdsourcing has recently gained a lot of attention as a tool for conducting different kinds of relevance evaluations. At a very high level, crowdsourcing describes the outsourcing of tasks to a large group of people instead of assigning such tasks to an in-house employee. This approach makes it possible to conduct information retrieval experiments extremely fast, with good results, at a low cost. This paper reports on the first attempts to combine crowdsourcing and TREC: our aim is to validate the use of crowdsourcing for relevance assessment. To this aim, we use the Amazon Mechanical Turk crowdsourcing platform to run experiments on TREC data, evaluate the outcomes, and discuss the results. We emphasize experiment design, execution, and quality control to gather useful results, with particular attention to the issue of agreement among assessors. Our position, supported by the experimental results, is that crowdsourcing is a cheap, quick, and reliable alternative for relevance assessment.
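The abstract highlights agreement among assessors as a central quality-control concern. As a minimal sketch of how such agreement might be quantified, the snippet below computes Cohen's kappa between a TREC assessor's judgments and the majority vote of crowd workers; the labels and variable names are hypothetical examples, not data from the paper.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two sets of binary relevance labels."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed proportion of documents on which the two assessors agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if the two assessors labeled independently.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(labels_a) | set(labels_b)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical labels: official TREC assessor vs. crowd-worker majority vote.
trec_labels = [1, 0, 1, 1, 0, 0, 1, 0]
crowd_labels = [1, 0, 1, 0, 0, 0, 1, 1]
print(f"Cohen's kappa: {cohens_kappa(trec_labels, crowd_labels):.2f}")
```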

Keywords: IR evaluation, Test collections, Relevance assessment, Crowdsourcing, TREC, Amazon Mechanical Turk, Experimental design

Article history: Received 12 January 2011, Revised 12 January 2012, Accepted 14 January 2012, Available online 9 February 2012.

DOI: https://doi.org/10.1016/j.ipm.2012.01.004