A noise-tolerant graphical model for ranking

Authors:

Highlights:

Abstract

This paper studies how to learn accurate ranking functions from noisy training data for information retrieval. Most previous work on learning to rank assumes that the relevance labels in the training data are reliable. In practice, however, the labels usually contain noise, because relevance judgments are difficult to make and for several other reasons. To tackle this problem, we propose a novel approach to learning to rank based on a probabilistic graphical model. Because the observed label may be noisy, we introduce a new variable that indicates the true label of each instance. We then use a graphical model to capture the joint distribution of the true labels and the observed labels given the features of the documents. The graphical model distinguishes the true labels from the observed labels and is specially designed for ranking in information retrieval, so it helps to learn a more accurate model from noisy training data. Experiments on a real dataset for web search show that the proposed approach can significantly outperform previous approaches.
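The idea of separating a latent true label from a noisy observed label can be illustrated with a minimal sketch. This is not the model from the paper: it assumes a simple two-level relevance scale, a hypothetical fixed noise matrix `noise[t][o]` giving the probability of observing label `o` when the true label is `t`, and hypothetical model scores. It ignores the ranking-specific structure that the abstract mentions, and all function names and numbers are invented for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution over true labels."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def observed_label_distribution(true_label_probs, noise):
    """Marginalize the latent true label: P(o | x) = sum_t P(t | x) * P(o | t)."""
    n_obs = len(noise[0])
    return [
        sum(p * noise[t][o] for t, p in enumerate(true_label_probs))
        for o in range(n_obs)
    ]


def true_label_posterior(true_label_probs, noise, observed):
    """Bayes' rule: P(t | x, o) is proportional to P(t | x) * P(o | t)."""
    joint = [p * noise[t][observed] for t, p in enumerate(true_label_probs)]
    z = sum(joint)
    return [j / z for j in joint]


# Hypothetical scores for the true labels "not relevant" (0) and "relevant" (1).
scores = [0.2, 1.5]
prior = softmax(scores)

# Hypothetical noise matrix: rows are true labels, columns are observed labels.
noise = [
    [0.9, 0.1],  # truly not relevant: judged correctly 90% of the time
    [0.2, 0.8],  # truly relevant: judged correctly 80% of the time
]

observed_dist = observed_label_distribution(prior, noise)
posterior = true_label_posterior(prior, noise, observed=0)
```

In this sketch, observing the label "not relevant" lowers the probability that the document is truly relevant without driving it to zero, which is the sense in which a latent true label lets a learner discount unreliable judgments.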

Keywords: Learning to rank, Noisy data, Graphical model

Article history: Received 10 May 2011, Revised 26 November 2011, Accepted 29 November 2011, Available online 27 December 2011.

Official paper URL: https://doi.org/10.1016/j.ipm.2011.11.003