Nonlinear ranking function representations in genetic programming-based ranking discovery for personalized search

作者:

Highlights:

摘要

Ranking function is instrumental in affecting the performance of a search engine. Designing and optimizing a search engine's ranking function remains a daunting task for computer and information scientists. Recently, genetic programming (GP), a machine learning technique based on evolutionary theory, has shown promise in tackling this very difficult problem. Ranking functions discovered by GP have been found to be significantly better than many of the other existing ranking functions. However, current GP implementations for ranking function discovery are all designed utilizing the Vector Space model in which the same term weighting strategy is applied to all terms in a document. This may not be an ideal representation scheme at the individual query level considering the fact that many query terms should play different roles in the final ranking. In this paper, we propose a novel nonlinear ranking function representation scheme and compare this new design to the well-known Vector Space model. We theoretically show that the new representation scheme subsumes the traditional Vector Space model representation scheme as a special case and hence allows for additional flexibility in term weighting. We test the new representation scheme with the GP-based discovery framework in a personalized search (information routing) context using a TREC web corpus. The experimental results show that the new ranking function representation design outperforms the traditional Vector Space model for GP-based ranking function discovery.

论文关键词:Information routing,Information retrieval,Genetic programming,Ranking function

论文评审过程:Received 14 October 2004, Revised 24 October 2005, Accepted 3 November 2005, Available online 7 December 2005.

论文官网地址:https://doi.org/10.1016/j.dss.2005.11.002