Generating suggestions for queries in the long tail with an inverted index

Authors:

Abstract

This paper proposes an efficient and effective method for choosing which queries to suggest to web search engine users so that they can satisfy their information needs quickly. By using a weak function to assess the similarity between the current query and the knowledge base built from historical user sessions, we reduce the suggestion generation phase to the processing of a full-text query over an inverted index. The resulting query recommendation technique is very efficient and scalable, and it is less affected by the data-sparsity problem than most state-of-the-art proposals. It is therefore particularly effective at generating suggestions for rare queries in the long tail of the query popularity distribution. We assess the quality of the generated suggestions in two ways: by measuring how well they forecast the user behavior recorded in historical query logs, and through a reproducible user study conducted on publicly available, human-assessed data. The experimental evaluation shows that our proposal remarkably outperforms two other state-of-the-art solutions, and that it can generate useful suggestions even for rare and never-seen queries.
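The idea of treating suggestion generation as a full-text query over an inverted index can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes, purely for illustration, that the last query of each historical session is a candidate suggestion described by the terms of the earlier queries in that session, and that the weak similarity function is a simple count of shared terms. The names `SESSIONS`, `build_index`, and `suggest` are hypothetical.

```python
from collections import Counter, defaultdict

# Toy historical sessions (invented data): each is a list of queries in order.
SESSIONS = [
    ["cheap flights rome", "rome flights low cost", "ryanair rome"],
    ["rome hotels", "hotels near colosseum"],
    ["python inverted index", "lucene tutorial"],
]


def build_index(sessions):
    """Map each term to the candidate suggestions it describes.

    Assumption: the final query of a session is the candidate suggestion,
    and the earlier queries of that session supply its descriptive terms.
    """
    index = defaultdict(set)
    for session in sessions:
        if not session:
            continue
        *earlier, final = session
        for query in earlier:
            for term in query.lower().split():
                index[term].add(final)
    return index


def suggest(index, query, k=3):
    """Rank candidates by the number of query terms they share.

    Term overlap stands in for the weak similarity function (an assumption).
    Ties are broken alphabetically so the output is deterministic.
    """
    scores = Counter()
    for term in set(query.lower().split()):
        for candidate in index.get(term, ()):
            scores[candidate] += 1
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [candidate for candidate, _ in ranked[:k]]
```

Because matching happens at the term level, a query that never appears in the log, such as `suggest(build_index(SESSIONS), "flights to rome")`, can still retrieve candidates as long as some of its terms were seen. This is one way to read the abstract's claim about robustness to data sparsity.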

Keywords: Query recommender systems, Efficiency in query suggestion, Data sparsity problem, Effectiveness evaluation metrics

Review history: Received 23 February 2011, Revised 7 July 2011, Accepted 13 July 2011, Available online 9 August 2011.

Paper URL: https://doi.org/10.1016/j.ipm.2011.07.005