Strength Pareto fitness assignment for pseudo-relevance feedback: application to MEDLINE

Authors: Ilyes Khennak, Habiba Drias

Abstract

Because users increasingly describe their information needs with vague and imprecise keywords, it has become necessary to expand their original search queries with additional terms that better capture their actual intent. Whether a term is suitable for expansion generally depends on how closely it is related to the query keywords. In this paper, we propose two criteria for evaluating the degree of relatedness between a candidate expansion term and the query keywords: (1) co-occurrence frequency, where more importance is given to terms that occur in as many as possible of the documents in which the query keywords appear; (2) proximity, where more importance is given to terms that appear at a short distance from the query terms within documents. We also employ the strength Pareto fitness assignment in order to satisfy both criteria simultaneously. The results of our numerical experiments on MEDLINE, the online medical information database, show that the proposed approach significantly enhances retrieval performance compared with the baseline.
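To make the idea of satisfying both criteria at once more concrete, the following is a minimal sketch of a strength-based Pareto fitness assignment over candidate expansion terms. It assumes the SPEA2-style definition (strength = number of candidates a term dominates; raw fitness = sum of the strengths of the terms that dominate it, lower being better); the paper's exact formulation may differ. The function names, the example terms, and the score values are hypothetical, and both scores are assumed to be precomputed and normalized so that higher means more related.

```python
from typing import Dict, List, Tuple

# term -> (co-occurrence score, proximity score); higher is better on both.
Scores = Dict[str, Tuple[float, float]]


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if a is no worse than b on every criterion and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def strength_pareto_fitness(scores: Scores) -> Dict[str, int]:
    """Raw fitness per term (SPEA2-style assumption): 0 means non-dominated."""
    terms = list(scores)
    # Strength: how many other candidates this term dominates.
    strength = {
        t: sum(dominates(scores[t], scores[u]) for u in terms if u != t)
        for t in terms
    }
    # Raw fitness: sum of the strengths of all candidates dominating this term.
    return {
        t: sum(strength[u] for u in terms if u != t and dominates(scores[u], scores[t]))
        for t in terms
    }


def rank_terms(scores: Scores, k: int) -> List[str]:
    """Pick the k terms with the lowest raw fitness (ties broken alphabetically)."""
    fitness = strength_pareto_fitness(scores)
    return sorted(scores, key=lambda t: (fitness[t], t))[:k]


# Hypothetical candidate terms with made-up scores, for illustration only.
candidates: Scores = {
    "insulin": (0.9, 0.8),
    "glucose": (0.7, 0.9),
    "therapy": (0.6, 0.5),
    "patient": (0.8, 0.4),
    "study": (0.3, 0.2),
}

print(strength_pareto_fitness(candidates))
print(rank_terms(candidates, 3))
```

In this toy data, "insulin" and "glucose" are mutually non-dominated (each is better on one criterion), so both receive the best fitness; a weighted sum of the two scores would instead force an arbitrary trade-off between co-occurrence and proximity.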

Keywords: information retrieval, query expansion, pseudo-relevance feedback, proximity, multi-objective optimization, Pareto dominance, MEDLINE

Paper URL: https://doi.org/10.1007/s11704-016-5560-0