Employing web mining and data fusion to improve weak ad hoc retrieval

Authors:


Abstract

When a user issues a reasonable query to a retrieval system and obtains no relevant documents, he or she is bound to feel frustrated. We call such queries, and the retrievals they produce, weak. Improving their effectiveness is an important issue for ad hoc retrieval and would be most rewarding for these users. We explain why data fusion of sufficiently dissimilar retrieval lists can improve weak query results, and we confirm this with experiments using short and medium-length queries. To obtain sufficiently dissimilar retrieval lists, we propose composing alternate queries through web search and mining, employing them for target retrieval, and combining the results with the retrieval list of the original query. We investigate methods of forming web probes from longer queries, including salient term selection and query text window rotation. Compared with normal ad hoc retrieval, web assistance and data fusion can more than double the effectiveness of the original weak queries. Queries that are not weak can also improve along with the weak ones.
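The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which fusion formula the paper uses, so the code below assumes a weighted CombSUM over min-max normalized scores, a common choice in the data fusion literature. The function names `min_max_normalize` and `fuse`, the document identifiers, and the scores are all hypothetical and serve only as illustration.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; treat every document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(run_lists, weights=None):
    """Weighted CombSUM (an assumed fusion rule, not necessarily the paper's).

    Each run is a {doc_id: score} mapping. Scores are normalized per run,
    multiplied by the run's weight, and summed per document.
    """
    if weights is None:
        weights = [1.0] * len(run_lists)
    fused = defaultdict(float)
    for run, weight in zip(run_lists, weights):
        for doc, score in min_max_normalize(run).items():
            fused[doc] += weight * score
    # Sort by descending fused score; break ties by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical retrieval lists: one from the original query, one from an
# alternate query composed via web mining. They overlap only on "d2".
original = {"d1": 2.0, "d2": 1.0, "d3": 0.5}
alternate = {"d4": 3.0, "d2": 2.5, "d5": 1.0}

ranking = fuse([original, alternate])
for doc, score in ranking:
    print(doc, round(score, 3))
```

In this toy example the two lists are largely dissimilar, and the one document they share, `d2`, rises to the top of the fused ranking because it receives evidence from both retrievals.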

Keywords: Weak query, Robust retrieval, Salient term selection, Web mining, Alternate queries, Data fusion

Article history: Received 27 May 2006, Accepted 25 July 2006, Available online 12 October 2006.

Official paper URL: https://doi.org/10.1016/j.ipm.2006.07.008