Identifying and improving retrieval for procedural questions

Authors:

Abstract

People use questions to elicit information from one another in everyday life, yet the most common way to obtain information from a search engine is to pose keywords. Research suggests that users are better at expressing their information needs in natural language; however, the vast majority of work on improving document retrieval has focused on queries posed as sets of keywords or as Boolean queries. This paper focuses on improving document retrieval for the subset of natural language questions that ask how something is done. We classify questions as asking either for a description of a process or for a statement of fact, with better than 90% accuracy. Further, we identify non-content features of documents that are relevant to questions asking about a process. Finally, we demonstrate that these features can be used to significantly improve the precision of document retrieval results for such questions. Our approach, based on exploiting the structure of documents, shows a significant improvement in precision at rank one for questions asking how something is done.
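
The pipeline summarized in the abstract (classify the question, then rerank retrieved documents using structural rather than content features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the cue patterns in `PROCEDURAL_CUES`, the list-step heuristic in `structure_score`, the additive `weight`, and all function names are hypothetical stand-ins for the classifier and document features the paper actually uses.

```python
import re

# Hypothetical surface cues for "how is something done" questions.
# The paper's real classifier and its features are not described in this abstract.
PROCEDURAL_CUES = re.compile(
    r"^\s*(how\s+(do|does|did|can|could|should|would|to)\b"
    r"|what\s+is\s+the\s+(best\s+)?way\s+to\b)",
    re.IGNORECASE,
)

# Hypothetical structural cue: a line that looks like a list step,
# e.g. "1.", "2)", "-", "*", or "Step 3:".
STEP_LINE = re.compile(r"^\s*(\d+[.)]|[-*•]|step\s+\d+[:.]?)\s+", re.IGNORECASE)


def is_procedural(question: str) -> bool:
    """Return True if the question appears to ask for a process."""
    return bool(PROCEDURAL_CUES.search(question))


def structure_score(document: str) -> float:
    """Fraction of non-empty lines that look like list steps (0.0 to 1.0)."""
    lines = [ln for ln in document.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    steps = sum(1 for ln in lines if STEP_LINE.match(ln))
    return steps / len(lines)


def rerank(question, ranked_docs, weight=1.0):
    """Rerank (text, retrieval_score) pairs for procedural questions only.

    Factual questions keep the original ranking. For procedural questions the
    structural score is added to the retrieval score (an assumed combination).
    """
    if not is_procedural(question):
        return list(ranked_docs)
    return sorted(
        ranked_docs,
        key=lambda doc: doc[1] + weight * structure_score(doc[0]),
        reverse=True,
    )


if __name__ == "__main__":
    prose = "Tires are made of rubber.\nThey were invented long ago."
    steps = "1. Loosen the nuts.\n2. Raise the car.\n3. Swap the wheel."
    results = [(prose, 0.9), (steps, 0.6)]
    for text, score in rerank("How do I change a tire?", results):
        print(score, text.splitlines()[0])
```

In this sketch the step-structured document moves to rank one for the procedural question, while a factual question such as "How many tires does a car have?" leaves the original order untouched.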

Keywords: Procedural questions, Question classification, Document structure, Document clustering, Document retrieval, Reranking

Review history: Received 2 February 2006, Revised 15 May 2006, Accepted 16 May 2006, Available online 11 July 2006.

Official paper URL: https://doi.org/10.1016/j.ipm.2006.05.009