A study about the future evaluation of Question-Answering systems

Authors:

Highlights:

Abstract

Evaluation campaigns for Question Answering (QA) systems have contributed to the development of these technologies. Such campaigns have promoted changes aimed at improving results. However, systems have now reached an upper bound and remain far from answering complex questions. In this paper, we survey the main evaluations of QA over free text, paying special attention to the changes encouraged by these campaigns. We observe that systems still return a high proportion of incorrect answers and that the proposed changes are rarely incorporated into traditional approaches. Moreover, we analyze QA collections to obtain better insight into the main challenges facing current QA systems. We find that QA systems have great difficulty dealing with different rewordings in questions and documents, as well as inferring information that is not explicitly stated in texts. Based on these observations, we recommend a set of directions for future evaluations, suggesting the application of textual inference and knowledge bases as a way to improve results.

Keywords: Question Answering, Evaluation campaigns, Validation, Textual inference

Article history: Received 22 December 2016, Revised 29 July 2017, Accepted 8 September 2017, Available online 9 September 2017, Version of Record 18 October 2017.

DOI: https://doi.org/10.1016/j.knosys.2017.09.015