Linguistic kernels for answer re-ranking in question answering systems

Abstract

Answer selection is the most complex phase of a question answering (QA) system. To solve this task, typical approaches use unsupervised methods such as computing the similarity between query and answer, optionally exploiting advanced syntactic, semantic or logic representations. In this paper, we study supervised discriminative models that learn to select (rank) answers using examples of question and answer pairs. The pair representation is implicitly provided by kernel combinations applied to each of its members. To reduce the burden of large amounts of manual annotation, we represent question and answer pairs by means of powerful generalization methods, exploiting the application of structural kernels to syntactic/semantic structures. We experiment with support vector machines using string kernels and syntactic and shallow semantic tree kernels, applied to part-of-speech tag sequences, syntactic parse trees and predicate argument structures, on two datasets that we have compiled and made available. Our results on classification of correct and incorrect pairs show that our best model improves on the bag-of-words model by 63% on a TREC dataset. Moreover, such a binary classifier, used as a re-ranker, improves the mean reciprocal rank of our baseline QA system by 13%. These findings demonstrate that our method automatically selects an appropriate representation of question–answer relations.
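
As an illustration of two ideas mentioned in the abstract, the sketch below shows (1) a kernel over question and answer pairs built by summing a kernel applied to each member, and (2) the mean reciprocal rank (MRR) used to score a re-ranker. It is a minimal sketch under stated assumptions, not the paper's implementation: the contiguous n-gram kernel stands in for the string and tree kernels studied in the paper, the sum is only one possible kernel combination, and the names `ngram_kernel`, `pair_kernel` and `mean_reciprocal_rank` are hypothetical.

```python
from collections import Counter


def ngram_kernel(seq_a, seq_b, n=2):
    """Count matching contiguous n-grams between two tag sequences.

    Assumption: a simple spectrum-style kernel, used here as a stand-in
    for the string kernels over part-of-speech sequences in the abstract.
    """
    grams_a = Counter(tuple(seq_a[i:i + n]) for i in range(len(seq_a) - n + 1))
    grams_b = Counter(tuple(seq_b[i:i + n]) for i in range(len(seq_b) - n + 1))
    # Dot product of the two n-gram count vectors.
    return sum(count * grams_b[gram] for gram, count in grams_a.items())


def pair_kernel(pair_1, pair_2, n=2):
    """Kernel between two (question, answer) pairs.

    Assumption: the combination is a plain sum of the kernel on the two
    questions and the kernel on the two answers.
    """
    (q1, a1), (q2, a2) = pair_1, pair_2
    return ngram_kernel(q1, q2, n) + ngram_kernel(a1, a2, n)


def mean_reciprocal_rank(ranked_labels):
    """MRR over questions.

    ranked_labels holds one list per question; each list contains booleans
    (True = correct answer) in the order produced by the ranker. A question
    with no correct answer contributes 0.
    """
    total = 0.0
    for labels in ranked_labels:
        rank = next((i + 1 for i, ok in enumerate(labels) if ok), None)
        total += 1.0 / rank if rank else 0.0
    return total / len(ranked_labels)


# Toy part-of-speech sequences (illustrative only, not from the datasets).
pair_x = (["WP", "VBZ", "DT", "NN"], ["DT", "NN", "VBZ"])
pair_y = (["WP", "VBZ", "NN"], ["DT", "NN", "VBD"])
similarity = pair_kernel(pair_x, pair_y)

# Three questions: correct answer at rank 2, at rank 1, and not retrieved.
mrr = mean_reciprocal_rank([[False, True], [True, False], [False, False]])
```

A Gram matrix built from such a pair kernel could, in principle, be passed to any SVM implementation that accepts precomputed kernels; the classifier's scores would then be used to reorder the candidate answers before computing MRR.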

Keywords: Question answering, Information Retrieval, Kernel methods, Predicate argument structures

Article history: Received 29 March 2009, Revised 2 June 2010, Accepted 7 June 2010, Available online 20 July 2010.

Official paper URL: https://doi.org/10.1016/j.ipm.2010.06.002