Passage extraction and result combination for genomics information retrieval

作者:Qinmin Hu, Jimmy Xiangji Huang

摘要

In this paper, we first propose algorithms for passage extraction to build indices for the purpose of generating more accurate passages as query answers. Second, we propose a basic result combination method and an improved result combination method to combine the retrieved results from different indices for the purpose of selecting and merging relevant passages as outputs. For passage extraction, three new algorithms are proposed, namely paragraphParsed, sentenceParsed and wordSentenceParsed. For result combination, a novel method is proposed, in which we use factor analysis to generate a better baseline result for combination by finding some hidden common factors that can be used to estimate the importance of keywords and keyword associations. Finally, we report the experimental results that confirm the effectiveness and superiority of the factor analysis based method for result combination. Our proposed approaches achieve excellent results on the TREC 2006 and 2007 Genomics data sets, which provide a promising avenue for constructing high performance information retrieval systems in biomedicine.

论文关键词:Information retrieval, Passage extraction, Result combination, Linear regression, Factor analysis, Genomics

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10844-009-0097-4