BM25t: a BM25 extension for focused information retrieval

Authors: Mathias Géry, Christine Largeron

Abstract

This paper addresses the integration of XML tags into a term-weighting function for focused XML information retrieval (IR). Our model allows us to consider a certain kind of structural information: tags that represent a logical structure (e.g., title, section, paragraph, etc.) as well as other tags (e.g., bold, italic, center, etc.). We take into account the influence of a tag by estimating the probability for this tag to distinguish relevant terms from the others. Then, these weights are integrated in a term-weighting function. Experiments on a large collection from the INEX 2008 XML IR evaluation campaign showed improvements on focused XML retrieval.
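
The abstract does not give the BM25t formula, so the following is only a minimal sketch of the general idea: a standard BM25 score in which each term occurrence is counted with a weight derived from its enclosing tags. The function name `bm25_tag_weighted`, the `tag_weight` table, and the choice to average the weights of enclosing tags are all assumptions made for illustration and are not taken from the paper. The idf uses the common non-negative variant `log((N - df + 0.5) / (df + 0.5) + 1)`.

```python
import math
from collections import Counter


def bm25_tag_weighted(query_terms, doc_tokens, doc_freq, n_docs, avg_len,
                      tag_weight, k1=1.2, b=0.75):
    """Score one document with BM25 over tag-weighted term frequencies.

    doc_tokens: list of (term, tags) pairs, where tags is a tuple of the
                tag names enclosing that occurrence (hypothetical layout).
    tag_weight: dict mapping a tag name to a weight; unknown tags count as 1.0.
    """
    # Assumption: an occurrence contributes the mean weight of its enclosing
    # tags, or 1.0 when it has no tags. The paper may combine weights differently.
    weighted_tf = Counter()
    for term, tags in doc_tokens:
        weights = [tag_weight.get(tag, 1.0) for tag in tags]
        weighted_tf[term] += sum(weights) / len(weights) if weights else 1.0

    doc_len = len(doc_tokens)
    length_norm = k1 * (1.0 - b + b * doc_len / avg_len)

    score = 0.0
    for term in query_terms:
        tf = weighted_tf.get(term, 0.0)
        if tf == 0.0:
            continue
        df = doc_freq.get(term, 0)
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        score += idf * tf * (k1 + 1.0) / (tf + length_norm)
    return score


# Toy example: the same term scores higher inside a tag with weight > 1.
doc_in_title = [("xml", ("title",)), ("retrieval", ()), ("model", ())]
doc_in_body = [("xml", ()), ("retrieval", ()), ("model", ())]
doc_freq = {"xml": 2}
weights = {"title": 2.0}  # illustrative value, not an estimated probability

score_title = bm25_tag_weighted(["xml"], doc_in_title, doc_freq, 10, 3.0, weights)
score_body = bm25_tag_weighted(["xml"], doc_in_body, doc_freq, 10, 3.0, weights)
```

In this sketch the tag weights are fixed by hand; in the approach the abstract describes, they would instead be estimated from how well each tag distinguishes relevant terms from the others.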

Keywords: Probabilistic information retrieval model, Structured information retrieval, XML, Tags, Weighting scheme, BM25

Paper URL: https://doi.org/10.1007/s10115-011-0426-0