Contextualization models for XML retrieval

Authors:

Abstract

In a hierarchical XML structure, the elements surrounding an XML element form its context. In document-oriented XML, this context is part of the element's semantics and augments its textual information. Taking the context of an element into account in element scoring is called contextualization. This study extends the concept of contextualization and presents a classification of contextualization models. In an XML collection, elements differ in granularity: lower-level elements are shorter and carry less textual information. It is therefore plausible that contextualization affects elements of different granularity differently. Although contextualization is known to improve effectiveness in element retrieval, the improvement at different granularity levels has not been investigated. This study explores the effect of contextualization at these levels. Further, a parameterized framework for testing contextualization is presented. The empirical part of the study is carried out in a traditional laboratory setting, where an XML collection is granulated. This is necessary in order to measure performance separately at different hierarchy levels. The results confirm the effectiveness of contextualization and show how elements of different granularities benefit from it.
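To make the idea of contextualization concrete, the following is a minimal sketch and not the model evaluated in the paper. It assumes that the context of an element is simply its chain of ancestors, that the context score is the plain average of the ancestors' own scores, and that a single hypothetical parameter `context_weight` balances the element's own score against its context. The names `Element`, `ancestors`, and `contextualized_score` are illustrative and do not come from the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """A node in an XML hierarchy with a precomputed retrieval score."""
    name: str
    own_score: float                      # score from the element's own text
    parent: Optional["Element"] = None    # None for the root element


def ancestors(element: Element) -> List[Element]:
    """Return the ancestors of an element, nearest first."""
    chain = []
    node = element.parent
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def contextualized_score(element: Element, context_weight: float = 0.5) -> float:
    """Blend an element's own score with the average score of its ancestors.

    Assumption: a linear mix controlled by context_weight in [0, 1].
    With no ancestors, the element's own score is returned unchanged.
    """
    context = ancestors(element)
    if not context:
        return element.own_score
    context_score = sum(a.own_score for a in context) / len(context)
    return (1 - context_weight) * element.own_score + context_weight * context_score


# A small hierarchy: article > section > paragraph.
article = Element("article", own_score=0.8)
section = Element("section", own_score=0.4, parent=article)
paragraph = Element("paragraph", own_score=0.2, parent=section)

print(contextualized_score(paragraph))
```

In this sketch a short paragraph with weak evidence of its own is lifted by a strongly matching section and article, which is the intuition behind expecting small elements to interact with contextualization differently than large ones. Setting `context_weight` to 0 recovers the uncontextualized score.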

Keywords: Contextualization, Evaluation, Granularity level, Granulation, Semi-structured data, Structured documents, Content element, XML

Review history: Received 5 September 2008, Revised 17 February 2011, Accepted 20 February 2011, Available online 21 March 2011.

Official paper URL: https://doi.org/10.1016/j.ipm.2011.02.006