An efficient index structure for XML based on generalized suffix tree

作者:

Highlights:

摘要

A novel index structure based on the generalized suffix tree (PIGST) is proposed. Combined with post lists, PIGST can answer both structural and content queries. The distinct paths in an XML collection are mapped into strings. The construction algorithm of the PIGST for the path strings is presented based on the modification and improvement of a well-known suffix tree construction algorithm that only requires linear time and space complexity. The query process merely needs m character comparisons for direct containment queries, where m is the length of a query string. An efficient processing method for the indirect containment queries that avoids the inefficient tree traversal operation is also presented. Experiments show that PIGST outperforms earlier approaches.

论文关键词:XML,Information retrieval,Generalized suffix tree (GST),Path index based on generalized suffix tree (PIGST),Query processing

论文评审过程:Received 17 June 2004, Revised 13 May 2005, Accepted 6 October 2005, Available online 9 November 2005.

论文官网地址:https://doi.org/10.1016/j.is.2005.10.001