Automaton meets algebra: A hybrid paradigm for XML stream processing

Authors:

Highlights:

Abstract

XML stream applications pose the challenge of efficiently processing queries over token-based data streams that can only be accessed sequentially. The automata paradigm is naturally suited to pattern recognition on tokenized XML streams, but it must be patched to support the filtering and restructuring functionality of the XML query language. In contrast, the algebraic paradigm is a well-established technique for processing self-contained tuples. However, it does not traditionally support token inputs. The Raindrop framework is the first to accommodate both paradigms within one algebraic framework, taking advantage of each. This paper describes the overall framework, highlighting three aspects in particular. First, we describe how tokens and tuples are modeled in one uniform query processing model. Second, we present the query rewriting that switches computations between these two data models. Third, we discuss strategies for implementing and synchronizing the operators within the framework. We report experimental results that illustrate the unique optimization opportunities offered by this novel framework.

Keywords: XML stream, XQuery processing, Automata, Algebra

Review history: Received 4 October 2005, Accepted 14 October 2005, Available online 28 November 2005.

Official paper URL: https://doi.org/10.1016/j.datak.2005.10.008