Deterministic regular expressions with back-references

作者:

Highlights:

摘要

Most modern libraries for regular expression matching allow back-references (i.e., repetition operators) that substantially increase expressive power, but also lead to intractability. In order to find a better balance between expressiveness and tractability, we combine these with the notion of determinism for regular expressions used in XML DTDs and XML Schema. This includes the definition of a suitable automaton model, and a generalization of the Glushkov construction. We demonstrate that, compared to their non-deterministic superclass, these deterministic regular expressions with back-references have desirable algorithmic properties (i.e., efficiently solvable membership problem and some decidable problems in static analysis), while, at the same time, their expressive power exceeds that of deterministic regular expressions without back-references.

论文关键词:Deterministic regular expression,Regex,Glushkov automaton

论文评审过程:Received 23 March 2018, Revised 1 December 2018, Accepted 18 April 2019, Available online 25 April 2019, Version of Record 27 June 2019.

论文官网地址:https://doi.org/10.1016/j.jcss.2019.04.001