Ultra-succinct representation of ordered trees with applications

Authors:

Abstract

There exist two well-known succinct representations of ordered trees: BP (balanced parentheses) (Munro and Raman, 2001) [20] and DFUDS (depth-first unary degree sequence) (Benoit et al., 2005) [1]. Both use 2n+o(n) bits for n-node trees, which asymptotically matches the information-theoretic lower bound. Importantly, many fundamental operations on trees can be done in constant time on the word RAM when using BP or DFUDS, for example finding the parent, the first child, the next sibling, or the number of descendants of a node. Although the space needed to store the BP or DFUDS representation of an ordered tree matches the lower bound for general ordered trees, it is not optimal for certain special classes of trees, such as trees in which every internal node has exactly two children. In this paper, we introduce a new, conditional entropy for trees called the tree degree entropy, and give a succinct tree representation with matching size. We call such a representation an ultra-succinct data structure. We show how to modify the DFUDS representation to obtain a “compressed DFUDS”; as a consequence, a tree in which every internal node has exactly two children can be represented in n+o(n) bits. We also describe applications of the compressed DFUDS representation to ultra-succinct compressed suffix trees and labeled trees.
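
To make the space claims concrete, the following minimal Python sketch builds the BP and DFUDS strings of a small tree and estimates its tree degree entropy. It is an illustration under stated assumptions, not the paper's data structure. The abstract does not give the entropy formula, so the sketch assumes the empirical entropy of the node degree distribution, H*(T) = Σ_i (n_i/n) log2(n/n_i), where n_i is the number of nodes with i children. The nested-list tree encoding and all function names are hypothetical choices made for this example.

```python
import math
from collections import Counter

# Assumption: a tree is a nested list; each node is the list of its children,
# so a leaf is [] (an encoding chosen only for this sketch).

def preorder_degrees(tree):
    """Return the number of children of every node, in preorder."""
    degrees = []
    stack = [tree]
    while stack:
        node = stack.pop()
        degrees.append(len(node))
        # Push children right-to-left so the leftmost child is visited first.
        stack.extend(reversed(node))
    return degrees

def bp(tree):
    """Balanced-parentheses string: '(' on entering a node, ')' on leaving it."""
    return "(" + "".join(bp(child) for child in tree) + ")"

def dfuds(tree):
    """DFUDS string: one leading '(' and then, for each node in preorder,
    its degree d written in unary as d copies of '(' followed by one ')'."""
    return "(" + "".join("(" * d + ")" for d in preorder_degrees(tree))

def tree_degree_entropy(tree):
    """Assumed definition: sum over degrees i of (n_i/n) * log2(n/n_i)."""
    degrees = preorder_degrees(tree)
    n = len(degrees)
    counts = Counter(degrees)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

# A 7-node tree in which every internal node has exactly two children.
leaf = []
full_binary = [[leaf, leaf], [leaf, leaf]]

n = len(preorder_degrees(full_binary))
print("BP    :", bp(full_binary))
print("DFUDS :", dfuds(full_binary))
print("2n bits           :", 2 * n)
print("n * H*(T) bits    :", n * tree_degree_entropy(full_binary))
```

For this tree, both plain encodings use exactly 2n characters, while n times the assumed entropy stays below n, consistent with the n+o(n) bound stated for trees whose internal nodes all have two children. The sketch only computes sizes; it does not implement the constant-time navigation operations or the compressed DFUDS itself.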

Keywords: Succinct data structure, Ordered tree, Tree degree entropy, XBW, Compressed suffix tree

Review history: Received 18 February 2010, Revised 29 August 2011, Accepted 8 September 2011, Available online 14 September 2011.

Official paper URL: https://doi.org/10.1016/j.jcss.2011.09.002