Selection of prefix and postfix word fragments for data compression

Authors:

Highlights:

Abstract

This paper uses a simple algorithm to select a set of codeable substrings that occur at the front or rear of the words in a textual data base. Since the words are assumed to be non-repeating, the technique is useful for data compression of dictionaries. The time complexity of the algorithm is governed by the associated sorting algorithm and hence is O(n log n). It has been applied to three sample data bases, consisting of words selected from street names, authors' names, or general written English text. The results show that the substrings at the rear of the words yield better compression than those at the front. Drawing on the results of an earlier study in compression coding, efficient encoding and decoding procedures are presented for use in on-line transmission of data.
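The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one way a sort-based selection of front or rear fragments might look. The function name `select_fragments`, the length bounds, the `top_k` cutoff, and the savings estimate (each selected fragment replaced by a one-character code) are all assumptions made for illustration, not the paper's method. Reversing each word lets the same routine handle rear fragments, and sorting places words with a common fragment next to each other so they can be counted in one pass.

```python
from itertools import groupby


def select_fragments(words, at_rear=True, min_len=2, max_len=3, top_k=2):
    """Rank shared word-end (or word-start) fragments by estimated savings.

    Hypothetical sketch: the savings model assumes each occurrence of a
    fragment of length n is replaced by a single code character.
    """
    # Reverse the words for rear fragments so suffixes become prefixes.
    # The set enforces the "non-repeating words" assumption.
    keys = sorted({w[::-1] if at_rear else w for w in words})

    scored = []
    for n in range(min_len, max_len + 1):
        # Only words strictly longer than the fragment are considered.
        longer = [k for k in keys if len(k) > n]
        # In sorted order, words sharing a length-n prefix are adjacent.
        for frag, group in groupby(longer, key=lambda k: k[:n]):
            count = sum(1 for _ in group)
            if count > 1:
                saving = count * (n - 1)
                scored.append((saving, frag[::-1] if at_rear else frag))

    # Highest estimated saving first; ties broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(frag, saving) for saving, frag in scored[:top_k]]


words = ["walking", "talking", "running", "jumped", "walked", "talked"]
rear = select_fragments(words, at_rear=True)
front = select_fragments(words, at_rear=False)
```

For a fixed maximum fragment length the sort dominates the running time, which is consistent with the O(n log n) bound stated in the abstract. The sketch ranks fragments independently and does not resolve overlaps between them (for example, "ed" inside "ked"); a real selection step would need to account for that.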

Keywords:

Review history: Revised 12 December 1977; available online 13 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(78)90067-5